PostgreSQL Training,
Consulting & Support
General Bits
By A. Elein Mustain

5-May-2003 Issue: 24

Archives | General Tidbits | Google General Bits | Docs | Castellano | PortuguÍs | Subscriptions | Notifications | | Prev

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see

Optimizing OR'd Queries
[GENERAL] Can I Benefit from and Index Here?
[GENERAL] Left Join Not Using Index?

Here is a slow query which needs some help. It looks like this:

	SELECT DISTINCT message_board_topics.rec_num,
		message_board_topics.topic_name, message_board_topics.topic_body,
		message_board_topics.topic_author, message_board_topics.topic_author_email,
		message_board_topics.number_of_comments, to_char(topic_date, 'MM.DD.YYYY')
		as formatted_date 
	FROM message_board_topics left join message_board_comments 
		on (message_board_comments.topic_id=message_board_topics.rec_num) 
		upper(topic_name) LIKE upper('madbrowser') OR 
		upper(topic_body) LIKE upper('madbrowser') OR 
		upper(topic_author) LIKE upper('madbrowser') OR
		upper(message_board_comments.comment_author) LIKE upper('madbrowser')
	ORDER BY message_board_topics.rec_num DESC;

There are a couple of obvious things to start off with. We assume there are already indexes on the join columns. So the first is to change LIKE to = since no partial match is needed.

The next thing to do is to run EXPLAIN ANALYZE. EXPLAIN ANALYZE shows us that the query is doing a sequential scan. By turning off enable_seqscan we try EXPLAIN ANALYZE again. This is what we get.

   With enable_seqscan = false:
   Total runtime: 40855.59 msec

   With enable_seqscan = true:
   Total runtime: 13108.33 msec
The planner is choosing the sequential scan because it is faster. "(Hot diggety ;-))", says Tom Lane. Unfortunately the query is still slow. Another idea is to provide functional indexes on upper(topic_name), etc. That could certainly help, but the problem really is the multitude of ORs in the statement.

It is possible to divide out the ORs by using a UNION. It might look like this (with a minimal targetlist).

   select message_board_topics.rec_num 
      from message_board_topics 
      where upper(topic_name) LIKE upper('madbrowser')
      select message_board_topics.rec_num 
         from message_board_topics 
         where upper(topic_body) LIKE upper('madbrowser')
      select message_board_topics.rec_num 
         from message_board_topics 
         where upper(topic_author) LIKE upper('madbrowser')
      select message_board_topics.rec_num 
         from message_board_topics,message_board_comments 
         where message_board_comments.topic_id=message_board_topics.rec_num
            and upper(message_board_comments.comment_author) LIKE upper('madbrowser')
   order by 1 desc;
This query was pronounced "MUCH faster than my old query...".

However, Stephan Szabo noted, "the best solution is probably some sort of full text indexing solution. Putting the keywords from the various columns you want to index along with the rec_num (or topic_id) of the row and an index on the text. Then you could join message_board_topics with that and probably get a much better plan."

Contributors: Hunter Hillegas lists at, Stephan Szabo sszabo at, Dann Corbit DCorbit at, Tom Lane tgl at, Lincoln Yeoh lyeoh at, Richard Huxton dev at, Ken Williams ken at, scott.marlowe scott.marlowe at, Dennis Gearon gearond at
Multi-column UPDATES with aggregates
[GENERAL] ERROR: ExecutePlan: (junk) `ctid' is NULL! 29-Apr-2003

The following SQL causes the obscure error message of:

	ERROR:  ExecutePlan: (junk) `ctid' is NULL!.
	UPDATE Tsummary
		SET participants = count(distinct credit_id), 
		teams = count(distinct team_id)
	FROM email_contrib_today ect
	WHERE ect.project_id = projectid
The underlying complaint of the ERROR is that the aggregate count() is not in the context of a SELECT. Having an aggregate at the top level of an UPDATE statement is illegal per the SQL spec.

There are some alternatives. Either you can use an UPDATE with SELECT or you can use pl/pgsql. In both cases, you preselect the aggregate and assign it in the UPDATE statement.

The UPDATE SELECT could be done either field by field or with one SELECT statement. The first statement is field by field. The second shows how to select more than one value from the SELECT statement.

	UPDATE Tsummary
		SET participants = 
			(select count(distinct credit_id) 
		 	from email_contrib_today ect 
		 	where etc.project_id = projectid),
		teams =
			(select count(distinct team_id) 
		 	from email_contrib_today ect 
		 	where etc.project_id = projectid)

	UPDATE Tsummary
		SET participants = f.ccount, teams = f.tcount
		FROM (SELECT count(distinct credit_id) as ccount,
				count( distinct team_id) as tcount
			FROM email_contrib_today
			WHERE etc.project_id = projectid) f

Contributors: Jim C. Nasby jim at, Tom Lane tgl at, Dennis Gearon gearond at
Things to consider when allocating space for tables and indexes
[GENERAL] Estimating space required for indexes 28-Apr-2003

For amounts of data is is important to know how much storage to anticipate. This information should help guide you in estimating the proper amount of storage for your system in terms of row and index memory usage.

For the sake of discussion we will follow the example in the thread which has 3 float4 columns per row. Estimating column size is a whole other topic.

A minimal row does not have OIDs or nulls. Assuming intel-like alignment rules and that the column widths are fixed multiples of four a row will have:

  • 28 bytes overhead +
  • 3 columns * 4 bytes =
  • 40 bytes per row.
The index for this table is on the first two columns. The index tuple will have:
  • 12 bytes overhead +
  • 2 * 4 bytes data =
  • 20 bytes per row.

Therefore 80M of rows takes about 32MB while its indexes will take up 16MB. However, there are some wishy-washy factors to take into account (besides columns with varying data types).

The creation of an index takes more space than the index itself, but it should not take more than twice the size of the finished index. This is transient in that it only occurs when you create the entire index.

Data is usually stored on heap pages which are crammed as full as possible. Indexes are stored on b-tree index pages which are initially 65% to 75% full, following the standard theory of the best configuration for b-tree. (Full b-tree pages for non-static tables will cause over splitting causing destruction of the physical ordering which in turn negates the benefit of saving a few index pages.)

Contributors: Shridhar Daithankar shridhar_daithankar at, Tom Lane tgl at, Oleg Bartunov oleg at
Fixed Size Column Storage
[GENERAL] fixed size columns 29-Apr-2003

Each column of any type is begun fresh on a new 4 byte boundary (or possibly 8 depending on your architecture).

Fixed and variable length character columns are stored as varlenas, that is, with a four-byte length followed by the data. This is the most flexible method of storage as it supports both single and multi-byte character sets.

The storage size of any fixed size text-like column is

	4 + size of col + ( 4 - ( size of col % 4) ) 
This following table was defined with all fixed size character types. It was defined using fixed sizes because the data was truly fixed size.
mdcdb=# \d codes
              Table "codes"
     Attribute      |         Type   | storage size
 rpt_code           | character(5)   | 4 +  5 + (4 - (5 % 4) ) = 12
 xcd_code           | character(5)   | 4 +  5 + (4 - (5 % 4) ) = 12
 mod_1              | character(2)   | 4 +  2 + (4 - (2 % 4) ) = 8
 mod_2              | character(2)   | 4 +  2 + (4 - (2 % 4) ) = 8
 carrier            | character(5)   | 4 +  5 + (4 - (5 % 4) ) = 12
 state_code         | character(2)   | 4 +  2 + (4 - (2 % 4) ) = 8
 specialty_code     | character(2)   | 4 +  2 + (4 - (2 % 4) ) = 8
 par                | character(1)   | 4 +  1 + (4 - (1 % 4) ) = 8
 type_code          | character(1)   | 4 +  1 + (4 - (1 % 4) ) = 8
 place_code         | character(2)   | 4 +  2 + (4 - (2 % 4) ) = 8
 submitted_charge   | character(13)  | 4 + 13 + (4 - (13 % 4)) = 20
The column lengths total 112 bytes. The row also has an overhead of 28 bytes so the row size is 140. This is assuming the row does not have OIDs or nulls. It is interesting to note that the table could have been defined with all columns of type text and still maintain the same storage size. It would, however, lose any blank padding gained by the character type.

Contributors: dmreese at, elein at, Tom Lane tgl at, Manfred Koizar mkoi-pg at
Changing only one of two equal rows
[GENERAL] select first occurrence of a table 2-May-2003

If you have got a set of duplicate rows and want to update only one of them, you cannot use the usual update statement because both will be changed. (We will not go into why having duplicate rows is bad and wrong here :-) and just examine the problem in question. Obviously the data was bad and we just want to correct it.)

There are two fields that can uniquely identify the rows, the OID and the CTID. By selecting the rows' OIDs (or CTIDs) out you can choose one and update it.

	SELECT oid FROM table WHERE ... LIMIT 1;
	UPDATE table SET ... WHERE oid = ...;

This doesn't really need to be done in two queries, though. It can be done using a simple update from a selection like this.

	UPDATE foo SET field1 = 'ZZZ' WHERE oid =
		(SELECT min(oid) FROM foo WHERE field2 = 'XXX');
Note, of course, that max(oid) would work as well.

Contributors: Johnson, Shaunn SJohnson6 at, scott.marlowe scott.marlowe at, Erik Ronström kvarken at, Tom Lane tgl at, Dennis Gearon gearond at, nolan at
Appending to an Array
[GENERAL] Appending to an array 15-Apr-2003

In response to the question of how to append to an array of unknown size, Joe Conway came up with this nice little function return the next index into an array. This then is used to update the array. This script requires 7.3.

	CREATE OR REPLACE FUNCTION array_next(text[]) returns int AS '
		arr alias for $1;
		high int;
			high := 1 +
			RETURN high;

	create table mytable (myarray text[]);
	insert into mytable values ('{"abc","d e f"}');
	update mytable set myarray[array_next(myarray)] = 'new element';
	regression=# select * from mytable;
  	{abc,"d e f","new element"}
	(1 row)
This is how the function works, from the inside out.
	1 +
array_dims gives you the string description of the array. It had two elements in it.
 select array_dims(myarray) from mytable; 
	--> [1:2] 
We split that on the : returning the second field.
 select split_part(array_dims(myarray),':',2) from mytable;
	--> 2]
The ] is replaced with an empty string.
	--> 2
And we cast the remaining 2 to an integer. Back at the top we add 1 to it, getting 3. phew! Using the result of this function as an index into the array we can do the update.
	update mytable set myarray[array_next(myarray)] = 'new element';

It was also noted that in contrib/intarray are a large set of operators that can perform even more operations on integer arrays.

Contributors: Jay O'Connor joconnor at, Oleg Bartunov oleg at, Joe Conway mail at

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009

Search General Bits & Search WWW