Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

20-Jan-2003 Issue: 10

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Editorial Inquiry
	26-Jan-2003

General Bits has been running now for 10 weeks. That is 60 items from more than 300 pgsql-general messages in 10 issues. It seems that this is a good time to invite specific feedback on the column.

I will ask you not to fill out a form, but to send me email. Here are some questions you could answer or you could write what you think will be helpful.

Is the technical level too high or too low or a good mix?
Are the thread topics selected pertinent to what you need to know?
Are you learning about things you would not have otherwise researched?
Is the format and language easy to read?
Is the format too long? too short? Too many items? too few?
What improvements or changes could you suggest?
Do you read the column each week? Have you read all of the columns?
Have you used the google search? Did it work for you?

I will only use this information to improve my column and not distribute it to anyone else, except in anonymous summary form.

Editor: elein@varlena.com

Trimming unneeded spaces
[GENERAL] varchar and spaces problem.. 14-Jan-2003

With input from varied sources sometimes you cannot arrange it to be optimal for input. In this case, values were being inserted into varchar() fields with spaces.

The initial solution was to create a rule to trim the values.

	create rule rl_testins as on insert to test do
	update test set a=trim(trailing ' ' from a) where a=old.a

This works correctly but it was deemed to be slow. Some suggestions were made to review the general insert speed and to modify the input stream.

The best solution came from Stephan Szabo who noted that when updates are made with rules, then it is likely that the newly created row will be re-selected, changed and re-written. Instead of a rule use a trigger, and then the timing will be more optimal. With a before trigger, the changes will happen before the write.

	create function trima() returns trigger as ' begin
 		NEW.a := trim(trailing '' '' from NEW.a);
 		return NEW;
	end;' language 'plpgsql';

	create trigger testtrig before insert on test 
		for each row execute procedure trima();

Contributors: Fabrizio Mazzoni veramente at libero.it, Richard Huxton dev at archonet.com, Stephan Szabo sszabo at megazone23.bigpanda.com

Perfect Vacuuming
[GENERAL] Vacuum verbose output? 14-Jan-2003

Understanding the verbose output of vacuum is helpful as you try to carefully balance inserts, deletes and vacuum. The ideal rhythm you are looking for would show table sizes reflecting only the growth of new inserts, not bloating with defunct rows. Achieving balance is particularly interesting in the case in question where the data was primarily write-once-read-often where a tuple stays active for about 3 hours before being deleted. Several tables are involved, including one large one, which have varying query performance requirements.

Example of VACUUM output; the first longer entry is from VACUUM FULL:

	NOTICE:  Pages 371: Changed 3, reaped 364, Empty 0, New 0; Tup 8180: Vac
	1786, Keep/VTL 0/0, UnUsed 6285, MinLen 115, MaxLen 648; Re-using:
	Free/Avail. Space 622460/621672; EndEmpty/Avail. Pages 0/350.
	NOTICE:  Pages 342: Changed 2, Empty 0; Tup 8747: Vac 609, Keep 0, UnUsed 5350.
	NOTICE:  Pages 342: Changed 1, Empty 0; Tup 8443: Vac 732, Keep 0, UnUsed 5572.
	NOTICE:  Pages 342: Changed 0, Empty 0; Tup 8325: Vac 838, Keep 0, UnUsed 5614.
	NOTICE:  Pages 342: Changed 4, Empty 0; Tup 8197: Vac 868, Keep 0, UnUsed 5729.
	NOTICE:  Pages 353: Changed 1, Empty 0; Tup 9159: Vac 413, Keep 0, UnUsed 5508.
	NOTICE:  Pages 354: Changed 1, Empty 0; Tup 8780: Vac 765, Keep 0, UnUsed 5552.

Vacuum Output Key

Tup Number of tuples remaining after vacuum

Vac Number of tuples removed by vacuum

Keep Number of almost dead tuples for transactions where they are still visible

Used Number of unused item pointers

Pages The number of files on the system involve in this table.

Changed The number of pages modified.

Reaped The number of blank pages, ready for reuse

Empty The number of blank pages, ready for reuse

New Pages created as a result of vacuuming

(It was pointed out by Manfred Koizar that pages "Changed" actually only signifies the number of pages changed due to heap tuple header updates. It was unclear whether or not this was intentional.)

Further analysis and discussion raised the issue of FSM settings. max_fsm_relations and max_fsm_pages need to be set proper values. In the case in question, the values were set to the defaults, 100 and 10,000 respectively.

In the next release max_fsm_relations will default to 1000 which is a better choice. The large table used 100K pages. This led to the estimate by Tom Lane of 1,000,000 as a possible value for max_fsm_pages. Tom wrote, "You need an FSM slot for each page that has useful free space on it, else the system will forget about that free space." An upper bound is

	select sum(relpages) from pg_class where relkind in ('r','t');

summed over all databases in your installation. This value is usually high.

Increasing the FSM values would affect the installation by increasing shared memory usage. But it will also keep the system from leaking free space in tables, causing table bloat. With proper FSM settings, it should be possible to time VACUUM to maintain stable and trackable tables sizes.

Contributors: Julian Scarfe julian.scarfe at ntlworld.com, Robert Treat xzilla at users.sourceforge.net, Manfred Koizar mkoi-pg at aon.at, Tom Lane tgl at sss.pgh.pa.us

Integers, int2, int4 and implicit casting of constants
[GENERAL] Functions just dont want to work! [hard] 04-Dec-2003

In 7.2 a function foo was declared to take an int2 parameter:

	create function insert_new_field( int2, text, text ) returns text as '....';

After upgrading to 7.3, it was called like this:

	select insert_new_field(24,'stuff','stuff' );

And gave the error message:

	Warning: PostgreSQL query failed: 
	ERROR: Function insert_new_field(integer, "unknown", "unknown") does not exist 
	Unable to identify a function that satisfies the given argument types 
	You may need to add explicit typecasts.

This happened because in 7.3 there are a few problems with number conversions. In this case, the constant 24 is parsed as an integer (int4). The function manager failed to coerce the int4 down to an int2 in order to match the parameter list of the function. In 7.2, the parameter was coerced correctly to match the function's argument list.

Tom Lane says this is an temporary problem (Ed. I think he means it is a bug :-) and is working on a solution that does not affect other existing code. Implicit casting is a very delicate and tricky feature.

The workaround is to either redefine the function to take an integer or to cast the constant to an int2 when it is called.

Contributors: Yonatan Ben-Nes da at canaan.co.il, Joel Burton joel at joelburton.com, Andrew J. Kopciuch akopciuch at bddf.ca, Tom Lane tgl at sss.pgh.pa.us

Abstime and epoch casts
[GENERAL] Frustration with date/times/epoch in v7.3. 12-Dec-02

This item will sound a lot like the previous item and indeed they are related. This query run in 7.2.1 took an epoch time and converted it to a timestamptz:

	select imported_date, "timestamptz"(imported_date) 
	from server_accounts limit 1;

	 imported_date |      timestamptz
	---------------+------------------------
	    1037498593 | 2002-11-16 18:03:13-08
	(1 row)

In 7.3, after several other frustrating tries, the following ugly query could work:

	select imported_date,
	('Jan 1 1970'::DATE + (imported_date || ' seconds')::INTERVAL)::timestamp
	from server_accounts limit 1;

	 imported_date |      timestamp
	---------------+---------------------
	    1027966107 | 2002-07-29 18:08:27

Peter Gibbs suggested the following improvement using abstime:

	 select imported_date, abstime(imported_date) from server_accounts;

	 imported_date |        abstime
	---------------+------------------------
	    1037498593 | 2002-11-17 04:03:13+02
	    1027966107 | 2002-07-29 20:08:27+02

This might better be written as "timestamp"(abstime(imported_date) ) to get the cast up to the timestamp type.

Using abstime makes one a little nervous because it is threatened to be deprecated and disappear. The documentation also notes that abstime is less precise than timestamp. Although the documentation does clearly threaten disappearance, it is doubtful that abstime will disappear anytime soon.

The way to handle epoch dates is to either explicity cast them as done in the selection above or to isolate the cast in a function:

	create function timestamp(integer) returns timestamp as
	'select "timestamp"($1::abstime);'
	language 'sql';

Returning to the issue of implicit casts, Stephan Szabo pointed out that 7.2.1 was using abstime. In 7.2.1 implicit coersion was a little bit freer than in 7.3. In the prior version, the number was implicitly converted to abstime but in the later versions, it is not.

Contributors: Mike Benoit mikeb at netnation.com, Peter Gibbs peter at emkel.co.za, Stephan Szabo sszabo at megazone23.bigpanda.com, Jean-Luc Lachance jllachan at nsd.ca

Matching tables to physical files
[GENERAL] Where are my tables physically in the fs? 16-Dec-02

Tables are stored in files in the $PGDATA/data directory under their object id names. In the releases earlier than about 7.2, the files were named by their table name.

If you want to know which file corresponds to which table you must match the table name to its OID. The filename is also in pg_class.relfilenode. Using refilenode rather than OID is better because it is clearer.

In contrib/oid2name there is a utility program which will enable you to see the OID of databases, tables and to see the object name of a specified OID.

Alternatively, you can simply select from pg_class either by table name or by file name.

	select relname, relfilenode from pg_class where relname = 'tablename';
	select relname, relfilenode from pg_class where refilenode = 'OID';

Contributors: Egyud Csaba csegyud at freemail.hu, Lincoln Yeoh lyeoh at pop.jaring.my, Medi Montaseri medi.montaseri at intransa.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top