When several pieces of an application access tables for reading and updating, you want to ensure that the row you read and plan to update does not get changed out from under you. The initial question suggested that a table lock was the only way to prevent other connections from reading and possibly updating the row. However, a table lock is a heavy-handed approach which may cause problems for active applications, and the assumption that other connections must be prevented from reading the row is invalid. Because of multi-version concurrency control (MVCC), what you see in a transaction is fixed for the duration of the transaction: a SELECT and an UPDATE in the same transaction will always be consistent, and other connections which select the same row for reading will also see a consistent value. The solution for proper row-level locking is to use the following protocol consistently:

BEGIN;
SELECT ... FOR UPDATE;
UPDATE ... WHERE ...;
COMMIT;
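As a concrete illustration (the accounts table and its columns are hypothetical), the row is locked by the SELECT ... FOR UPDATE and released at COMMIT, while plain readers of the same row are unaffected:

BEGIN;
-- Lock the row we intend to change; other FOR UPDATE or UPDATE attempts on it will wait.
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
-- Safe to update: the value read above cannot have changed underneath us.
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;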
The situation is this: a program fills a large text file with many columns, and we want to extract some of the columns and insert them into a table as quickly as possible. This is a case where you will want to use the best tool for each part of the job. A Perl script can parse the text file into a smaller file containing only the relevant data, in a format compatible with the COPY command; Perl is fast with files. Then use COPY to bulk load the table; COPY is fast at loading files into the database. The alternative was to have the Perl script insert directly into the database, which was deemed too slow. However, the question was raised whether the inserts were being done in a single transaction; using larger transactions would certainly speed up the direct inserts. If the amount of data is very large and there are indexes on the table being loaded, you may also want to consider dropping the indexes before the load and recreating them afterward.
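By way of illustration only (the table, index, and file names below are hypothetical, and the extract file is assumed to already match the table's column layout), the load might look something like this:

-- Drop indexes first so they are not maintained row by row during the load.
DROP INDEX rawdata_item_idx;

-- Bulk load the pre-parsed, tab-delimited extract; the path is read by the server,
-- so from a client you may prefer psql's \copy equivalent.
COPY rawdata FROM '/tmp/extract.txt';

-- Recreate the index in one pass once the data is in.
CREATE INDEX rawdata_item_idx ON rawdata (item);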
Another issue with bulk loading is triggers firing for each row inserted. If you are sure your data is trustworthy and already meets your referential integrity requirements, you can turn off triggers for the bulk load and turn them back on immediately afterward. You should not use this option when your data is not completely clean. The reltriggers field in the pg_class table contains the number of active triggers for each table. It can be set to 0 to disable the triggers, but it must be reset to the proper number of triggers to re-enable them.
UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" = 'tablename'; UPDATE pg_class SET reltriggers = ( SELECT count(*) FROM pg_trigger where pg_class.oid = tgrelid) WHERE relname = 'table name';
Temporary tables are stored as ordinary files. They carry no special designation, such as "heap", because none is necessary: if your machine has enough RAM, they will usually already be cached in the kernel's disk cache.
In version 7.1.3 the statement CREATE TABLE foo AS SELECT * FROM bar; created the table foo with OIDs. However, in 7.3.1 the same statement creates the table without OIDs. This change was deliberate, per Tom Lane, and was due to the difficulty of maintaining the existing behavior. There were no complaints at the time of the change; however, there have been some discussions since. Bruce Momjian pointed out that a plain CREATE TABLE does create the table with OIDs, making the CREATE TABLE AS behavior inconsistent. Also mentioned were the discussions on pgsql-hackers concerning modifying CREATE TABLE to take WITH/WITHOUT OIDS, which would affect this as well. In the end, Tom Lane suggested back-patching the fix, adding OIDs for CREATE TABLE AS, into 7.3.2 rather than waiting for 7.4, in order to minimize the exposure to the changed behavior.
Tracking down a misbehaving process and its query can be a bother. The context in which the query is running gives some clues as to what the errant query is, but you can also look directly at its backend process and use the statistics-gathering tools. One way to find the problem process is to look in pg_locks to see which process is holding the oldest open transaction number, or which one holds a large number of locks that other processes are waiting for. Once you have the process id, you can use ps or pg_stat_activity to see what it is doing. Set stats_command_string = true in your postgresql.conf; then, given the process id, you should be able to see its query in pg_stat_activity. If stats_command_string is not set, you can instead attach a debugger (usually gdb) to the backend process and print out debug_query_string.
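A rough sketch of these two views, assuming stats_command_string is enabled; note that column names have varied across releases (for example, procpid and current_query in older versions, pid and query in newer ones), so adjust to your version:

-- What each backend is currently running.
SELECT procpid, usename, current_query FROM pg_stat_activity;

-- Which backends hold or are waiting on locks; entries with granted = false are the waiters.
SELECT pid, relation, transaction, mode, granted FROM pg_locks ORDER BY pid;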
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009