When several pieces of an application access tables for reading and updating, you want to ensure that the row you read and plan to update does not get changed out from under you. The initial question suggested that a table lock was the only way to prevent other connections from reading and possibly updating the row. However, a table lock is a heavy-handed approach which may cause problems for active applications, and the assumption that other connections must be prevented from reading the row is invalid. Because of multi-version concurrency control (MVCC), what you see in a transaction is fixed for the duration of the transaction: a SELECT and an UPDATE in the same transaction will always be consistent, and other connections which select the same row for reading will also see a consistent value. The solution for proper row-level locking is to use the following protocol consistently:

BEGIN;
SELECT ... FOR UPDATE;
UPDATE ... WHERE ...;
COMMIT;
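As a concrete illustration (the accounts table and its columns are hypothetical), the row is locked by the SELECT ... FOR UPDATE and released at COMMIT, while plain readers of the same row are unaffected:

BEGIN;
-- Lock the row we intend to change; other FOR UPDATE or UPDATE attempts on it will wait.
SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;
-- Safe to update: the value read above cannot have changed underneath us.
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;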
The situation is this: a program fills a large text file with many columns, and we want to extract some of the columns and insert them into a table as quickly as possible. This is a case where you will want to use the best tool for each part of the job. A Perl script can parse the text file into a smaller file containing only the relevant data, in a format compatible with the COPY command; Perl is fast with files. Then use COPY to bulk load the table; COPY is fast at loading files into the database. The alternative was to have the Perl script insert directly into the database, which was deemed too slow. However, the question was raised whether the inserts were being done in a single transaction; using larger transactions would certainly speed up the direct inserts. If the amount of data is very large and there are indexes on the table being loaded, you may also want to consider dropping the indexes before the load and recreating them afterward.
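By way of illustration only (the table, index, and file names below are hypothetical, and the extract file is assumed to already match the table's column layout), the load might look something like this:

-- Drop indexes first so they are not maintained row by row during the load.
DROP INDEX rawdata_item_idx;

-- Bulk load the pre-parsed, tab-delimited extract; the path is read by the server,
-- so from a client you may prefer psql's \copy equivalent.
COPY rawdata FROM '/tmp/extract.txt';

-- Recreate the index in one pass once the data is in.
CREATE INDEX rawdata_item_idx ON rawdata (item);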
Another issue with bulk loading is triggers firing for each row inserted. If you are sure your data is trustworthy and already meets your referential integrity requirements, you can turn off triggers for the bulk load and turn them back on immediately afterward. You should not use this option when your data is not completely clean. The reltriggers field in the pg_class table contains the number of active triggers for each table. It can be set to 0 to disable the triggers, but it must be reset to the proper number of triggers to re-enable them.
UPDATE "pg_class" SET "reltriggers" = 0 WHERE "relname" = 'tablename'; UPDATE pg_class SET reltriggers = ( SELECT count(*) FROM pg_trigger where pg_class.oid = tgrelid) WHERE relname = 'table name';
Temporary tables are stored as ordinary files. They carry no special designation, such as "heap", because none is necessary: if your machine has enough RAM, they will usually already be cached in the kernel's disk cache.
In version 7.1.3 the statement CREATE TABLE foo AS SELECT * FROM bar; created the table foo with OIDs. However, in 7.3.1 the same statement creates the table without OIDs. This change was deliberate, per Tom Lane, and was due to the difficulty of maintaining the existing behavior. There were no complaints at the time of the change; however, there have been some discussions since. Bruce Momjian pointed out that a plain CREATE TABLE does create the table with OIDs, making the CREATE TABLE AS behavior inconsistent. Also mentioned were the discussions on pgsql-hackers concerning modifying CREATE TABLE to take WITH/WITHOUT OIDS, which would affect this as well. In the end, Tom Lane suggested back-patching the fix, adding OIDs for CREATE TABLE AS, into 7.3.2 rather than waiting for 7.4, in order to minimize the exposure to the changed behavior.
Tracking down a misbehaving process and its query can be a bother. The context in which the query is running gives some clues as to what the errant query is, but you can also look directly at its backend process and use the statistics-gathering tools. One way to find the problem process is to look in pg_locks to see which process is holding the oldest open transaction number, or which one holds a large number of locks that other processes are waiting for. Once you have the process id, you can use ps or pg_stat_activity to see what it is doing. Set stats_command_string = true in your postgresql.conf; then, given the process id, you should be able to see its query in pg_stat_activity. If stats_command_string is not set, you can instead attach a debugger (usually gdb) to the backend process and print out debug_query_string.
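A rough sketch of these two views, assuming stats_command_string is enabled; note that column names have varied across releases (for example, procpid and current_query in older versions, pid and query in newer ones), so adjust to your version:

-- What each backend is currently running.
SELECT procpid, usename, current_query FROM pg_stat_activity;

-- Which backends hold or are waiting on locks; entries with granted = false are the waiters.
SELECT pid, relation, transaction, mode, granted FROM pg_locks ORDER BY pid;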
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009