|
||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
PostgreSQL General Bits will be on vacation for the two weeks following this issue. Please enjoy past articles from the Archives. PostgreSQL General Bits will return with new features and a new issue on November 30.
There is a classic problem with wanting a list of details per master, but also wanting to designate one and only one as special. For example, in a simple mailing list you want to list many people's email addresses and designate one and only one of them as the moderator. There are several different issues with this exclusively true field. We will look at a couple of aspects of it: the general enforcement and then the practical update scenarios. The simple way to do restrict one row of a group to be true is to add a conditional unique index. Here is a sample table and data with the conditional unique index. create table maillist ( listname text, -- REFERENCES lists (listname), email text, moderator boolean, PRIMARY KEY (listname, email) ); create unique index moderator on maillist (listname) where moderator; insert into maillist values ('bugs', 'sam@iam.com','f'); insert into maillist values ('bugs', 'elle@what.com','f'); insert into maillist values ('bugs', 'xandy@where.com','t'); insert into maillist values ('support', 'help@bigbus.com','f'); insert into maillist values ('support', 'target@zero.com','t'); insert into maillist values ('support', 'frank@subtle.com','f'); insert into maillist values ('dev', 'hqr@zoneout.com','t'); insert into maillist values ('dev', 'lphant@zoo.com','f'); insert into maillist values ('dev', 'zardoz@who.com','f'); With this conditional unique index, you cannot insert or update a second row with a value of true for moderator. db=# insert into maillist values ('bugs', 'spider@web.com', 't'); ERROR: duplicate key violates unique constraint "moderator" db=# update maillist set moderator='t' where email='help@bigbus.com' and listname='support'; ERROR: duplicate key violates unique constraint "moderator"However, you can eliminate the moderator altogether by deleting the moderator or setting the moderator's value to false. db=# delete from maillist where email='xandy@where.com' and listname='bugs'; DELETE 1 db=# update maillist set moderator='f' where email='target@zero.com' and listname='support'; UPDATE 1 Moderator changes are addressed by issuing the two statements that achieve the effect that you want. If you want to be able to swap moderators, you can set one to false and then set the other to true. If a new person were added as a moderator, update the old one to false and add the new one to true. update maillist set moderator='f' where email='hqr@zoneout.com' and listname = 'dev'; update maillist set moderator='t' where email='lphant@zoo.com' and listname = 'dev'; If you wanted a change of moderators in one statement, you could have an UPDATE and INSERT trigger which, when the moderator field were set to true in the NEW row, it would first set the old one to false. The triggers can be overkill if you can handle the two step logic in your application. Whether to add this feature depends on whether you want the changing of moderators to be "invisible" or whether you want to handle the error condition. Both approaches are valid and have their merits and shortcomings. Below is the trigger function for INSERT or UPDATE. CREATE OR REPLACE FUNCTION insupd_maillist( ) RETURNS TRIGGER as ' BEGIN IF NEW.moderator=''t'' THEN UPDATE maillist SET moderator=''f'' WHERE moderator=''t'' AND listname=NEW.listname; END IF; RETURN NEW; END; ' language 'plpgsql'; CREATE TRIGGER insmaillist BEFORE INSERT ON maillist FOR EACH ROW EXECUTE PROCEDURE insert_maillist(); The last issue is whether or not you require a moderator. If it is not acceptable to have no moderator, then the index alone is not enough. You will need to catch the update and deletes where the moderator field is set to from true to false and raise an error. The details of this last issue are left as an exercise to the reader.
The topic of using a trigger to count rows was discussed in Article #49-1. A subsequent question was raised regarding the effect of multiple triggers on the logic of the counter. The question was whether it was possible for one trigger to increment the counter and then have the subsequent trigger abort the original statement. When a statement is executed one or more triggers are also executed to implement foreign key integrity, additional functionality such as the row counter and also CHECK clauses. All of the triggers are executed in the context of the transaction of the original statement. They are executed (arbitrarily) in alphabetical order. If any one of the triggers cause the original statement to be aborted, then all of the triggers' actions will also be rolled back. Everything is in the same transaction. If the trigger for counting succeeded and a subsequent trigger rolled back the transaction, then the effects of the counting trigger would also be rolled back.
For those who haven't heard yet, there is now a German "general" mailing list at postgresql.org Für alle, die es noch nicht gehört haben, es gibt jetzt eine deutsche Mailingliste für "allgemeine" Themen unter postgresql.org. Zum Eintragen, schreibt eine E-Mail mit "subscribe" im Text an pgsql-de-allgemein-request@postgresql.org.
Before getting into the details of any EXPLAINs, remember you must VACUUM ANALYZE in order to enable to optimizer to make the proper choices. Without VACUUM ANALYZE query optimization is haphazard at best. The next thing to know is that EXPLAIN ANALYZE gives you better information that simply EXPLAIN. It gives you the optimizers best guess followed by the actual cost of running the query. Just as an average is a useful aggregate, it is more useful if you have the average, the median and the standard deviation to qualify the average. So EXPLAIN ANALYZE gives the a better context for judging the query. To begin to understand EXPLAIN, it is helpful to look at small test cases and understand them first before moving on to more complex queries. For this example table and two queries, the question is why one uses a sequential scan and the other uses an index.
Table "public.abcs" Column | Type | Modifiers -----------+--------------+----------------------------------- abcid | integer | not null default nextval('abcid_seq'::text) type | varchar(255) | versionof | integer | Indexes: abcs_pkey primary key btree (abcid), abcs_versionof btree (versionof) planb=# explain select type from abcs where abcid = 6339; QUERY PLAN ----------------------------------------------------------------------- Index Scan using abcs_pkey on abcs (cost=0.00..6.01 rows=1 width=145) Index Cond: (abcid = 6339) (2 rows) planb=# explain select type from abcs where versionof = 6339; QUERY PLAN ---------------------------------------------------------------- Seq Scan on abcs (cost=0.00..59182.10 rows=16137 width=145) Filter: (versionof = 6339) (2 rows) The first query sees the unique primary key index on column abcid and understands that it will return one and only one row. In this case it is best to use an index if at all possible. In the second case the optimizer determines that there will be about 16000 rows returned. It determines that in this case a sequential scan would be faster. In particular cases a sequential scan is faster than an index scan. Those cases primarily involve selection of a larger percentage of rows of the table.
Clients which do not explicitly support PostgreSQL (and some that do!) may have trouble using Set Returning Functions (SRFs). In MS Query, the following statement using an SRF in the FROM clause returns the error that there is no table called getStudents(). Obviously. SELECT * FROM getStudents(); Clients rely on the system catalog to see what tables exist and what their specifications are. When using SRF, a row type is declared to tell PostgreSQL the specifications of the returned row, but this information is not comprehended by MS Query. To work around this problem, create a dummy table for the rowtype that the SRF returns. Then create a RULE on the table which substitutes a SELECT of the table with a SELECT of the function. For more information about Set Returning Functions, see Issue #26 of PostgreSQL General Bits.
|
||||||||||||||||||||||||||||||
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009 |