Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

17-Mar-2003 Issue: 17

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Parsing of Quoted Constants
[GENERAL] Unexpected parse behaviour for date to timestamp	07-Mar-2003

An explanation was sought for the fact that the following queries are parsed differently depending on whether the constant value is quoted or not. The query is qualifying on submit, a timestamp column.

	# select submit from invoices where submit <= 2003-03-09;
	  submit
	----------
	(0 rows)
	
	# select submit from invoices where submit <= '2003-03-09';
	  submit
	---------------------
	  2003-03-08 00:24:25
	(1 row)

This is obvious to those of us who sometimes use psql as a calculator, The first query is selecting for a date less than or equal to the integer 1991 or ( 2003 - 03 - 09 ) while the second is selecting for a date earlier than 9-Mar-2003.

The parser does its best to try to evaluate the data type of arguments and in the first case it found a timestamp and an integer. Since there is no conversion from a timestamp to an integer, it settled on a comparison of submit::text and 1991::text which did not find anything. In the second query, the argument '2003-03-09' was left untyped until the context required a type. This context a comparison to a timestamp so that is the first conversion tried on the constant value. If for some reason, it did not succeed, it may try others as it did in the first query.

This issue was addressed in a slightly different context in PostgreSQL GeneralBits Issue #12 Casting for bigint Queries.

Contributors: Jean-Christian Imbeault jc at mega-bucks.co.jp, Stephan Szabo sszabo at megazone23.bigpanda.com, Tom Lane tgl at sss.pgh.pa.us

Top of the category
[GENERAL] Difficult query 05-Mar-2003

There is a table with two columns of identifying information and one or more values associated with them. For example a table of student exercises in a particular area. The area and student are the identifying columns and the exercises columns are the values associated with them.

	elein=# \d student_stats
	   Table "public.student_stats"
	      Column      |  Type   | Modifiers
	------------------+---------+-----------
	 area             | text    |
	 student          | text    |
	 day              | date    |
	 exercises_tried  | integer |
	 exercises_solved | integer |

What we want to know is the query that shows the top student in each area for a specified date interval. The simple summation only gets us so far. It will give us each person's total score and order the result so that the top student is the top row in each category, but the other rows are not really wanted.

	 select area, student, 
	        sum(exercises_tried) as total, sum(exercises_solved) as succ
	        from student_stats
	        where day >= (now() - '14 day'::interval)::date
	        group by area, student
	        order by area asc, succ desc, total desc;
	    area    | student | total | succ
	------------+---------+-------+------
	 arithmetic | Sarah   |    40 |   40
	 arithmetic | Lucy    |    40 |   38
	 ...more rows...
	 arithmetic | Paul    |    50 |   34
	 arithmetic | Ramon   |    20 |   14
	 geography  | Paul    |    50 |   40
	 geography  | Sarah   |    40 |   40
	 ...more rows...
	 geography  | Xavier  |    39 |   37
	 geography  | William |    40 |   34
	 spelling   | William |    40 |   34
	 spelling   | Lucy    |    38 |   34
	 ...more rows...
	 spelling   | George  |    34 |   34
	 spelling   | Sarah   |    50 |   33
	(27 rows)

My first guess at resolving this problem was to use an outer query and select the max of total grouped by area. However, Manfred Koizar provided a much cleaner and nicer solution using DISTINCT in the outer query. This distinct relies on the sort and grouping order of the inner query so that the first record for an area is the one found first.

	select distinct on (area) area, student, total,succ
	from (
	    select area, student, 
	        sum(exercises_tried) as total, sum(exercises_solved) as succ
	        from student_stats
	        where day >= (now() - '14 day'::interval)::date
	        group by area, student
	        order by area asc, succ desc, total desc) foo
	order by area, succ desc, total desc;

Contributors: Peter Alberer h9351252 at obelix.wu-wien.ac.at, Manfred Koizar mkoi-pg at aon.at

Time interval "casting": A small clarification
[GENERAL] type casting a subselect as an interval 21-Feb-2003 21-Feb-2003

In Issue #15 the topic of creating intervals was discussed and I gave the syntax for creating an interval as this:

	interval [(]  textstring [[)] 
	or
	textstring::interval

It has come to my attention that the parentheses are not accepted and the proper syntax is:

	interval textstring
	or  
	textstring::interval

Contributors: elein at varlena.com

How was PostgreSQL configured?
[GENERAL] How to find out what options postgresql was installed 12-Mar-2003

If you are not the one who installed postgresql on your machine, you may need some way to find out how the installation was configured. You can always hack down to the PGDATA directory and take a direct look at postgresql.conf, but system administrators frown on that kind of thing.

The utility pg_config will show what options were used to configure your installation.

	pg_config --configure

This is a nice little utility which can also show you most of the crucial information for locating the parts of your installation. The options to pg_config are:

	  --bindir              show location of user executables
	  --includedir          show location of C header files of the client
	                        interfaces
	  --includedir-server   show location of C header files for the server
	  --libdir              show location of object code libraries
	  --pkglibdir           show location of dynamically loadable modules
	  --configure           show options given to 'configure' script when
	                        PostgreSQL was built
	  --version             show the PostgreSQL version, then exit
	  --help                show this help, then exit

The bindir is necessary so that it can be included in PATH. The includedir and includedir-server are necessary for development of C applications for the client or C server functions. libdir and pkglibdir are often the same value. They, together, hold postgresql runtime shared libraries and user defined shared libraries for C server functions.

Contributors: Flower Sun sun_2002_flower at yahoo.com, Eric B.Ridge ebr at tcdi.com, wsheldah at lexmark.com, elein at varlena.com

Use the Language you need
Stored Procedures 13-Mar-2003

PostgreSQL supports many stored procedure languages. This allows you to use the language most appropriate for the task at hand. The ones supported from postgresql.org are: SQL, C, PL/pgSQL, PL/Python, PL/Perl and PL/Tcl. PL/Ruby, PL/sh, and PL/Java are supported by other projects. There is an unverified rumor of the existence of a PL/bash as well.

PL/Perl, PL/Tcl and PL/Python require configuration at build time. The rest of the PostgreSQL supported functions exist by default. C server functions require that you install the server headers as well as the client headers.

PL/Perl is good for string manipulation. PL/Python is good for running aggregates and triggers. Some people prefer Tcl. SQL functions are very straightforward and PL/pgSQL is the workhorse enabling basic flow control with SQL statements. This variety is good both for people devoted to a single language or for those who like to match the implementation to the task.

These links will lead you to more information about each language.

C http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/xfunc-c.html
SQL http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/xfunc-sql.html
PL/pgSQL http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/plpgsql.html
PL/Perl http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/plperl.html
PL/Tcl http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/pltcl.html
PL/Python http://www.us.postgresql.org/users-lounge/docs/7.3/postgres/plpython.html
PL/sh http://webmail.postgresql.org/~petere/plsh.html
PL/R http://www.joeconway.com/plr/
PL/Java http://pljava.sourceforge.net/
PL/Ruby http://raa.ruby-lang.org/list.rhtml?name=pl-ruby

Contributors: Roman Fail rfail at posportal.com, Joe Conway mail at joeconway.com, Neil Conway neilc at samurai.com, elein at varlena.com

Optional Foreign Keys
[GENERAL] Having an optional foreign key (ie. sometimes NULL) ? 13-Mar-2003

When thinking about using a foreign key to implement a relationship between two tables there are a few questions to ask. Suppose we have two tables, tableA and tableB. tableA has a primary key of akey and tableB has a primary key of bkey and also contains akey.

1. Can the column akey in tableB have a value, other than NULL, which is not present in tableA?
If not, you have a good candidate for a primary key. The remaining parts assume you have determined that this is true.
If so, you need to go back and think about the domain of the column akey in tableB. If it has a sometimes relationship with tableA you will need decide if you want any kind of constraint and exactly what kind of constraint will meet your requirements. This situation will not covered below.
2. Is the relationship one to one? That is, for every column tableB.akey there is one and only one row in tableA. Then you have a one to one relationship which can be implemented with a foreign key constraint.
3. Is the relationship one to many? That is, for each tableB.akey there may be one or many rows in tableA. By "many" do you mean a fixed number, a fixed set of entries in tableA or do you mean "any" number?
If you mean a fixed number or a fixed set of entries, the solution will be different--it is usually implemented using an aggregate table in the same way as a many to many relationship.
Otherwise these one to many relationships are generally implemented using foreign keys.
4. Is the relationship many to many? Can many rows in tableB correspond to 1 or many different rows in tableA? For example, if tableA were an address table and tableB were a person table, a person could have several addresses (for work, home, etc.) and also several people could share the same addresses.
Many to many relationships are best implemented by an aggregate table. This is described below.
5. Can akey in a tableB row be NULL? If so, you have a zero to one or zero to many relationship.

A one to one or one to many relationship is enforceable via a foreign key. REFERENCES will do the trick. A zero to one or many relationship can be enforced by including NOT NULL on the akey column in tableB.

	create table tableB (
	...
	akey	integer REFERENCES tableA (akey),
	...

A many to many relationship and a one to a fixed set are usually implemented by a third table called an aggregate table in which keys for tableB are matched with keys of tableA.

	create table ABagg (
		akey integer NOT NULL REFERENCES tableA (akey),
		bkey integer NOT NULL REFERENCES tableB (bkey),
		PRIMARY KEY (akey,bkey)
	);

This aggregate table should be used in queries where data from both tables is wanted. This is a case where a good view would be very helpful.

	select a.akey, b.bkey, a.info as ainfo, b.info as binfo
	from tableA a, tableB b, ABagg ab
	where ab.akey = a.akey and ab.bkey = b.bkey;

Contributors: Tom tom at vms7.com, Manfred Koizar mkoi-pg at aon.at, btober at seaworthysys.com, Ryan Mahoney ryan at paymentalliance.net, elein at varlena.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top