Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

06-Jan-2003 Issue: 7

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

2002 Linux New Media Editors Choice Award for Best Database for PostgreSQL
[ANNOUNCE] PostgreSQL Wins 2002 Linux New Media Award	31-Dec-2002

Congratulations to the PostgreSQL Global Development Team!

For the third year in a row, PostgreSQL has won the Linux New Media Editors Choice Award for Best Database.

http://www.linux-magazin.de/Artikel/ausgabe/2002/12/award/award.html

Contributors: Marc G. Fournier scrappy at hub.org

7.3.1 Release
[GENERAL] v7.3.1 Bundled and Released ... 22-Dec-2002

PostgreSQL 7.3.1 is now the most current stable release. It is primarily a bug fix release, however, the major release number of the libpq library was increased. It is recommended that clients recompile with the new libpq release number.

There were some initial problems with the download links, but they have been fixed.

Contributors: pgsql-general, pgsql-hackers

Character Array Input
[GENERAL] no title 22-Dec-2002

Double quotes are special characters in character array input. Their behavior within array input is not very clear. The behaviour improved in 7.2 but perhaps it could be improved further.

Output
Input	'{abc"def,defghi",jklmnop}'
7.2 and later	'{"abcdef,defghi","jklmnop"}'
Prior to 7.2	{"def,defghi","jklmnop"}
Desired?	'{"abc\"def","defghi\"","jklmnop"}'

The behavior prior to 7.2 was clearly wrong. However, 7.2 and later appears to gobble up the embedded double quotes.

The desired behavior above seems very reasonable. Changing a fundamental input routine, however, could break existing applications. This begs the question: Is it worth changing the input routine? If you have opinions, you can send them to pgsql-general.

Contributors: Bill.Allie at mug.org, Tom Lane tgl at sss.pgh.pa.us

Who's the super user?
[GENERAL] Database superuser 20-Dec-2002

The superusers' domain is usually all of the databases in an installation. To find the superusers of an installation you can query:

	SELECT usename FROM pg_user WHERE usesuper;

For PostgreSQL, the superuser is usually postgres, but others can be added when installing PostgreSQL.

In 7.3, it is possible to configure users per database using the DB_USER_NAMESPACE runtime option. This option is off by default.

When this option is used, user names will be referenced as name@database. Users with access to the whole installation will be referenced as name@. In this way, they may be superusers as name@databaseONE but an ordinary user as name@databaseTWO.

There is a caution on this option that says:

Note: This feature is intended as a temporary measure until a complete solution is found. At that time, this option will be removed.

It is suggested that some serious consideration of this warning made before using it.

Contributors: Oliver Elphick olly at lfix.co.uk javier garcia andresjavier.garcia at wanadoo.es elein at varlena.com

Measuring query time
[GENERAL] Measuring CPU time use? 18-Dec-2002

When measuring the efficacy of a query, it is rarely the CPU time you are interested in. It almost always is the I/O time.

Plain, familiar EXPLAIN ANALYZE is really the best tool to use to analyze how good a query is. Red Hat also recently released their "Visual Explain 1.1" tools.

psql does have a timing feature that gives the elapsed time of a query along with the results. (If you do EXPLAIN ANALYZE with \timing on, however, it will time the EXPLAIN ANALYZE rather than the query.)

	=# \timing
	=# select * from pg_class;
	Time: 2.52 ms

Another alternative is to log statement statistics with the configuration option show_statement_stats, however the EXPLAIN ANALYZE is much clearer. NOTE: show_statement_stats will be renamed log_statement_stats, probably in 7.4.

	LOG:  QUERY STATISTICS
	! system usage stats:
	!       3.843989 elapsed 2.220000 user 0.340000 system sec
	!       [2.260000 user 0.360000 sys total]
	!       1797/0 [1874/4] filesystem blocks in/out
	!       0/0 [0/1] page faults/reclaims, 0 [0] swaps
	!       0 [0] signals rcvd, 0/144 [5/149] messages rcvd/sent
	!       1410/238 [1458/246] voluntary/involuntary context switches
	! buffer usage stats:
	!       Shared blocks:       1799 read,          0 written, buffer hit rate = 7.08%
	!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
	!       Direct blocks:          0 read,          0 written

Brute force methods to measure query duration are also quite common. For example the unix time facility enables the following command. It is the real time is what is pertinent since sys and user measure the time psql takes rather than the backend. I also found that having the results paged (via less) also affects the overall time.

	time psql -c "select * from pg_class;"
	real    0m5.745s
	user    0m0.040s
	sys     0m0.000s

Contributors: Jessica Blank jb at twu.net, Joe Conway mail at joeconway.com, scott.marlowe scott.marlowe at ihs.com, Justin Clift justin at postgresql.org, Tom Lane tgl at sss.pgh.pa.us, Keehan Mallon keehan_mallon at yahoo.com, Alvaro Herrera alvherre at dcc.uchile.cl

DISTINCT operators
[GENERAL] select distinct point? 03-Jan-2002

It is important to understand the operators' semantics when using geometric data types. It is also important to understand how different operators affect each other.

When you use DISTINCT, it relies on the availability of a comparison operator ('<') and an equal operator ('=') to sort and eliminate duplicates.

For example, there is no '<' operator for points and therefore you cannot SELECT DISTINCT for that column:

	slo= create table test (p point);
	CREATE TABLE
	slo= select distinct * from test;
	ERROR:  Unable to identify an ordering operator '<' for type 'point'
        	Use an explicit ordering operator or modify the query

You can select DISTINCT over boxes because boxes has both comparison and equality operators.

However, you must be aware of the semantics of the equality operator for boxes when using DISTINCT. As defined, Boxes are equal if they have the same area, not if they are constructed from the same points. So the following query may be a surprise if you are not aware of that fact:

	regression=# create table test2 (b box);
	CREATE TABLE
	regression=# insert into test2 values ('(1,1), (2,2)');
	INSERT 680713 1
	regression=# insert into test2 values ('(1,1), (1.5,3)');
	INSERT 680714 1
	regression=# select * from test2;
       	b
	---------------
 	(2,2),(1,1)
 	(1.5,3),(1,1)
	(2 rows)

	regression=# select distinct * from test2;
      	b
	-------------
 	(2,2),(1,1)
	(1 row)

If you are a developer of data types, consider the effect that the semantics of your operators will have on SORT and DISTINCT. Ensure that the semantics for your datatype are conventional for the object in question.

Contributors: Greg Stark gsstark at mit.edu, Tom Lane tgl at sss.pgh.pa.us, elein at varlena.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top