Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

8-Sep-2003 Issue: 42

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Read Only Database Media
[GENERAL] delivering database stand-alone	01-Sept-2003

PostgreSQL is not a good database to use on read-only media. For example, distributing a read-only thesaurus application on a CD would not work.

PostgreSQL tries to write certain files as it runs even if the queries are just SELECT.

log file -- this can be directed to /dev/null
pgstat.stat -- Stats can be turned off
pg_clog -- Used to track transaction states

There are ways to stop the writing to the log file and pgstat.stat, but pg_clog cannot be turned off. It is unclear whether relocating the pg_clog file on installation of the CD is a viable solution.

Alternative databases may be better for this type of application distribution. Suggestions were made for:

Dan Bernstein's Constant Database
- Hash based. No sorting or partial matches.
- No SQL
- Licensing confusion issues
Embedded Firebird
- Proprietary Software
ISAM files
Various OO Databases

While PostgreSQL requires writing for read-only transaction it will remain a non-viable solution for a Database on CD application.

Contributors: Joost Kremers joostkremers at fastmail.fm, Bruce Momjian pgman at candle.pha.pa.us, Ang Chin Han angch at bytecraft.com.my, Christopher Browne cbbrowne at acm.org, Jacob Hanson jacobhanson at firsthealth.com, Ron Johnson ron.l.johnson at cox.net, Dann Corbit DCorbit at connx.com

Enumerated Values
[GENERAL] What is the good equivalent for ENUM ? 3-Sept-2003

There are two basic ways to do enumerated values in PostgreSQL. The first is with a CHECK constraint on the column(s) involved and the second is with a look up table.

To use a CHECK constraint, you would define the column when defining the table, like this:

	CREATE TABLE foo (
		...
		enum	text	CHECK ( enum IN ('a','b','c')),
		...
	);

The primary draw back of this methodology is its inflexibility and its inability to share enumerated lists. The CHECK constraints have to be written out for each column (or put into a function). If you wanted to add the constraint after the table was already defined or if you wanted if you wanted to change the enumerated values, you have to be running 7.4.

Using a lookup table leverages the referential integrity and allows easy updates to the enumerated data set.

	CREATE TABLE enum_abc (val text NOT NULL PRIMARY KEY);
	INSERT INTO enum_abc values ('a');
	INSERT INTO enum_abc values ('b');
	INSERT INTO enum_abc values ('c');
	
	CREATE TABLE foo (
		...
		enum text REFERENCES enum_abc (val),
		...
	);

If the enumerated values are very long, it would be faster to use an alternate integer key in the enum_abc table and have foo.enum reference the key instead of the value. The draw back of this methodology is that it uses more space and if you need to have an alternate key then you will need to use JOIN to bring in the actual enumerated values when selecting from foo.

The simplicity of the lookup tables is compelling whenever the enumerated data set is not cast in stone. This, I think is true even with the addition of the ability to ALTER DOMAIN CONSTRAINTS in 7.4.

Contributors: Bruno BAGUETTE pgsql-ml at baguette.net, Shridhar Daithankar shridhar_daithankar at persistent.co.in, Vivek Khera khera at kcilink.com, Dennis Gearon gearond at fireserve.net, Ron Johnson ron.l.johnson at cox.net, Bruce Momjian pgman at candle.pha.pa.us, scott.marlowe scott.marlowe at ihs.com, Joseph Hepburne Healy j.healy at ugrad.unimelb.edu.au

Another way to slow down queries
[HACKERS] FK type mismatches? 4-Sept-2003

One of the first things to do when a query is slow is to verify that the data types in the expressions are compatible. For example, when qualifying a constant number against a bigint, it is strongly suggested that you cast the number explicitly to a bigint. This idea was discussed in Issue #36.

A question brought up on the HACKERS list pointed out that foreign key references can have the same problem.

	CREATE TABLE a (b int4 UNIQUE);
	CREATE TABLE c (d int8 REFERENCES a (b));

This foreign key lookup would be faster if both a.b and c.d were the same datatype.

The discussion on HACKERS centered around raising a NOTICE or a WARNING when creating a mismatched FOREIGN KEY. This notice would be very helpful. However, it is also important to keep a close eye on your relations and not have it happen in the first place. Although the mismatched FOREIGN KEY shown above will work exactly as expected, it is not as fast or clean. By definition FOREIGN KEYS should be the same datatype.

Contributors: Neil Conway neilc at samurai.com, Tom Lane tgl at sss.pgh.pa.us, Peter Eisentraut peter_e at gmx.net, Bruce Momjian pgman at candle.pha.pa.us, Robert Treat xzilla at users.sourceforge.net

What is the difference between text and varchar?
[GENERAL] Limited varchar, unlimited varchar, or text? 24-Jul-2003

There is no behavior difference between varchar and text. The difference between varchar(n) and text is that the length of the value for varchar(n) is limited to n. The internal storage for both (all three?) types is the same, although each has its own unique type identifier.

Both varchar[(n)] and text exist because in the original postgres, text came first. Then varchar[(n)] was added for SQL spec compliance. There are no plans to eliminate one type in favor of the other or consolidate their unique type identifiers. Since they share the same internal representation, however, that also is unlikely to change.

The reasons to use varchar[(n)] are:

ODBC handling of varchar is better than text
Enforcing limits on data lengths.
Portability between Databases.

The reasons to use text are:

Fewer casts, implicit or explicit (faster)
It is the native character type
No need to enforce length limits

Consistency in the definition of character text fields in your database definition will help speed up your queries. It is suggested that you consistently use text for non-ODBC apps.

Contributors: Curtis Hawthorne mr_person at mrperson.org, Dmitry Tkach dmitry at openratings.com, scott.marlowe scott.marlowe at ihs.com, Andrew Ayers aayers at eldocomp.com, Tom Lane tgl at sss.pgh.pa.us, Greg Stark gsstark at mit.edu

Delimited Identifiers and PostgreSQL Defaults
[GENERAL] Can I turn the case sensitive off 24-Jul-2003

An identifier is the name of an object: a table name, a schema name, a column name, a type name, etc. Identifiers have special rules with regards to case and characters.

By default PostgreSQL folds the case of all identifiers so that case is ignored. To use case sensitive identifiers, each time you reference them, use double quoted.

    SELECT foo, FOo, Foo, fOO          -- all fold down to "foo"
    SELECT "foo", "FOo", "Foo", "fOO"  -- are all different

Quoting identifiers works with dots like this:

	SELECT * from "MySchema"."MyTable";

The quotes are only around the identifier and do not encompass the dot.

Current and former Oracle people tend to type the tables names in upper-case from habit. This is fine until you use an application like like PostgreSQL Manager Pro which double quotes identifiers by default. After creating of all of the tables, etc. as "UPPERTABLE", all references thereafter must be "UPPERTABLE".

PostgreSQL is not spec compliant in it default folding method. Most databases fold to upper case while PostgreSQL folds to lower case. This means that in another database, defining a table as "UPPER" could be referred to later as upper. In postgreSQL it is the opposite. Defining a table as "upper" you will be able to refer to it later as UPPER. This is not likely to change.

The most common way to deal with identifiers is to never quote them and let them fold to lower case. If you must have a case sensitive identifier, then double quote it when you use it.

Contributors: Terence Chang TChang at nqueue.com, Arguile arguile at lucentstudios.com, Stephan Szabo sszabo at megazone.bigpanda.com, Thomas Kellerer spam_eater at gmx.net, Reuben D. Budiardja techlist at voyager.phys.utk.edu, Andrew Sullivan andrew at libertyrms.info, Ron Johnson ron.l.johnson at cox.net, Tom Lane tgl at sss.pgh.pa.us

Looking for a good Logo
[pgsql-advocacy] Need a good slogan to use for ... 06-Sept-2003

PgSQL, Inc. is making up a new batch of PostgreSQL tee shirts and needs a good slogan for this image. Both the image and the logo will be on the back.

The current idea is "Of the people, For the people", in reference to the people on the ground leading the elephant, which is bearing the wait of the people on his back ...

I think we can come up with an even snappier line that will stick in people's minds. Send your suggestions here!

Contributors: Marc G. Fournier scrappy at hub.org

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top