PostgreSQL is not a good database to use on read-only media. For example, distributing a read-only thesaurus application on a CD would not work.
PostgreSQL tries to write certain files as it runs, even when the queries are only SELECTs.
Alternative databases may be better suited for this type of application distribution; several were suggested on the list.
As long as PostgreSQL requires write access even for read-only transactions, it will remain a non-viable solution for a database-on-CD application.
There are two basic ways to do enumerated values in PostgreSQL. The first is with a CHECK constraint on the column(s) involved and the second is with a lookup table.
To use a CHECK constraint, you would define the column when defining the table, like this:
CREATE TABLE foo (
   ...
   enum text CHECK ( enum IN ('a','b','c')),
   ...
);

The primary drawback of this approach is its inflexibility and its inability to share enumerated lists. The CHECK constraints have to be written out for each column (or put into a function). If you want to add the constraint after the table is already defined, or if you want to change the enumerated values, you have to be running 7.4.
Using a lookup table leverages referential integrity and allows easy updates to the enumerated data set.
CREATE TABLE enum_abc (val text NOT NULL PRIMARY KEY);
INSERT INTO enum_abc VALUES ('a');
INSERT INTO enum_abc VALUES ('b');
INSERT INTO enum_abc VALUES ('c');

CREATE TABLE foo (
   ...
   enum text REFERENCES enum_abc (val),
   ...
);

If the enumerated values are very long, it would be faster to use an alternate integer key in the enum_abc table and have foo.enum reference that key instead of the value. The drawback of this approach is that it uses more space, and if you use an alternate key you will need a JOIN to bring in the actual enumerated values when selecting from foo.
The simplicity of lookup tables is compelling whenever the enumerated data set is not cast in stone. This, I think, is true even with the ability to alter domain constraints added in 7.4.
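The integer-key variant mentioned above might look like this sketch (the enum_abc2 and foo2 names are hypothetical):

```sql
-- Lookup table with a surrogate integer key (names are hypothetical).
CREATE TABLE enum_abc2 (
    id  integer NOT NULL PRIMARY KEY,
    val text    NOT NULL UNIQUE
);
INSERT INTO enum_abc2 VALUES (1, 'a');
INSERT INTO enum_abc2 VALUES (2, 'b');
INSERT INTO enum_abc2 VALUES (3, 'c');

CREATE TABLE foo2 (
    enum_id integer REFERENCES enum_abc2 (id)
);

-- A JOIN is now needed to recover the actual enumerated value.
SELECT e.val
FROM foo2 f JOIN enum_abc2 e ON f.enum_id = e.id;
```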
One of the first things to do when a query is slow is to verify that the data types in the expressions are compatible. For example, when qualifying a constant number against a bigint, it is strongly suggested that you cast the number explicitly to a bigint. This idea was discussed in Issue #36.
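For instance, given a hypothetical table with a bigint column, the explicit cast looks like this:

```sql
-- t and id8 are hypothetical names; id8 is a bigint column.
CREATE TABLE t (id8 bigint PRIMARY KEY);

-- Without the cast, the constant 42 is treated as an int4, and the
-- cross-type comparison can prevent use of the index on id8.
SELECT * FROM t WHERE id8 = 42::bigint;
-- CAST(42 AS bigint) is the SQL-standard spelling of the same cast.
```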
A question brought up on the HACKERS list pointed out that foreign key references can have the same problem.
CREATE TABLE a (b int4 UNIQUE);
CREATE TABLE c (d int8 REFERENCES a (b));
This foreign key lookup would be faster if both a.b and c.d were the same datatype.
The discussion on HACKERS centered around raising a NOTICE or a WARNING when creating a mismatched FOREIGN KEY. This notice would be very helpful. However, it is also important to keep a close eye on your relations so the mismatch does not happen in the first place. Although the mismatched FOREIGN KEY shown above will work exactly as expected, it is not as fast or clean. Foreign key columns should have the same datatype as the columns they reference.
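A matched version of the example above, with both columns declared as the same type:

```sql
-- Declaring both columns int8 avoids the cross-type foreign key lookup.
CREATE TABLE a (b int8 UNIQUE);
CREATE TABLE c (d int8 REFERENCES a (b));
```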
There is no behavior difference between varchar and text. The difference between varchar(n) and text is that the length of the value for varchar(n) is limited to n. The internal storage for both (all three?) types is the same, although each has its own unique type identifier.
Both varchar[(n)] and text exist because in the original postgres, text came first. Then varchar[(n)] was added for SQL spec compliance. There are no plans to eliminate one type in favor of the other or to consolidate their type identifiers. Since they share the same internal representation, that, too, is unlikely to change.
The reasons to use varchar[(n)] are SQL spec compliance and compatibility with ODBC applications, whose drivers expect length-limited character fields. Consistency in the definition of character text fields in your database definition will help speed up your queries. It is suggested that you consistently use text for non-ODBC apps.
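To illustrate the length limit, a value longer than n is rejected for varchar(n) (the table name here is hypothetical):

```sql
CREATE TABLE v (s varchar(3));
INSERT INTO v VALUES ('abc');   -- accepted: length 3
INSERT INTO v VALUES ('abcd');  -- rejected: value too long for varchar(3)
```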
An identifier is the name of an object: a table name, a schema name, a column name, a type name, etc. Identifiers have special rules with regards to case and characters.
By default, PostgreSQL folds the case of all identifiers so that case is ignored. To use case-sensitive identifiers, double quote them each time you reference them.
SELECT foo, FOo, Foo, fOO                -- all fold down to "foo"
SELECT "foo", "FOo", "Foo", "fOO"        -- are all different

Quoting identifiers works with dots like this:

SELECT * FROM "MySchema"."MyTable";

The quotes are only around the identifier and do not encompass the dot.
Current and former Oracle people tend to type table names in upper case from habit. This is fine until you use an application like PostgreSQL Manager Pro, which double quotes identifiers by default. After creating all of the tables, etc. as "UPPERTABLE", all references thereafter must be "UPPERTABLE".
PostgreSQL is not spec compliant in its default folding method. Most databases fold to upper case while PostgreSQL folds to lower case. This means that in another database, a table defined as "UPPER" could be referred to later as upper. In PostgreSQL it is the opposite: a table defined as "upper" can be referred to later as UPPER. This is not likely to change.
The most common way to deal with identifiers is to never quote them and let them fold to lower case. If you must have a case sensitive identifier, then double quote it when you use it.
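The folding behavior can be sketched like this (the table names are hypothetical):

```sql
CREATE TABLE plain_table (x int);    -- stored as plain_table
CREATE TABLE "MixedTable" (x int);   -- stored as MixedTable, case preserved

SELECT * FROM PLAIN_TABLE;    -- works: folds to plain_table
SELECT * FROM MixedTable;     -- fails: folds to mixedtable
SELECT * FROM "MixedTable";   -- works: quoted, exact match
```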
PgSQL, Inc. is making up a new batch of PostgreSQL tee shirts and needs a good slogan for this image. Both the image and the logo will be on the back.
The current idea is "Of the people, For the people", in reference to the people on the ground leading the elephant, which is bearing the weight of the people on its back ...
I think we can come up with an even snappier line that will stick in people's minds. Send your suggestions here!
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009