In Issue #70 an algorithm was described enabling the first N rows in a grouped query. This query returned on N rows for each group.
SELECT * FROM table70 WHERE id IN ( SELECT id FROM table70 AS topn WHERE topn.query = table70.query ORDER by score, id LIMIT 2 ) ORDER BY query
It was also brought to my attention by Troels Arvin that not everyone thinks of "Top N" in this way. In other definitions of Top N, you may have more than N rows when there are tie conditions for the values in the group. For example, from Troel's RDBMS Writings, given the person table, the three youngest people might be four people of two ages.
SELECT * FROM person ORDER BY age ASC; +----------+-------------+-----+ |PERSON_ID | PERSON_NAME | AGE | +----------+-------------+-----+ | 7 | Hilda | 12 | | 8 | Bill | 12 | | 4 | Joe | 23 | | 2 | Veronica | 23 | | 3 | Michael | 27 | | 9 | Marianne | 27 | | 1 | Ben | 50 | | 10 | Michelle | 50 | | 5 | Irene | 77 | | 6 | Vivian | 77 | +----------+-------------+-----+Now, we only want the three (n=3) youngest persons displayed, i.e. a result set like this:
+----------+-------------+-----+ |PERSON_ID | PERSON_NAME | AGE | +----------+-------------+-----+ | 7 | Hilda | 12 | | 8 | Bill | 12 | | 4 | Joe | 23 | | 2 | Veronica | 23 | +----------+-------------+-----+
The difference depends on exactly what you want in your query. Using ORDER BY with LIMIT N will give you exactly N rows. You will have to write a more complex function to be able to handle the tie situations in a consistent and logical way.
As you probably saw in Robert Treat's Weekly news, the SQL2003 was finally published. An overview of some of the changes can be seen SQL:2003 Has Been Published by Andrew Eisenberg, et al. (You may need to purchase this document or be an ACM member to access this document.)
In this article, I will informally review some of the changes reviewed by Eisenberg, et al. to see how the PostgreSQL addresses these standards. The last four items in the list below are not covered in this article. They will be covered in a future article.
Statements about whether PostgreSQL supports or does not support an element is based on 7.4 at this time. Plans for 7.5 are not addressed. Quoted remarks are from the SQL:2003 Has Been Published Document.
Overview Summarized into an Executive Summary of Changes
BIT and BITVARYING data types removedPostgreSQL supports both bit and bit varying types. The reason the committee stated that these were removed was the lack of support of these types in "existing SQL database engines, and the lack of expected support in the future." This is certainly not the case for PostgreSQL. Below is an example of input support for these types.
create table bits ( b bit, bv bit varying); insert into bits values ( '1'::bit, '1'::bit varying); insert into bits values ( '101'::bit, '101'::bit varying); insert into bits values ( 'xF'::bit, 'xF'::bit varying); elein=# select * from bits; b | bv ---+------ 1 | 1 1 | 101 1 | 1111 (3 rows)
BIGINT data type addedThe BIGINT type may be a representation of a 64 bit or a 32 bit integer depending on the implementation and platform. PostgreSQL supports BIGINT. It is equivalent to an int8 in PostgreSQL.
Permissions on running ProceduresThis item adds the ability to define a function so that it runs with either the privileges of its definer or the privileges of its invoker. This feature is supported in PostgreSQL with the CREATE FUNCTION options [EXTERNAL] SECURITY INVOKER and [EXTERNAL] SECURITY DEFINER. The default is to run the function with the permissions of the invoker of the function.
Parameter syntax in ProceduresThis SQL2003 feature allows the use of function_name.parameter_name inside of procedures in order to avoid name conflicts when columns referenced inside of the procedure are the same as the parameter name.
PostgreSQL does not support named parameters in its function definitions so it does not support this syntax which uses named parameters. However, the underlying problem of conflicting names is solved with PostgreSQL with the use of the ALIAS declaration of the parameters in plpgsql. The example below shows the different ways this is used. Both functions are only minimally defined in order to show the use of the named parameter and alias.
Extensions added to CREATE TABLE ASThis extension enables renaming of columns when using the CREATE TABLE AS statement. PostgreSQL supports this. It is handy for rearranging and renaming columns.
elein=# select * from yyy; one | two | data -----+-----+------ 2 | | 3 3 | | 4 4 | | 5 (3 rows) elein=# create table xxx (a, b, c) as (select * from yyy); SELECT elein=# select * from xxx; a | b | c ---+---+--- 2 | | 3 3 | | 4 4 | | 5 (3 rows)
An interesting note in the SQL2003: Published document is that
the syntax of the CREATE TABLE AS statement is similar to what is known as Materialized Query Tables (MQTs) or Materialized Views in many commercial products, the semantics are different.The semantics are different in that there is no dependency between the tables. Materialized views have an implied relationship to the underlying tables.
Sequence Generator AddedSQL2003 added Sequences to the specification. PostgreSQL already supports this with two extra options. The options specified by the spec are:
Identify Column typeThe equivalent of an IDENTITY column in PostgreSQL is a SERIAL or SERIAL8. This feature creates a sequence associated with a column in a table. As specified the sequence can also have its parameters set (see above) in the column definition like this:
CREATE TABLE parts ( partnum INTEGER GENERATED ALWAYS AS IDENTITY ( START WITH 100 INCREMENT BY 10 MIN VALUE 10), partname VARCHAR(40) );The PostgreSQL equivalent requires that you ALTER the sequence after creation if you want options different from the default options. And this requires that you know the sequence name. In PostgreSQL the default sequence name is tablename_columnname_seq.
CREATE TABLE parts ( partnum SERIAL, partname VARCHAR(40) ); ALTER SEQUENCE 'parts_partnum_seq' START WITH 100 INCREMENT BY 10 MIN VALUE 10;
PostgreSQL allows you to say SERIAL or SERIAL8 where the spec requires that you declare the column as INTEGER AS IDENTITY or BIGINT AS IDENTITY.
An interesting limitation in the specifications is that one table can have at most one IDENTITY column. In PostgreSQL you may have as many SERIAL columns as you like in a table.
There is one more option on the 2003 specification which is GENERATED ALWAYS or GENERATED BY DEFAULT. The first raises an error if any value inserted into the column other than DEFAULT. The second option will allow assignment of the value in an UPDATE or INSERT statement. PostgreSQL supports the second option by default only and does not support the GENERATED ALWAYS option.
Generated Column typeGenerated column types are the ability to define a column in a table based on an expression using other columns in a row. Like the IDENTITY column, the option GENERATED ALWAYS is specified. For example:
CREATE TABLE employees ( emp_id integer, salary decimal(7,2), bonus decimal(7,2), total_com GENERATED ALWAYS AS (SALARY + BONUS) );The value of total_com is stored when a row is INSERTED into the table. The overview did not say whether the value of the generated column is updated when the base columns are updated. It only addressed the issue of INSERT. It specifically gave an example where the option GENERATED BY DEFAULT is not supported. When a row is inserted, the value of the generated column must not have a value. It must be designed DEFAULT.
insert into employees values (1234, 75000, 1500, DEFAULT );
This behavior is similar to virtual columns. Virtual columns had more or less the same declaration, but the value was not stored. It was calculated on the fly for each selection. Virtual columns were supported in Illustra. PostgreSQL does not support virtual columns.
PostgreSQL does not support generated column types in the CREATE TABLE definitions. The same behavior can be created using a TRIGGER function. With a TRIGGER function, the functionality is less limiting.
Extensions added to CREATE TABLE LIKE
The CREATE TABLE LIKE statement has been extended to include the addition of GENERATED columns and IDENTITY columns. PostgreSQL does not support the syntax in CREATE TABLE so it does not support it in CREATE TABLE LIKE.
It is important to note that in PostgreSQL when creating a table LIKE another table that has a serial column, a new sequence is not created. That must be done by a separate CREATE SEQUENCE and ALTER TABLE to set the default for the column to be the nextval() of the column.
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009