varlena
varlena
PostgreSQL Training,
Consulting & Support
General Bits
By A. Elein Mustain

06-Oct-2003 Issue: 46

Archives | General Tidbits | Google General Bits | Docs | Castellano | PortuguÍs | Subscriptions | Notifications | | Prev

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

The Answers to the Trivia Quiz
A for everyone who tried! 03-Oct-2003

  • 1. What does SELECT coalesce( x, y ); return when:
      coalesce() takes the first argument if it is not NULL otherwise it returns the second argument. coalesce() can take a variable number of arguments. It returns the first non-null argument or null if the last argument is NULL.
    • a. x is NULL, y is 16
      • coalesce(NULL,16) returns 16
    • b. x is 16, y is NULL
      • coalesce(16,NULL) returns 16
    • c. x is NULL, y is NULL
      • coalesce(NULL,NULL) returns NULL
    • d. x is 16, y is 17
      • coalesce(16,17) returns 16

  • 2. Will these WHERE clauses use an index if one exists on the column?
      Most of these where clause have different data types in the expressions. When a type has to be converted, it usually does not use the index.
    • a. WHERE timestamp_value = '12/31/59'
      • No. a timestamp and a text string are different types.
    • b. WHERE bigint_value = 3
      • No. A bigint and an integer are different types.
    • c. WHERE text_value LIKE '% the %'
      • No. LIKE clauses must scan the whole table.
    • d. WHERE text_value in ( 'val_one', 'val_two', 'val_three')
      • No. IN, like OR, clauses must scan the whole table.
    • e. which one of these will give you an unexpected answer?
      • a. The text will be converted into a timestamp, not a date. If you expected the timestamp to be truncated to a date value, you are wrong. Full timestamps rarely match a date with no hours, minutes, seconds.

  • 3. What would the ORDER BY clause be when you wanted to sort NULLs in a column to the top, but keep the remaining columns in regular sort order?
      d. ORDER BY column IS NOT NULL ASC, column
      alternatively ORDER BY COLUMN IS NULL DESC, column
    • a. ORDER BY NULL
    • b. ORDER BY column WITH NULL, column
    • c. ORDER BY column IS NOT NULL ASC
    • d. ORDER BY column IS NOT NULL DESC, column
    • e. ORDER BY column

  • 4. In postgresql.conf, max_fsm_relations should be set to:
      c. The number of all of the tables in all of the databases.
    • a. 10
    • b. The maximum number of connections allowed
    • c. The number of all of the tables in all of the databases.
    • d. The amount of free cache space on your hard drive

  • 5. Timestamps with timezones are stored with the:
      a. UTC (Notice that the input timezone is not stored.)
    • a. UTC time zone
    • b. Server time zone
    • c. Client time zone
    • d. Specified time zone

  • 6. Which of the following words should not be used as identifiers in PostgreSQL:
      DEFAULT, IS, PRIMARY,and SOME should not be used as identifiers. The PostgreSQL and SQL reserved words are listed in:
      • 7.3 Users Guide in Appendix B
      • 7.4 Chapter 8 (Appendices) Appendix B in 7.4 documentation.
      ARE BEGIN CURRENT_ROLE DEFAULT
      IS MOVE NOT OFFSET
      OPERATOR PRIMARY SOME UNDER

  • 7.1 RULES should be used for functionality applicable to:
  • 7.2 TRIGGERS should be used for functionality applicable to:
      RULES operate on an entire query and are used to create views.
      TRIGGERS operate on a row by row basis and are implemented with functions.
    • a. a query
    • b. each row
    • c. views
    • d. functions

  • 8. A plpgsql function is executed
      All pl functions are executed in the postmaster server, also called the backend.
    • a. in the server
    • b. in the client application
    • c. in psql
    • d. in either the client or the server

  • 9. A function created with IMMUTABLE means that:
      b. It will always return the same value for the same input.
    • a. The function will never change
    • b. It will always return the same value for the same input
    • c. The function is expected to never raise an error
    • d. The function returns null for a null argument

  • 10. A function created with STRICT means that:
      c. It will not be run if an argument is null.
    • a. It should not be cached
    • b. It should raise an error on any failure
    • c. Will not be run if an argument is null
    • d. It will always return a non null value

Contributors: elein at varlena.com
Redirecting errors
[SQL] Capturing pgsql ERRORS/NOTICES to file 26-Sep-2003

When running psql from the shell you can control both standard out and standard error. Psql will also help format your output. Here are some common examples of using psql from the shell.

Redirect query output.

	psql -c "select * from mytable;" > query.out 
Redirect standard error to the same file as standard out
	psql -c "select * from mytable;" > query.out 2>&1 
Redirect query input.
	psql -c "select * from mytable;" < query.sql 
Inline query input. Make sure the end tag starts in the first column
	psql << QRY_END
	update mytable set prod_count = prod_count -1 where prod_id = 117;
	QRY_END
Comma separated value file.
-A suppresses alignment. -t surpresses table headers and column names.
	psql -At -F ','  -c "select * from mytable;" > query.csv
HTML output.
More handy if called from a php or perl client for a web app.
	psql -H -c "select * from mytable" 

For more information about redirection and shell scripting, see man for the shell you use. These examples use bash but will work in most shells.

For more information about psql's formatting capabilities, see man psql or the PostgreSQL Reference Guide under Client Applications.

Contributors: George Weaver georgew1 at mts.net, Josh Berkus josh at agliodbs.com, Wei Weng wweng at kencast.com, elein at varlena.com
What DISTINCT ON can show you
[GENERAL] difference when using 'distinct on' 12-Sep-2003

The DISTINCT qualifier to the SELECT statement enables you to eliminate duplicates from a target list. This used in conjunction with an ORDER BY statement can give you some interesting information about your data.

This table definition will be used as an example. It contains two schools, three tests in two subjects and six students.

	create table fortysix (
	   school      text,
	   subject     text,
	   test        integer,
	   student_id  integer,
	   score       integer );

Since all of the rows are unique (barring data entry errors), the following statement will return all of the rows.

	SELECT DISTINCT school, subject, test, student_id, score
	FROM fortysix;
That is not particularly useful. However, if you limit the target list you can find out what schools were involved, what subjects were involved and what students were involved.
	=# -- What schools were involved
	=# select distinct school from fortysix;
	 school 
	--------
	 Carter
	 Ohlone
	(2 rows)
	
	=# 
	=# -- What subjects were involved
	=# select distinct subject from fortysix;
	 subject 
	---------
	 Math
	 Reading
	(2 rows)
	
	=# 
	=# -- What students were involved in each school
	=# select distinct school, student_id from fortysix;
	 school | student_id 
	--------+------------
	 Carter |       1024
	 Carter |       1025
	 Carter |       1026
	 Ohlone |       1027
	 Ohlone |       1029
	 Ohlone |       1400
	(6 rows)

Using DISTINCT ON (), particularly with an ORDER BY can give you even more information. The columns listed in the parentheses are the values that must be unique. To control which of the actual rows gets selected, you can sort by what you want; you must also include the DISTINCT columns in the ORDER BY clause. In the first case, we sort on the score descending in order to select the row with the highest score.

	=# -- Who got the highest score on any test in each subject in each school
	=# select distinct on (school, subject) school, subject, test, student_id, score
	-# from fortysix
	-# order by school, subject, score desc;
	 school | subject | test | student_id | score 
	--------+---------+------+------------+-------
	 Carter | Math    |    1 |       1026 |   100
	 Carter | Reading |    1 |       1026 |   100
	 Ohlone | Math    |    3 |       1400 |    97
	 Ohlone | Reading |    3 |       1400 |   100
	(4 rows)
	
	=# 
	=# -- Who got the highest score on each test in each subject in each school
	=# select distinct on (school, subject, test) 
	=#    school, subject, test, student_id, score
	-# from fortysix
	-# order by school, subject, test, score desc;
	 school | subject | test | student_id | score 
	--------+---------+------+------------+-------
	 Carter | Math    |    1 |       1026 |   100
	 Carter | Math    |    2 |       1026 |    99
	 Carter | Math    |    3 |       1026 |   100
	 Carter | Reading |    1 |       1026 |   100
	 Carter | Reading |    2 |       1026 |   100
	 Carter | Reading |    3 |       1026 |    99
	 Ohlone | Math    |    1 |       1027 |    92
	 Ohlone | Math    |    2 |       1400 |    95
	 Ohlone | Math    |    3 |       1400 |    97
	 Ohlone | Reading |    1 |       1027 |    99
	 Ohlone | Reading |    2 |       1400 |    97
	 Ohlone | Reading |    3 |       1400 |   100
	(12 rows)
	
	=# 
	=# -- Which student got the highest score on each test in each subject
	=# select distinct on (subject) school, subject, student_id, score
	-# from fortysix
	-# order by subject, score desc;
	 school | subject | student_id | score 
	--------+---------+------------+-------
	 Carter | Math    |       1026 |   100
	 Carter | Reading |       1026 |   100
	(2 rows)
In the third example only one score was selected for each subject although more than one may have gotten the score.

DISTINCT ON () can be used similarly to a GROUP BY clause, however, there is no aggregate required. Notice that you must include the columns listed in the DISTINCT again in the target list if you want them included in the output. Also notice that there is no comma separating the DISTINCT clause from the target list.

Pay careful attention to the natural language statements which describe the queries. They tell you how to construct the query. In the second query, the question is:

Who got the highest score on each test in each subject in each school
The superlative highest (or lowest) tells us what column to order by and in what direction. The word each signifies the grouping. In English, each test, subject, school very usually corresponds to grouping by the opposite order: school, subject, test.
	=# select distinct on (school, subject, test) 
	-#    school, subject, test, student_id, score
	-# from fortysix
	-# order by school, subject, test, score desc;

Contributors: Johnson, Shaunn SJohnson6 at bcbsm.com, Bruno Wolff III bruno at wolff.to, Stephan Szabo sszabo at megazone.bigpanda.com, elein at varlena.com
Sequence Tricks for Statistical Granularity
[SQL] Result set granularity.. 27-Sep-2003

For statistical reasons you may want to sample a set of data instead of calculating results based on the entire data set. For example, you may want to select one out of every 5 rows.

You can do this using a temporary sequence. Create the temporary sequences and select the next value for each row in the query. Do this in a subselect and in the main selection filter out based on the sequence modulo 5.

	CREATE TEMP SEQUENCE number;
	SELECT * FROM ( 
	   SELECT *, nextval('number') AS number FROM datatable ) foo
	WHERE foo.number % 5 = 0;

If you want every 5th row from an ordered sequence, simply order by the appropriate column.

Contributors: Rasmus Aveskogh rasmus at defero.se, Harald Fuchs nospam at sap.com, scott.marlowe scott.marlowe at ihs.com
Transaction Times redux
[SQL] now() in loop statement 29-Sep-2003

The differences between clock time and transaction time functions was covered previously in Issue # 16. The question keeps coming up so here is more explanation in terms of statements and transactions.

Timestamp functions in PostgreSQL are:

functionValueData Type
current_timestampTransaction Start Timestamptimestamp with timezone
localtimestampTransaction Start Timestamptimestamp without timezone
current_dateTransaction Start Datedate
current_timeTransaction Start Timetime with timezone
localtimeTransaction Start Timetime
now()Transaction Start Timestamptimestamp with timezone
timeofday()Clock timestamp text
To the point, if you want clock time, use timeofday(). If transaction start time is adequate, use that--it is microseconds faster.

All of these functions return the time that the current transaction began as a timestamp, date or time except timeofday() which returns the actual time of day as a text field.

To make timeofday() return a timestamp or date cast the function as a timestamp or date. To make it return a time type, cast it first to a timestamp and then to time.

Casted SelectionValue
select timeofday()::timestamp;2003-10-04 19:32:58.236439
select timeofday()::date;2003-10-04
select timeofday()::timestamp::time;19:33:14.039576

If you have one of the transaction time functions as a DEFAULT in a column, the the time will be set to the transaction time of the insert. When using the transaction time functions in block transactions, all references to any of them will be the same. This is very useful when syncronizing timestamps on a set of records. A trigger is part of the transaction of the SQL statement causing it to be invoked. So any reference to transaction time functions in trigger functions will be the same as any reference in any other part of the originating statement.

Contributors: Kerv keresztes_m at yahoo.com, Tomasz Myrta jasiek at klaster.net, Dan Langille dan at langille.org, Christoph Haller ch at rodos.fzk.de


Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009

Top
Google
Search General Bits & varlena.com Search WWW