When running psql from the shell you can control both standard out and standard error. Psql will also help format your output. Here are some common examples of using psql from the shell.
Redirect query output.
psql -c "select * from mytable;" > query.outRedirect standard error to the same file as standard out
psql -c "select * from mytable;" > query.out 2>&1Redirect query input.
psql -c "select * from mytable;" < query.sqlInline query input. Make sure the end tag starts in the first column
psql << QRY_END update mytable set prod_count = prod_count -1 where prod_id = 117; QRY_ENDComma separated value file.
-A suppresses alignment. -t surpresses table headers and column names.
psql -At -F ',' -c "select * from mytable;" > query.csvHTML output.
More handy if called from a php or perl client for a web app.
psql -H -c "select * from mytable"
For more information about redirection and shell scripting, see man for the shell you use. These examples use bash but will work in most shells.
For more information about psql's formatting capabilities, see man psql or the PostgreSQL Reference Guide under Client Applications.
The DISTINCT qualifier to the SELECT statement enables you to eliminate duplicates from a target list. This used in conjunction with an ORDER BY statement can give you some interesting information about your data.
This table definition will be used as an example. It contains two schools, three tests in two subjects and six students.
create table fortysix ( school text, subject text, test integer, student_id integer, score integer );
Since all of the rows are unique (barring data entry errors), the following statement will return all of the rows.
SELECT DISTINCT school, subject, test, student_id, score FROM fortysix;That is not particularly useful. However, if you limit the target list you can find out what schools were involved, what subjects were involved and what students were involved.
=# -- What schools were involved =# select distinct school from fortysix; school -------- Carter Ohlone (2 rows) =# =# -- What subjects were involved =# select distinct subject from fortysix; subject --------- Math Reading (2 rows) =# =# -- What students were involved in each school =# select distinct school, student_id from fortysix; school | student_id --------+------------ Carter | 1024 Carter | 1025 Carter | 1026 Ohlone | 1027 Ohlone | 1029 Ohlone | 1400 (6 rows)
Using DISTINCT ON (), particularly with an ORDER BY can give you even more information. The columns listed in the parentheses are the values that must be unique. To control which of the actual rows gets selected, you can sort by what you want; you must also include the DISTINCT columns in the ORDER BY clause. In the first case, we sort on the score descending in order to select the row with the highest score.
=# -- Who got the highest score on any test in each subject in each school =# select distinct on (school, subject) school, subject, test, student_id, score -# from fortysix -# order by school, subject, score desc; school | subject | test | student_id | score --------+---------+------+------------+------- Carter | Math | 1 | 1026 | 100 Carter | Reading | 1 | 1026 | 100 Ohlone | Math | 3 | 1400 | 97 Ohlone | Reading | 3 | 1400 | 100 (4 rows) =# =# -- Who got the highest score on each test in each subject in each school =# select distinct on (school, subject, test) =# school, subject, test, student_id, score -# from fortysix -# order by school, subject, test, score desc; school | subject | test | student_id | score --------+---------+------+------------+------- Carter | Math | 1 | 1026 | 100 Carter | Math | 2 | 1026 | 99 Carter | Math | 3 | 1026 | 100 Carter | Reading | 1 | 1026 | 100 Carter | Reading | 2 | 1026 | 100 Carter | Reading | 3 | 1026 | 99 Ohlone | Math | 1 | 1027 | 92 Ohlone | Math | 2 | 1400 | 95 Ohlone | Math | 3 | 1400 | 97 Ohlone | Reading | 1 | 1027 | 99 Ohlone | Reading | 2 | 1400 | 97 Ohlone | Reading | 3 | 1400 | 100 (12 rows) =# =# -- Which student got the highest score on each test in each subject =# select distinct on (subject) school, subject, student_id, score -# from fortysix -# order by subject, score desc; school | subject | student_id | score --------+---------+------------+------- Carter | Math | 1026 | 100 Carter | Reading | 1026 | 100 (2 rows)In the third example only one score was selected for each subject although more than one may have gotten the score.
DISTINCT ON () can be used similarly to a GROUP BY clause, however, there is no aggregate required. Notice that you must include the columns listed in the DISTINCT again in the target list if you want them included in the output. Also notice that there is no comma separating the DISTINCT clause from the target list.
Pay careful attention to the natural language statements which describe the queries. They tell you how to construct the query. In the second query, the question is:
Who got the highest score on each test in each subject in each schoolThe superlative highest (or lowest) tells us what column to order by and in what direction. The word each signifies the grouping. In English, each test, subject, school very usually corresponds to grouping by the opposite order: school, subject, test.
=# select distinct on (school, subject, test) -# school, subject, test, student_id, score -# from fortysix -# order by school, subject, test, score desc;
For statistical reasons you may want to sample a set of data instead of calculating results based on the entire data set. For example, you may want to select one out of every 5 rows.
You can do this using a temporary sequence. Create the temporary sequences and select the next value for each row in the query. Do this in a subselect and in the main selection filter out based on the sequence modulo 5.
CREATE TEMP SEQUENCE number; SELECT * FROM ( SELECT *, nextval('number') AS number FROM datatable ) foo WHERE foo.number % 5 = 0;
If you want every 5th row from an ordered sequence, simply order by the appropriate column.
The differences between clock time and transaction time functions was covered previously in Issue # 16. The question keeps coming up so here is more explanation in terms of statements and transactions.
Timestamp functions in PostgreSQL are:
To the point, if you want clock time, use timeofday(). If transaction start time is adequate, use that--it is microseconds faster.
All of these functions return the time that the current transaction began as a timestamp, date or time except timeofday() which returns the actual time of day as a text field.
To make timeofday() return a timestamp or date cast the function as a timestamp or date. To make it return a time type, cast it first to a timestamp and then to time.
If you have one of the transaction time functions as a DEFAULT in a column, the the time will be set to the transaction time of the insert. When using the transaction time functions in block transactions, all references to any of them will be the same. This is very useful when syncronizing timestamps on a set of records. A trigger is part of the transaction of the SQL statement causing it to be invoked. So any reference to transaction time functions in trigger functions will be the same as any reference in any other part of the originating statement.
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009