Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

27-Apr-2003 Issue: 23

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

"Ted" Edgar F. Codd -- 1923-2003
Father of Relational Databases	23-Apr-2003

"Ted" Edgar F. Codd (August 23, 1923 - April 18, 2003)

Born in Dorset, England in 1923, Codd graduated from Oxford University before moving to New York in 1948 to work with IBM. While at IBM, Code wrote the basis of relational theory for databases which has been widely embraced by commercial enterprises and database computer scientists everywhere. His contribution to database technology is unparalled by any other work by any other person.

Working with C.J.Date and others, Codd's many publications are the foundation for relational database theory. He will be greatly missed, but not easily forgotten. His work lives on.

In an panel discussion in February entitled "Database Pioneers Ponder Future", Michael Stonebraker said, "Ted Codd's contributions don't come along everyday. I don't see another thing coming at that level anytime soon." Paul Krill, 2/11/03 InfoWorld Article: Database Pioneers Ponder future.

Note: Original text of this article was changed to exclude a quote from wikipedia due to their licensing requirements.

Contributors: elein at varlena.com, paul krill at infoworld.com,

Running Aggregation
Hints from General Bits 23-Apr-2003

Aggregates are not appropriate for running sums as they are implemented in PostgreSQL. However, by using plpython and a little thoughtfulness you can create simple or elaborate running counters, sums, differences between rows and whatever other useful function you want to run on a running set of data.

PlPython has a dictionary called SD which is local to the function and the connection. This can be used to store values between call iterations. For example, a simple counter would be:

	create or replace function simple_counter()
	returns integer as
	'
		try:
			SD["currval"] += SD["currval"]
		except:
			SD["currval"] = 1
		return SD["currval"]
	' language 'plpython';

As you can see, the first time the function is called the current value will be set to 1. Otherwise it is incremented. You might think we are done. But we are not.

The scope of the current value is the *connection* and the function. Different functions have separate dictionaries so you do not have to worry about names clashing between functions. However, calling the same function on the same connection will use the same dictionary.

If you run two sequential queries using the simple_counter() function, the second will simply pick up where the first left off.

	select simple_counter(), data from mytable where ...;

	Returns
		1  value
		2  value
		3  value
		....
		24 value

Now if you run the query again using either the same or a different table, it will look like this:

	select simple_counter(), otherdata from othertable where...;

	Returns
		25 value
		26 value
		27 value
		...

The value of SD["currval"] remains in existence from the first assignment through the closing of the connection. It can get pretty interesting if the same function is called in several places in the original query.

One workaround to this problem is to change the function to allow initialization of the current value. The initialization could be hard coded as it is in simple_counter or passed in at initialization time.

	-- SCOPE is Connection
	create or replace function simple_counter(integer )
	returns integer as
	'
		if args[0] != 0:
			SD["nextno"] = args[1]
		try:
			SD["nextno"] += 1
		except:
			SD["nextno"] = 1
		return SD["nextno"]
	' language 'plpython';

The next problem is when to initialize the current value. In general you would like it initialized before each SQL statement.

The approach taken to solve the initialization timing is a brute force approach: call the function with flags to denote the action to take. Flag initialization (0 in this case) values before using it, flag increment (1 in this case) to do the incrementation. (This solution does not work for multiple invocations of the function within one statement.) If anyone has a better solution, please let me know!

	select simple_counter(1);
	select data, simple_counter(0) from mytable where ...;

Notice that with a second parameter to the function, it would be easy to use it to be the increment step instead of hard coding one.

This basic structure of a plpython function like simple_counter() which saves information from row to row can be expanded in order to do tasks many other similar tasks. For example you may want to make a calculation based on the value in the current row and a corresponding value in the previous row. column value. This would be done by passing in the current value, having previously saved the previous value. Do the calculation, save the current value and return the calculated result.

Key concepts in any kind of aggregation are

Initialization
Iteration
Final result

Contributors: elein at varlena.com with a nod to dg

How many rows were affected?
[GENERAL] pl-pgsql question 24-Apr-2003

In pl/pgsql, you can find out how many rows were affected by a statement by using GET DIAGNOSTICS immediately after the statement in question.

This is how the statement is used with ROW_COUNT. With ROW_COUNT, the statement before the GET DIAGNOSTICS must be an INSERT, UPDATE or DELETE.

	DECLARE
		rowcount	integer;
		...
	BEGIN
		[INSERT | UPDATE | DELETE] ....
		GET DIAGNOSTICS rowcount = ROW_COUNT;

The GET DIAGNOSTICS statement can also be used with LAST_OID to fetch the OID of the last INSERT statement, if it had an OID. If the table did not have OIDS, it returns 0.

Contributors: Vilson farias vilson.farias at digitro.com.br, Darren Ferguson darren at crystalballinc.com, Dennis Gearon gearond at cvc.net

Getting more information from psql
[GENERAL] Simple question for psql 23-Apr-2003

The first things you should know about psql are \? and \h. The former gives you help on all of the informational functions that psql can provide. The latter will give you brief SQL syntax on the command of your choice.

The \d commands in psql have recently been improved. Most provide word completion. The format of the output of some of these commands has been changed in order to accommodate the addition of schemas, but a + option to get even more information has been added.

From psql \?:

 \d [NAME]      describe table, index, sequence, or view
 \d{t|i|s|v|S} [PATTERN] (add "+" for more detail)
                list tables/indexes/sequences/views/system tables
 \da [PATTERN]  list aggregate functions
 \dd [PATTERN]  show comment for object
 \dD [PATTERN]  list domains
 \df [PATTERN]  list functions (add "+" for more detail)
 \do [NAME]     list operators
 \dl            list large objects, same as \lo_list
 \dp [PATTERN]  list table access privileges
 \dT [PATTERN]  list data types (add "+" for more detail)
 \du [PATTERN]  list users

To see, for example, the source code for function myfunc use:

	me=# \df+ myfunc

You will only see source code for pl functions. You will not see source code (this way) for built-in functions.

Contributors: Hadley Willan hadley.willan at deeperdesign.co.nz, Heath Tanner heath at inligo.com, Alvaro Herrera alvherre at dcc.uchile.cl, elein at varlena.com

Transactions on DDL
[SQL] create/alter/drop within transactions? 25-Apr-2003

DDL statements are Data Definition Language statements such at CREATE, ALTER and DROP.

All statements, including DDL, can be in block transactions and therefor rolled back on error. The only statement that is not in a transaction and therefore cannot be rolled back is TRUNCATE TABLE. However, look forward to 7.4 when TRUNCATE TABLE can also be in a transaction.

You should be cautious about doing many DDL statements inside of a transaction block because those statements lock the entire tables involved. This could very easily cause deadlocks when other transactions are active.

Contributors: Forest Wilkinson lyris-pg at tibit.com, Tom Lane tgl at sss.pgh.pa.us

Assignment of function values in plpgsql
[GENERAL] Storing sequence numbers for later use 18-Apr-2003

In order to assign the value of a function call in pl/pgsql, you must use SELECT INTO instead of an assignment. Assignments are for values and variables. Functions, on the other hand, need to be executed by the engine in order to produce results.

	wrong: myval := currval('mytable_col_seq');

	right: SELECT INTO myval SELECT currval('mytable_col_seq');

In the example in the thread, a trigger function wanted to capture currval and reuse in in two additional insert statements. In this case it is not really necessary to save the value because within the same trigger, currval() always returns the same value. But it is a toss up as to which way it is coded. Any speed gained by only calling currval() once is very small. It is best coded in the most readable format.

	SELECT INTO myval select currval('mytable_col_seq');
	INSERT INTO thattable values (..., myval, ...);
	INSERT INTO thistable values (..., myval, ...);
	OR
	INSERT INTO thattable values (..., currval('mytable_col_seq'), ...);
	INSERT INTO thistable values (..., currval('mytable_col_seq'), ...);

Both methods are equivalent.

Contributors: Markus Heinz Markus.Heinz at web.de, Nigel J. Andrews nandrews at investsystems.co.uk, elein at varlena.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top