Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

19-May-2003 Issue: 26

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

New Link: General Tidbits
Editorial	19-May-2003

A new feature link, General Tidbits, has been added to the Archive page. This is a link to information that stands alone outside of an article and the link enables this and other information, as it is becomes available, to be permanently added to the General Bits site.

The code for the examples in the following several items and the timezone table discussed in the last item are the first pieces of this new page. I'm sure it will grow. If you have information that you think should be available, let me know.

Editor: elein at varlena.com

RowTypes and Functions Returning Sets
with Stephan Szabo 15-May-2003

PostgreSQL 7.3 now supports a much more flexible system for writing set returning functions (SRFs) that when combined with some of the new function permission options allow a greater flexibility in setting up schemas.

Since the ability to create functions which return sets and rowtypes and sets of rowtypes was added to PostgreSQL 7.3 I have seen many, many requests for examples. Stephan Szabo wrote a very nice document on TechDocs which explains how it works. With Stephan's permission I have expanded and rearranged his original document into the several items below.

These items are divided into the following sections:

Returning Sets
Returning Rowtypes
Returning Sets of Rowtypes

The examples which follow each use the following simple data set. All of the code for this dataset, the functions and sample queries can be downloaded from here.

create table department(
   id int primary key,
   name text);

create table employee(
   id int primary key,
   name text,
   salary int,
   departmentid int references department);

insert into department values (1, 'Management');
insert into department values (2, 'IT');

insert into employee values (1, 'John Smith', 30000, 1);
insert into employee values (2, 'Jane Doe', 50000, 1);
insert into employee values (3, 'Jack Jackson', 60000, 2);

This data will produce the following data set:

	select e.id as "Emp Id", e.name as "Emp Name", e.salary as "Salary",
		d.id as "Dept Id", d.name as "Dept Name" 
	from employee e , department d 
	where e.departmentid = d.id;

	 Emp Id |    Emp Name     | Salary | Dept Id | Dept Name  
	--------+-----------------+--------+---------+------------
	      1 | John Smith      |  30000 |       1 | Management
	      2 | Jane Doe        |  50000 |       1 | Management
	      3 | Fairlie Reese   |  63000 |       1 | Management
	      4 | Jack Jackson    |  60000 |       2 | IT
	      5 | Harold Bibsom   |  40000 |       2 | IT
	      6 | Julio Garcia    |  70000 |       2 | IT
	      7 | Bernice Johnson |  55000 |       2 | IT
	      8 | Lily Leong      |  67000 |       2 | IT
	      9 | Abby Wood       |  57000 |       2 | IT
	     10 | Jeff Jeffries   |  52000 |       2 | IT
	     11 | Geordie O'Hare  |  42000 |       2 | IT
	(11 rows)

Share and Enjoy,
Elein

Editor: elein at varlena.com

Returning Sets
How do I return a SETOF? 15-May-2003

Functions which return sets enable you to return a series of values. They are invoked from within the FROM clause regardless of what they return. They are not called directly from the target list or the where clause. They can be called only from whereever a table name could be positioned.

	SELECT * FROM mySetFunction();

The case of returning an ordinary type is not often used, however, it is a good illustration of the mechanics of a function which returns sets. When returning a generic type all of the functionality of plpgsql and other functions and operators are available to construct the return value. However, this example is very simple and would not otherwise be bothered with.

This function shows the department information for employees with salaries greater than the amount specified. The key parts for returning a set are in bold.

   CREATE OR REPLACE FUNCTION getemployeedid ( integer )
      RETURNS SETOF integer as '
      DECLARE 
         myrow RECORD;
         retval integer;
      BEGIN
         FOR myrow IN SELECT * FROM employee where salary >= $1 LOOP
            RETURN NEXT myrow.departmentid;
         END LOOP;
         RETURN;
      END;
   ' language 'plpgsql';

The function is declared as a usual plpgsql function except the return type is a SETOF the return type. It uses the LOOP construct to select the values. But notice that the function has a RETURN NEXT before the end of the loop. At that point it will return the current value and wait until it is called again in the same statement. Let's look at some of the SELECT statements that might be used with this function:

	--
	-- All employee department numbers
	--
	elein=# select * from getemployeedid(0);
	 getemployeedid 
	----------------
	              1
	              1
	              2
	(3 rows)
	--
	-- Each row represents on person in the department making over 50000.
	-- Equivalent to 
	-- select d.id, d.name from employees e, department d
	-- where e.departmentid = d.id;
	--
	elein=# select id, name from getemployeedid(50000) e, department d where e = id;
	 id |    name    
	----+------------
	  1 | Management
	  2 | IT
	(2 rows)

	--
	-- How many employees make over 50000 in each department
	--
	elein=# select count(*), g from getemployeedid(50000) g group by g;
	 count | g 
	-------+---
	     2 | 1
	     6 | 2
	(2 rows)

In summary, to return a set from a plpgsql function, define the function to return SETOF and use RETURN NEXT inside of a LOOP producing each value.

Contributors: elein at varlena.com, Stephan Szabo sszabo at megazone23.bigpanda.com

Defining and returning Rowtypes
How do I return a rowtype? 15-May-2003

The ability to return rowtypes was added to PostgreSQL at the same time as the ability to return sets. In this example we'll show how to return a tuple containing department total salaries. It will return the minimum department total salary, the maximum department total salary and the average department total salary. (This is a common triplet for average aggregates which clarifies the accuracy of an average. Sometimes the median is included, too. This lets you know when you are averaging the CEO's salary with yours...)

In order to create a function returning a rowtype, you must first create a named rowtype describing the output tuple of the function. (Side Note: Informix supported unnamed rowtypes. PostgreSQL does not at this time.) The syntax for creating a rowtype is similar to that of creating a table.

CREATE TYPE deptavgs AS ( minsal int, maxsal int, avgsalary int8);

Once the rowtype deptavgs is defined, you can write a function which returns that type. The function below selects and sums the total salary for each department and keeps a counter on the number of rows in order to calculate the average. It also checks each row to see if it is the first row, the maximum total salary or the minimum total salary in order to find the minimum total department salary and the maximum total department salary.

The parts of this function which are specific to returning rowtypes are the declaration of the variable which will be returned, r, and the construction of the returning tuple. A rowtype is declared with its type name followed by a percent followed by the word "rowtype". The returning tuple is referenced and accessed with the usual "dot" notation in the same way a table might be.

   CREATE or REPLACE FUNCTION avgdept() RETURNS deptavgs AS
   '
   DECLARE
      r deptavgs%rowtype;
      dept record;
      bucket int8;
      counter int;
   BEGIN
      bucket   := 0;
      counter  := 0;
      r.maxsal := 0;
      r.minsal := 0;
      FOR dept IN SELECT sum(salary) AS salary, d.id AS department
                  FROM employee e, department d WHERE e.departmentid = d.id
                  GROUP BY department LOOP
         counter := counter + 1;
         bucket  := bucket + dept.salary;
         IF r.maxsal <= dept.salary OR r.maxsal = 0 THEN
            r.maxsal := dept.salary;
         END IF;
         IF r.minsal >= dept.salary OR r.minsal = 0 THEN
            r.minsal := dept.salary;
         END IF;
      END LOOP;

      r.avgsalary := bucket/counter;

      RETURN r;
   END
   ' language 'plpgsql';

This function will give us the following. Remember it is department totals, not individual salaries being shown.

	elein=# select * from avgdept();
	 minsal | maxsal | avgsalary 
	--------+--------+-----------
	  60000 |  80000 |     70000
	(1 row)

Contributors: elein at varlena.com Stephan Szabo sszabo at megazone23.bigpanda.com

Returning Sets of Rows
How do I return a set of rows? 15-May-2003

Now we get to the interesting part, generating a result set from a function by using a function which returns SETOF a rowtype. As an historical aside, this methodology was used to produce the first generation of gateways in the Illustra product. Using the function as a front end to another database and way too many parameters, we produced rows for the Illustra server that originated in another database.

The following function mimics an aggregate on department. The return values will be the department id, the miminum salary for the department, the maximum salary for the department and the average salary for the department. The rowtype salavgs will be created containing the appropriate columns for these return values. The function will select the salaries for all departments, ordering by department.

The interesting part is how the return values are handled. The RETURN NEXT call returns the appropriate value and when the function is called again, it picks up at the statement following the RETURN NEXT. These functions also require the final RETURN because of this.

The selection loops through all of the employee records to do the aggregation. And since this is an average aggregate we want to return the values when we've looped through all of the records in each department. We know we've done that if the department id changes or we break out of the LOOP. When the department changes and we return the value, the buckets will need to be reinitialized so that the next department data only contains its information. In addition to all of that we also want to know when we've got the first record so we can initialize our buckets at the start.

The type creation and function creation looks like this, with the interesting parts bolded:

	CREATE TYPE salavgs AS
	   (deptid integer, minsal integer, maxsal integer, avgsalary int8);
	CREATE OR REPLACE FUNCTION avgsal() RETURNS SETOF salavgs AS 
	'
	DECLARE
	   s salavgs%ROWTYPE;
	   salrec RECORD;
	   bucket int8;
	   counter int;
	BEGIN
	   bucket   :=0;
	   counter  :=0;
	   s.maxsal :=0;
	   s.minsal :=0;
	   s.deptid :=0;
	   FOR salrec IN SELECT salary AS salary, d.id AS department
	               FROM employee e, department d WHERE e.departmentid = d.id
	               ORDER BY d.id LOOP
	      IF s.deptid = 0 THEN
	         s.deptid := salrec.department;
	         s.minsal := salrec.salary;
	         s.maxsal := salrec.salary;
	         counter  := counter + 1;
	         bucket   := bucket + salrec.salary;
	      ELSE
	         IF s.deptid = salrec.department THEN
	            IF s.maxsal <= salrec.salary THEN
	               s.maxsal := salrec.salary;
	            END IF;
	            IF s.minsal >= salrec.salary THEN
	               s.minsal := salrec.salary;
	            END IF;
	            counter := counter + 1;
	            bucket  := bucket + salrec.salary;
	         ELSE
	            s.avgsalary := bucket/counter;
	            RETURN NEXT s;
	            s.deptid := salrec.department;
	            s.minsal := salrec.salary;
	            s.maxsal := salrec.salary;
	            counter  := 1;
	            bucket   := salrec.salary;
	         END IF;
	      END IF;
	   END LOOP;
	   s.avgsalary := bucket/counter;
	   RETURN NEXT s;
	   RETURN;
	END '
	LANGUAGE 'plpgsql';

This gives us the following rows:

	--
	-- Department salary averages
	--
	select * from avgsal();
 	 deptid | minsal | maxsal | avgsalary 
	--------+--------+--------+-----------
       1 |  30000 |  63000 |     47666
       2 |  40000 |  70000 |     55375
 (2 rows)

If we joined this back to the department table we would get:

	elein=# select d.name, a.minsal, a.maxsal, a.avgsalary
	elein-# from avgsal() a, department d
	elein-# where d.id = a.deptid;
    	name    | minsal | maxsal | avgsalary 
	------------+--------+--------+-----------
 	Management |  30000 |  63000 |     47666
 	IT         |  40000 |  70000 |     55375
	(2 rows)

This technique is great for generating dynamically constructed rows that for some reason cannot be created in a subselect. Fanciful aggregate information like the above is done in one loop. You are only limited by what you can do in plpgsql and that is a very soft limit.

Contributors: elein at varlena.com,

Valid Timezones in PostgreSQL
Timezones 17-May-2003

THIS ARTICLE IS OBSOLETE DUE TO TIMEZONE CHANGES IN 8.0.

Are you tired of fishing through src/backend/utils/adt/datetime.c for the list of valid timezones? Here is a TimeZone Table based on datetime.c from data extracted by Aditya. It shows all of the valid timezones in PostgreSQL.

Contributors: Aditya aditya at grot.org, elein at varlena.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top