Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

11-July-2004 Issue: 79

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Setting Free Space Map sizes
Re: Index Problem?	16-Apr-2004

The two Global User Configuration variables, max_fsm_pages and max_fsm_relations are a little mysterious. But setting these Free Space Mapping values can help your queries run faster and more efficiently by better utilizing your memory.

To help understand how you should set these variables you can use VACUUM VERBOSE to tell you what it thinks you need and what you have them set to. At the very bottom of the VACUUM VERBOSE you should see:

   INFO: free space map: 136 relations, 25014 pages stored; 22608 total pages needed
   DETAIL: Allocated FSM size: 1000 relations + 20000 pages = 178 kB shared memory.

The INFO portion tells you what the server will use and what it needs. The DETAIL portion tells you how you have your GUC variables set.

It is recommended that you set the max_fsm_pages and max_fsm_relations a bit higher than the INFO portion tells you. In the example above, changing max_fsm_pages from 20000 to 40000 decreased the time for a particularly bad query by about 30%. But as always, setting things too very high is also costly. Be careful.

To change the max_fsm_pages and max_fsm_relations edit the $PGDATA/postgresql.conf file and bounce (start and stop) the server.

Contributors: Ron St-Pierre rstpierre at syscor.com, Josh Berkus josh at agliodbs.com, Tom Lane tgl at sss.pgh.pa.us, mr_bubbs at #postgresql

New Improved PL/Perl
PL/Perl with Rows, Sets and Stored Memory 11-July-2004

The 7.5 release will include the new and improved plperl. The extensions to plperl in this release will include the ability to write trigger functions, make database queries, return sets and rows and sets of rows as well as store data local to the connection.

In Issue #78 trigger functions and database queries was covered. In this issue examples will show how to return rows, sets of rows and how to store data across function calls.

Returning Rows

To return a row from a PlPerl function, simply return a hash describing that row. Very simple. In this example, we will modify the mmav() function from Issue #78 by creating a type called mmav and then changing only two lines from the original function. The changed lines are the return type definition and the call to return.

   create type mmav AS ( min integer, max integer, avg float );
   create or replace function mmav(text,text )
   returns mmav as
   '
   my $tble = $_[0];
   my $col  = $_[1];
   my $qry  = ''select ''.$col.'' from ''.$tble;
   my $min  = 0;
   my $max  = 0;
   my $sum  = 0;

   my $rv = spi_exec_query( $qry );
   if (@{$rv->{rows}} == 0 ) { return NULL };
   for ( my $i=0; $i < @{$rv->{rows}} ; $i++ ) {
      if ( $i == 0 ) {
         $sum = $max = $min = $rv->{rows}[$i]->{$col};
         next;
      }
      if ( $max < $rv->{rows}[$i]->{$col} )
         { $max = $rv->{rows}[$i]->{$col}; }
      if ( $min > $rv->{rows}[$i]->{$col} )
         { $min = $rv->{rows}[$i]->{$col}; }
      $sum += $rv->{rows}[$i]->{$col};
   }
   return { (min => $min, max => $max, avg=>$sum/@{$rv->{rows}} };

   ' language 'plperl';

Returning Sets of Rows

To return set of rows in PlPerl, you can return an array of hashes. You must build the array and then return the entire array by reference. This is different from plpgsql where you return one row at a time in a loop using RETURN NEXT;

In this example, we have a table of rugby matches. The table contains (team1 text, team2 text, score1 integer, score2 integer) showing the two teams that played and their scores. What we want is to know the average score for each team. This is slightly tricky because the teams could be in either column team1 or team2 and we don't want to go through the table twice. Hashes are good for these kinds of things.

We use a hash to sum up and count the scores and then use the keys of that hash to generate the array of hashes we want to return. And of course, we define the row type ahead of the function.

Note that in this example we are using the dollar quoting which will also be available in 7.5. This allows you to set the quoting characters to your choice. More on this subject will be written in another issue, however, with PlPerl, this feature is particularly handy.

create type team_avg AS ( team text, avg_score float );

create or replace function team_avg()
returns setof team_avg as
$$
my %ta;
my $avg;
my @rows;
my @teams;
my $qry = 'select team1, score1, team2, score2 from tmatches';
my $rv = spi_exec_query ($qry );

   for ( my $i=0; $i < @{$rv->{rows}} ; $i++ ) {
      $ta{ $rv->{rows}[$i]->{'team1'} }{'score'} +=  $rv->{rows}[$i]->{'score1'};
      $ta{ $rv->{rows}[$i]->{'team1'} }{'gcount'} +=  1;
      $ta{ $rv->{rows}[$i]->{'team2'} }{'score'} += $rv->{rows}[$i]->{'score2'};
      $ta{ $rv->{rows}[$i]->{'team2'} }{'gcount'} +=  1;
   }
   @teams = keys %ta;
   for ( my $i=0 ; $i < @teams ; $i++) {
      push @rows, { 'team'=>$teams[$i],
         'avg_score'=>$ta{$teams[$i]}{'gcount'} == 0 ? 0 : \
            $ta{$teams[$i]}{'score'}/$ta{$teams[$i]}{'gcount'} };

   }
   return \@rows;
$$
language 'plperl';
select sname, avg_score from teams t, team_avg() a where t.sname=a.team;

Using Global Memory

PlPerl gives you access to a shared hash which is local to the connection. This can be very handy to store connection specific values. For example, the following two functions get and put arbitrary hashes and values into the the shared hash.

create or replace function put_conn_values (text, text)
returns void as
$$
   $_SHARED{$_[0]}=$_[1];

$$ language 'plperl';

create or replace function get_conn_values (text)
returns text as
$$
   return $_SHARED{$_[0]};

$$ language 'plperl';

select put_conn_values( 'user','jane');
select get_conn_values( 'user');

A couple of important things to remember about this feature are that the memory will not be freed until the connection is freed. And that if you can the function to write values several times in the same connection, the values in $_SHARED will be overwritten by the latest function which sets the values. This is not a problem with this example. However, using the $_SHARED hash for aggregates, where the aggregate function is used more than once in a statement will cause confusion. Both invocations will be writing to the same buckets. The code for these examples can be found here.

Editor: elein at varlena.com andrew at dunslane.net david at fetter.org

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top