AE   General Bits  
By A. Elein Mustain   30-Jun-2003 Issue: 32

   
General Bits is a column based on the PostgreSQL mailing list pgsql-general. To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.
 
rainbow
Esta Edição em Português
Archives
General Tidbits
Artigos Português
Google General Bits
Notices
To receive email notices of new issues of General Bits, email Elein.
Subscriptions Support General Bits
Ads
ae Consulting PostgreSQL Design & Implementation
pghoster.com
OSCON Speaker
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Correction in fixdupes() function
Identifying and Removing Duplicate Rows 13-Jun-2003

There was an editing mistake in the Issue 30 article about finding and fixing duplicate rows. (Why didn't someone else find this and tell me?)

The function fixdupes() had an incorrect targetlist for the selection in the function. The corrected function in full is:

   create or replace function fixdupes()
   returns void as '
      DECLARE
         d record;
      BEGIN
         for d in select * 
               from rawlog 
               group by ldate, ltime, doc, ip, method, qs, code, dsize
               having count(*) > 1 limit 1
         loop
               delete from rawlog  
               where ldate=d.ldate and ltime=d.ltime and 
               doc=d.doc and ip=d.ip and method=d.method and
               qs=d.qs and code=d.code and dsize=d.dsize;

               insert into rawlog values (d.ip, d.ldate, d.ltime,
               d.method, d.doc, d.qs, d.code, d.dsize);
         end loop;
      RETURN;
   END;
   ' language 'plpgsql';

Contributors: elein at varlena.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Trusted and Untrusted Languages
[GENERAL] PlPython (Was: plpython? (Was: Re: Damn triggers and NEW)) 18-Jun-2003

PostgreSQL supports trusted and untrusted languages for server side functions. As a general rule (except for C) untrusted languages have a u suffix to their name. For example, plperl is trusted but plperlu is an untrusted.

A function written in a trusted language, a trusted function, is safe for general usage. A function written in an untrusted language, an untrusted function, can also be written safely, but the possibility exists for abuse.

Tom Lane writes, "[a] trusted-PL language should be able to define arbitrary self-contained computations (arithmetic, pattern-matching, or what have you), and it should be able to access the database at the same level as regular SQL commands. It should not be able to bypass the SQL abstractions nor execute any OS-level operations using the postgres user's privileges." Trusted languages explicitly limit their functionality via permissions and by language interpreters to prevent invoking "dangerous" functionality.

So an untrusted language is not, as one might think, a buggy or bad implementation of the language, but rather one which allows expanded functionality to be performed in a server function under the permissions of the postgres superuser.

That "expanded" or "dangerous" functionality is widely defined as anything that operates outside of the database on the understanding that anything outside of the database could be exploited to cause damage. File system I/O, many specific OS calls, memory management, process creating and interprocess communication are some of the things that are classified as dangerous.

Untrusted languages must be explicitly installed by the database superuser and permission to create functions and triggers using untrusted languages is limited to users with superuser priviledge. The responsibility for good citizenship is placed on the superuser(s).

After understanding all of the warnings, untrusted functions can be extremely important to an application. There are many legitimate and valuable cases where untrusted languages and operations are extremely useful, for example, sending email, safe file system reads and writes and informational OS calls. It is also possible to write a function in an untrusted language that does not use any of the untrusted features, for example most C functions available in contrib fall into this category.

Untrusted does not mean that the language is not to be used. It means that it must be used carefully. An untrusted language can be used very beneficially by someone who understands exactly the repercussions of their code.

Contributors: scott.marlowe scott.marlowe at ihs.com, Karsten Hilbert Karsten.Hilbert at gmx.net, Mikhail Terekhov terekhov at emc.com, Tom Lane tgl at sss.pgh.pa.us, DeJuan Jackson djackson at speedfc.com, Kevin Jacobs jacobs at penguin.theopalgroup.com, Doug McNaught doug at mcnaught.org, elein at varlena.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Plpython changes to an Untrusted Language
[GENERAL] PlPython (Was: plpython? (Was: Re: Damn triggers and NEW)) 18-Jun-2003

In 7.4, plpython will move from a trusted language to an untrusted language and its name will change to plpythonu. The name change emphasizes that the language is now untrusted. (See the above article in this issue for an explanation of trusted and untrusted languages.)

Guido Van Rossum, et al, the Python team, have determined that rexec (restricted execution) was fundamentally insecure in python and so it was removed from language as of version 2.3.

The earlier version of plpython enabled a limited set of python calls via rexec. With the new understanding of the frailities of rexec, plpython was then insecure, not having a safe, restricted execution environment and was also incompatible with python 2.3.

Kevin Jacobs submitted a patch to plpython removing the rexec references and marked the language as untrusted. The patch was applied to the development branch targeted for 7.4.

People using plpython will probably need to reload their functions with the language changed to plpythonu in 7.4. Check the release notes for proper handling when the 7.4 version comes out. Also only people with superuser priviledges will be able to define functions using plpythonu.

Contributors: Kevin Jacobs jacobs@penguin.theopalgroup.com, Bruce Momjian pgman@candle.pha.pa.us, Tom Lane tgl@sss.pgh.pa.us, Mikhail Terekhov terekhov@emc.com, Karsten Hilbert Karsten.Hilbert@gmx.net, elein at varlena.com, scott.marlowe scott.marlowe@ihs.com, Doug McNaught doug@mcnaught.org, Ron Johnson ron.l.johnson@cox.net, Jason Earl jason.earl@simplot.com

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Rowtypes Keep Dropped Columns
[GENERAL] plpgsql, rowtype and dropped columns 11-Jun-2003

ROWTYPEs of tables with dropped columns can be frustrating and confusing. If you declare a variable to be the rowtype of your table and select into it, the selection set and the target rowtype will be different in that the row type will still contain any dropped columns.

A row type with dropped columns can viewed using \d and the dropped column is not shown. However, by selecting from the pg_attribute table you can see where the dropped column was. It would look like this:

 attnum |           attname            |  typname  |      case      
--------+------------------------------+-----------+----------------
     -7 | tableoid                     | oid       | '-7'::oid
     -6 | cmax                         | cid       | '-6'::cid
     -5 | xmax                         | xid       | '-5'::xid
     -4 | cmin                         | cid       | '-4'::cid
     -3 | xmin                         | xid       | '-3'::xid
     -2 | oid                          | oid       | '-2'::oid
     -1 | ctid                         | tid       | '-1'::tid
      1 | col_int                      | int4      | '1'::int4
      2 | col_timestamp                | timestamp | '2'::timestamp
      3 | col_float                    | float8    | '3'::float8
      4 | col_text                     | text      | '4'::text
      5 | ........pg.dropped.5........ | text      | 
      6 | mod_timestamp                | timestamp | '6'::timestamp
(13 rows)

A function like this one will not work as expected. The * will select attributes 1, 2, 3, 4, and 6. However, the rowtype will have attributes 1, 2, 3, 4, 5 and 6, causing a mismatch on the return types.

   CREATE OR REPLACE FUNCTION testrows ( integer )
      RETURNS foo
      AS '
      DECLARE
         myrec   foo%ROWTYPE;
      BEGIN

         SELECT INTO myrec *
         FROM foo 
         WHERE foo.col_int = $1;

         RETURN myrec;
      END;
   '
   LANGUAGE 'plpgsql';

   select * from do_testrows(2);
   WARNING:  Error occurred while executing PL/pgSQL function testrows
   WARNING:  line 5 at select into variables
   ERROR:  Bad timestamp external representation '2.3'

If you do not get an error on mismatched returned types, you will get the following error:
ERROR:  Function bartest() does not exist
        Unable to identify a function that satisfies the given argument types
        You may need to add explicit typecasts

This behaviour is a known problem. The workaround for this problem is to use a RECORD type instead of a ROWTYPE or possible designate your target list explicitly, including values for the "missing" columns or define a ROWTYPE which matches your altered table and use it instead of the table's ROWTYPE.

The fix to this bug would require changes to all places in the code where rowtypes are accessed. This is a lot of places. No one has yet determined a way to correct the problem as a universal change.

In the meanwhile, use caution when using ROWTYPES with altered tables.

Contributors: Nigel J. Andrews nandrews at investsystems.co.uk Tom Lane tgl at sss.pgh.pa.us

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Dropping plpgsql functions using indexes
[GENERAL] Odd error message 16-Jun-2003

plpgsql functions create internal dependencies on objects used it its queries. Objects include indexes, other functions, tables and views.

When a plpgsql function is first run a plan is created and stored for queries within the function. This enable plpgsql to be fairly speedy.

If the plan has dependent objects the function will not run. You might think that dropping an index should not affect the function, however, due to the stored query plan it does.

If a query in the plpgsql function stores a plan that uses an index AND that index is dropped then running the function will cause an error about a missing relation:

	ERROR:  Relation 3912941 does not exist

Dropping and recreating the function will repair the problem. The plan is dropped when the function is dropped and a new plan is saved when it is run again.

Contributors: nolan at celery.tssi.com, Richard Huxton dev at archonet.com


Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!.
Copyright A. Elein Mustain 2003

Google
Search General Bits & varlena.com Search WWW