varlena
varlena
PostgreSQL Training,
Consulting & Support
General Bits
By A. Elein Mustain

08-Dec-2003 Issue: 53

Archives | General Tidbits | Google General Bits | Docs | Castellano | Português | Subscriptions | Notifications | | Prev | Next

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Publication Delay
Issues 53 and 54 12-Dec-2003

Due to a technical glitch (doh!) Issue 53 was not published until later in the week instead of its usual Monday date.

Due to this late publication and that I will be out of town early next week, Issue #54 is postponed by a week. I am sorry for the inconvenience.

Editor: elein at varlena.com
Why use Cursors?
[GENERAL] are cursors necessary? 5-Dec-2003

In the program testlibpq.c, these five SQL statements are executed to illustrate the basics of how cursors work.

   res = PQexec(conn, "BEGIN");
   res = PQexec(conn, "DECLARE myportal CURSOR FOR select * from pg_database");
   res = PQexec(conn, "FETCH ALL in myportal");
   res = PQexec(conn, "CLOSE myportal");
   res = PQexec(conn, "END");
This is the equivalent of
	res = PQexec(conn, "select * from pg_database");
So why would we want to execute five statements instead of one?

If the data returned by the query is relatively small or if you want all of the data at once, then, the single query is the best option. Your entire result set will be selected and sent to the client at once.

If you want to scroll through the data set without downloading the whole set, or if you want to download the data piecemeal, then a cursor is what you want to use.

Cursors can save client and network bandwidth. If you want to show a page full of data at a time, and enable your application to scroll up or down in the data, then the cursor is the right thing to use. However, to get the whole data set fastest, use the plain select.

While this example is with libpq, the same applies for all client interfaces.

Jan Weick: You really think people would ever want to store more than 640 rows?
Mike Mascari: Ha ha! With each one being 1K?

Contributors: Mark Harrison mh at pixar.com Richard Huxton dev at archonet.com Jan Wieck JanWieck at Yahoo.com Mike Mascari mascarm at mascari.com Alex Satrapa alex at lintelsys.com.au
Brief review of 7.4 gotchas
Changes in 7.4 06-Dec-2003

In previous issues, various 7.4 feature changes were discussed. In Issue #48 the changes to postgresql.conf were discussed for example. This list is from 7.4 Documentation Appendix E.1.2. Migration to version 7.4 and highlights changes which might be puzzling or problematic. The Release Notes are required reading.

  • The server-side autocommit setting was removed and reimplemented in client applications and languages. autocommit is no longer available in postgresql.conf.
    To turn autocommit off, use \SET AUTOCOMMIT OFF in psql.
  • Error message wording has changed substantially in this release, and error codes have been added.
    If any of your applications parse the error messages, then it is imperative that you consult the new error code documentation in the Appendix of the PostgreSQL 7.4 documentation. The error codes now match the SQL standards for error codes. This means that the error code and not the error message text should be checked. The error codes are unlikely to change whereas the error message text may change and may also be different with different languages.
  • ANSI joins may behave differently because they are now better optimized
    You may see different EXPLAIN results with ANSI JOINs. If your vacuuming and statistics are up to date, these queries should be better optimized. If you see a behavior change which results in a much slower query then you must re-examine the "usual suspects." Check the data types in the qualifications and the indexes to be sure that they match, etc.
  • MOVE/FETCH 0 now does nothing; return value of MOVE/FETCH 1/0 varies based on the current offset in the cursor
    Remember that the difference between MOVE and FETCH only is that FETCH returns data and MOVE does not. Both move the CURSOR to an enumerated row in a result set. Previously FETCH 0 was PostgreSQL specific was equivalent to FETCH ALL. You could not use FETCH 0 or FETCH RELATIVE 0 to retrieve the current row. The new behavior is that these commands will now fetch the row the cursor is currently pointing to.
  • COPY now can process carriage-return/line-feed and carriage-return terminated files. Literal carriage-returns and line-feeds are no longer accepted as data values; use \r and \n instead.
    If you are counting on using old files created by COPY and have data which contains literal carriage-returns and line-fees it is time to break out the perl scripts to convert those characters to \r and \n.
  • Trailing spaces are now trimmed when converting from CHAR(n) to VARCHAR(n) / TEXT
    If you use CHAR(n) and are counting on the trailing spaces be sure to not use the explicit or implicit casts to TEXT or VARCHAR(n). TEXT and VARCHAR(n) data types do not keep the trailing blanks.
  • FLOAT(p) now measures 'p' in bits, not digits
    This change now brings PostgreSQL back in line with the SQL standards. Check your float fields if they are defined with precision to ensure you have the right storage declared.
  • 'now' and 'today' will no longer work as a column default; now() should be used instead
    Review any timestamp fields with now or today as a default. Change them to use the correct function call now() or special variables, current_timestamp, current_date, etc.

    Contributors: Tom Lane tgl at sss.pgh.pa.us, Bruce Momjian pgman at candle.pha.pa.us, elein at varlena.com
    Concurrent Updates
    [GENERAL] postgresql locks the whole table! 03-Dec-2003

    Given a table like the one below, two parallel inserts on separate connections using BLOCK transactions are performed. The second insert will be locked out until the first transaction is completed.

    CREATE TABLE take2
    (
      id serial not null,
      timestamp timestamp NOT NULL DEFAULT now(),
      description text,
      FOREIGN KEY (asset_id) REFERENCES public.asset (id),  
      ON UPDATE CASCADE ON DELETE CASCADE,
      PRIMARY KEY(id)
    );
    
    Connection 1 Connection 2
    BEGIN;
    insert into table take2
      values(default, default, 't1');
      BEGIN;
    insert into table take2
      values(default, default, 't2');
    ...other commands...   Locked out
    ...COMMIT;   Unlocks
    idle   ...other commands...
    idle   ...COMMIT;

    When PostgreSQL references public.asset, it creates an exclusive lock on the table--the equivalent of SELECT FOR UPDATE. This exclusive lock can be seen by selecting from pg_locks.

     relation | database | transaction |  pid  | mode             | granted
    ----------+----------+-------------+-------+------------------+---------
        39356 |    34862 |        NULL | 18671 | AccessShareLock  | t
        39356 |    34862 |        NULL | 18671 | RowExclusiveLock | t
         NULL |     NULL |        9914 | 18671 | ExclusiveLock    | t
        39354 |    34862 |        NULL | 18671 | AccessShareLock  | t
        34886 |    34862 |        NULL | 18671 | AccessShareLock  | t
        34886 |    34862 |        NULL | 18671 | RowShareLock     | t
        16759 |    34862 |        NULL | 18671 | AccessShareLock  | t
    (7 rows)
    
    PostgreSQL does not implement shared read locks on the row level. (Ed: yet?)

    To avoid this type of conflict, be absolutely sure that your transactions are a small as possible. Never have user interaction when a transaction is begun. (Humans are so unreliable :-) This advice holds for all kinds of transactions.

    In this particular case, where the transaction is already as small as possible, you can defer the foreign key check until commit time. There are two ways to do this, depending on the actions within your transaction. You can define the FOREIGN KEY to be deferrable or you can SET CONSTRAINTS DEFERRED for the transaction.

       CREATE TABLE take2
       (
         id serial not null,
         timestamp timestamp NOT NULL DEFAULT now(),
         description text,
         FOREIGN KEY (asset_id) REFERENCES public.asset (id) DEFERRED,
         ON UPDATE CASCADE ON DELETE CASCADE,
         PRIMARY KEY(id)
       );
    
    Setting the transaction constraints:
       BEGIN
       SET CONSTRAINTS ALL DEFERRED
       insert into table take2
    values(default, default, 't1'); ... COMMIT;

    Contributors: Dr NoName spamacct11 at yahoo.com, Mike Mascari mascarm at mascari.com, Alistair Hopkins alistair at berthengron.co.uk, Jan Wieck JanWieck at Yahoo.com, Scott Ribe scott_ribe at killerbytes.com, Greg Stark gsstark at mit.edu, Bruce Momjian pgman at candle.pha.pa.us
    Reverse Function
    [SQL] Equivalent of Reverse() functions 27-Nov-2003

    Here are two nice implementations of a function which reverses the character order of a text field. The first, by Joe Conway, wins in simplicity, but requires plperl. The second, by Greg Sabino Mullane, is a nice example of using loops in plpgsql.

    	CREATE OR REPLACE FUNCTION reverse(TEXT) 
    	RETURNS TEXT AS '
    	   RETURN reverse($_[0]);
    	' LANGUAGE 'plperl';
    
    	CREATE OR REPLACE FUNCTION reverse(TEXT) RETURNS TEXT AS '
    	   DECLARE
    	      original ALIAS FOR $1;
    	      reversed TEXT := \'\';
    	      onechar  VARCHAR;
    	      mypos    INTEGER;
    	   BEGIN
    	      SELECT LENGTH(original) INTO mypos;
    	      LOOP
    	         EXIT WHEN mypos < 1;
    	         SELECT substring(original FROM mypos FOR 1) INTO onechar;
    	         reversed := reversed || onechar;
    	         mypos := mypos -1;
    	      END LOOP;
    	      RETURN reversed;
    	   END
    	' LANGUAGE plpgsql IMMUTABLE RETURNS NULL ON NULL INPUT;
    

    Contributors: Kumar sgnerd at yahoo.com.sg, Joe Conway mail at joeconway.com, greg at turnstep.com
    Number of Rows Affected
    [GENERAL] plpgsql question 25-Nov-2003

    To determine the number of rows affected by a statement in a plpgsql function, you can use GET DIAGNOSTICS. This is very handy if you want to check the number of rows from a statement which affects many rows, either directly or indirectly via a trigger or a rule.

       DECLARE
         RowsAffected INTEGER;
       BEGIN
         -- DO your statement
         GET DIAGNOSTICS RowsAffected = ROW_COUNT;
       END
    

    Contributors: Brian Hirt bhirt mobygames.com, Michael A Nachbaur mike@nachbaur.com


  • Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
    Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009

    Top
    Google
    Search General Bits & varlena.com Search WWW