|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Contrary to what Oracle and other unnamed database systems say, an empty string, '', is not the same as NULL. If you are used to other systems, though, it can be easy to make mistakes. So if you find that you are selecting rows with NULL values in columns where you do not expect to see them, you may have a problem with empty strings in the place of NULLs. You will need to take a look at the data source to prevent future bad data, find the bad rows and fix them, and then write on the chalkboard 100 times "NULL is not the same as an empty string" :-)

To see whether the values are NULL or an empty string, people submitted a number of interesting queries. The variety of things people chose to test to find out about the column was surprising. Some tested for NULL, some for NOT NULL, some for an empty string, some for not an empty string, and some for an operation over the column. Some of these queries are better suited to this task than others; I prefer the coalesce() option, myself. But each selection shows a different way to look at the problem, and that is interesting. Here are a few of them.
   -- If you get back a row then jobcomment is not null and
   -- probably has the empty string in it ...
   SELECT jobcomment
   FROM tblJobIncharge
   WHERE id=51 AND jobcomment IS NOT NULL;

   -- To make sure that jobcomment is not [an empty string] you can try this query:
   -- If jobcomment is an empty string then you will get '0' in the job comment field,
   -- if it is really null then you will get nothing.
   SELECT jassignedid, ascii(JobComments)
   FROM tblJobIncharge
   WHERE JobComments IS NOT NULL
     AND projectid >= 50 AND projectid <= 100;

   -- Any results in that column that are null will be displayed as 'X'.
   SELECT jassignedid, coalesce(JobComments,'X')
   FROM tblJobIncharge
   WHERE projectid >= 50 AND projectid <= 100;

   -- This will not only show you real nulls (since null in a || operation gives
   -- null) but also show the string between the 'xx' and 'xx'.
   SELECT jassignedid, 'xx' || JobComments || 'xx'
   FROM tblJobIncharge
   WHERE projectid >= 50 AND projectid <= 100;

   -- The length of the column will tell how many spaces there are, if there are any.
   SELECT jassignedid, JobComments, length(JobComments)
   FROM tblJobIncharge
   WHERE JobComments IS NOT NULL
     AND projectid >= 50 AND projectid <= 100;

Do not forget to check your data sources for misplaced semantics. For example, form-based applications will submit empty strings which you may want interpreted as NULL. In PostgreSQL this is how you could update based on a blank input:

   UPDATE tblJobIncharge
   SET JobComments = NULL
   WHERE LENGTH(RTRIM(JobComments)) = 0;
The question was raised whether views would become "obsolete" with the advent of the Set Returning Functions (SRF). (Actually in question are functions returning sets of rows, FRSR, which are a subset of SRFs :-) (See Issue #26 for information about functions returning sets, functions returning rows and functions returning sets of rows.)

Views will not become obsolete due to FRSR for several reasons. The first reason is that views are well understood and very commonly used. They are easy to construct. People are just learning how to construct FRSRs, and they are a new feature. The feature is incredibly useful, but if a view would be adequate, a view can and should be used for simplicity's sake if for no other reason.

But there is another reason. SQL is optimized very well by PostgreSQL, and the planner can optimize views because a view's definition is simply SQL. The planner, however, cannot optimize functions. Functions are opaque to the planner and can produce undeterminable variations, so that a plan cannot be accurately determined.
SQL functions are also not optimized. However, it is possible that in the future, SQL functions may be brought inline into the calling SQL statement and then optimized with the rest of the statement. This is not implemented at this time.
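To make the contrast concrete, here is a minimal sketch. The table and object names are illustrative, not from the thread; the pattern assumes a table like the tblJobIncharge used above.

```sql
-- A view: its definition is plain SQL, so the planner can merge an
-- outer WHERE clause into the underlying scan.
CREATE VIEW recent_jobs AS
   SELECT jassignedid, JobComments
   FROM tblJobIncharge
   WHERE projectid >= 50;

-- The qualification on jassignedid can be pushed down into the view.
SELECT * FROM recent_jobs WHERE jassignedid = 7;

-- An equivalent function returning a set of rows: the body is opaque
-- to the planner, so the function is executed as-is and the outer
-- qualification is applied only afterwards.
CREATE FUNCTION recent_jobs_fn() RETURNS SETOF tblJobIncharge AS '
   SELECT * FROM tblJobIncharge WHERE projectid >= 50;
' LANGUAGE 'sql';

SELECT * FROM recent_jobs_fn() AS t WHERE t.jassignedid = 7;
```

Both queries return the same rows; the difference is only in how much the planner can see and optimize.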
When using CURSORs you need to understand where the CURSOR lands after doing a FETCH. The FETCH syntax is:

   FETCH [ direction ] [ count ] { FROM | IN } cursor

The primary rule is that the cursor lands on the last row returned. If the cursor is at ROW[n], then a relative fetch begins on the next row, ROW[n+1]. The exception to this rule is when you perform a FETCH ALL. That statement leaves the cursor one past the LAST row, prepared to do a reverse fetch.
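A minimal sketch of where the cursor lands, assuming a hypothetical table t whose column i holds the values 1 through 10 (the table and cursor names are illustrative):

```sql
BEGIN;
DECLARE cur CURSOR FOR SELECT i FROM t ORDER BY i;

FETCH 3 FROM cur;           -- returns rows 1,2,3; cursor rests on row 3
FETCH 2 FROM cur;           -- begins on the next row: returns rows 4,5
FETCH ALL FROM cur;         -- returns rows 6..10; cursor is one past row 10
FETCH BACKWARD 2 FROM cur;  -- reverse fetch: returns rows 10,9

COMMIT;
```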
An array column in a table sometimes needs constraints. In general, it is not possible to create a constraint on individual elements of an array except by using a constraint function. In the case in this thread, the elements of the array letters must be in the id column of table lettervals. This was the initial try. It will NOT work:

   create table foo ( letters char(1)[3] REFERENCES lettervals(id) ...

This doesn't work because it is trying to reference the full array value against a single letter value. A full array and a single letter are not the same type.

Nigel Andrews suggested the following trigger function. This function loops over the array and checks whether each array element is in the lettervals table. If it is not, an error is raised, aborting the INSERT.

   CREATE FUNCTION letter_fk () RETURNS TRIGGER AS '
   DECLARE
      ind integer;
   BEGIN
      FOR ind IN 1 .. array_upper(NEW.letters, 1) LOOP
         PERFORM 1 FROM lettervals WHERE id = NEW.letters[ind];
         IF NOT FOUND THEN
            RAISE EXCEPTION ''My foreign key constraint violation'';
         END IF;
      END LOOP;
      RETURN NEW;
   END;
   ' LANGUAGE 'plpgsql';

   CREATE TRIGGER lettercheck BEFORE INSERT ON foo
      FOR EACH ROW EXECUTE PROCEDURE letter_fk();

Joe Conway reminds us that he is working on a fairly major upgrade to array support for 7.4 which will move some of the contrib array functionality into the backend. We have seen some of his functions in past issues of General Bits dealing with arrays and look forward to this addition in 7.4.
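To see the trigger in action, here is a quick sketch; the sample data is illustrative, assuming lettervals holds the ids 'a', 'b' and 'c':

```sql
CREATE TABLE lettervals (id char(1) PRIMARY KEY);
INSERT INTO lettervals VALUES ('a');
INSERT INTO lettervals VALUES ('b');
INSERT INTO lettervals VALUES ('c');

CREATE TABLE foo (letters char(1)[3]);

-- Every element is in lettervals, so the trigger lets this row through.
INSERT INTO foo VALUES ('{a,b,c}');

-- 'z' is not in lettervals, so the trigger raises the exception
-- and the INSERT is aborted.
INSERT INTO foo VALUES ('{a,b,z}');
```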
When doing bulk loading, sometimes the data is bad and the entire transaction aborts, leaving you with all of the unprocessed data to go through. This can be quite frustrating. There is no facility to ignore constraints on the table. One alternative is to use a script and insert each row separately with autocommit on. This will let the good data in, and you can store the "bad" rows for clean-up later.

Another suggestion for handling this came from Jan Wieck. He suggested loading into a "staging table" which has the same characteristics as the target table, but without the constraints and references. With the staging table loaded, you can query for data which would not match the constraints you want enforced, and you can correct the data or delete the rows. Once all of the data in the staging table is correct, a single statement will put the data in place:

   INSERT INTO realtable SELECT * FROM stagingtable;

This is a good method because the data gets into the database quickly. You can use the existing table definition to construct your test queries. And you can also decide how to handle bad data rows--either by cleaning them up or by deleting them.
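One way to build the staging table is from the target table's own definition: CREATE TABLE ... AS with a false qualification copies the column names and types but none of the constraints. A minimal sketch, assuming a hypothetical realtable with a NOT NULL column named somecolumn and a data file path that is purely illustrative:

```sql
-- Copy realtable's columns, but not its constraints, into an empty staging table.
CREATE TABLE stagingtable AS SELECT * FROM realtable WHERE false;

-- Bulk load into the unconstrained staging table; bad rows cannot abort anything yet.
COPY stagingtable FROM '/tmp/bulkdata.txt';

-- Find rows that would violate, for example, the NOT NULL constraint on realtable.
SELECT * FROM stagingtable WHERE somecolumn IS NULL;

-- Correct or delete the offenders, then move the clean data over in one statement.
DELETE FROM stagingtable WHERE somecolumn IS NULL;
INSERT INTO realtable SELECT * FROM stagingtable;
```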
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009 |