|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Contrary to what Oracle and other unnamed database systems say, an empty string, '', is not the same as NULL. If you are used to other systems, though, it can be easy to make mistakes. So if you find that you are selecting rows with NULL values in columns where you do not expect to see them, you may have a problem with empty strings in the place of NULLs. You will need to take a look at the data source to prevent future bad data, find the bad rows and fix them, and then write on the chalkboard 100 times "NULL is not the same as an empty string" :-)

To see whether the values are NULL or an empty string, people submitted a number of interesting queries. The variety of things people chose to test to find out about the column was surprising. Some tested for NULL, some for NOT NULL, some for an empty string, some for not an empty string, and some for an operation over the column. Some of these queries are better suited to this task than others; I prefer the coalesce() option, myself. But each selection shows a different way to look at the problem, and that is interesting. Here are a few of them.
   -- If you get back a row then jobcomment is not null and
   -- probably has the empty string in it ...
   SELECT jobcomment
   FROM tblJobIncharge
   WHERE id=51 AND jobcomment IS NOT NULL;

   -- To make sure that jobcomment is not [an empty string] you can try this query:
   -- If jobcomment is an empty string then you will get '0' in the job comment field,
   -- if it is really null then you will get nothing.
   SELECT jassignedid, ascii(JobComments)
   FROM tblJobIncharge
   WHERE JobComments IS NOT NULL
     AND projectid >= 50 AND projectid <= 100;

   -- Any results in that column that are null will be displayed as 'X'.
   SELECT jassignedid, coalesce(JobComments,'X')
   FROM tblJobIncharge
   WHERE projectid >= 50 AND projectid <= 100;

   -- This will not only show you real nulls (since null in a || operation gives
   -- null) but also show the string between the 'xx' and 'xx'.
   SELECT jassignedid, 'xx' || JobComments || 'xx'
   FROM tblJobIncharge
   WHERE projectid >= 50 AND projectid <= 100;

   -- The length of the column will tell how many spaces there are, if there are any.
   SELECT jassignedid, JobComments, length(JobComments)
   FROM tblJobIncharge
   WHERE JobComments IS NOT NULL
     AND projectid >= 50 AND projectid <= 100;

Do not forget to check your data sources for misplaced semantics. For example, form-based applications will submit empty strings which you may want interpreted as NULL. In PostgreSQL this is how you could update based on a blank input:

   UPDATE tblJobIncharge
   SET JobComments = NULL
   WHERE LENGTH(RTRIM(JobComments)) = 0;
The question was raised whether views would become "obsolete" with the advent of the Set Returning Functions (SRF). (Actually in question are functions returning sets of rows, FRSR, which are a subset of SRFs :-) (See Issue #26 for information about functions returning sets, functions returning rows and functions returning sets of rows.)

Views will not become obsolete due to FRSR for several reasons. The first reason is that views are well understood and very commonly used. They are easy to construct. People are just learning how to construct FRSRs, and they are a new feature. The feature is incredibly useful, but if a view would be adequate, a view can and should be used for simplicity's sake if for no other reason.

But there is another reason. SQL is optimized very well by PostgreSQL, and the planner can optimize views because a view's definition is simply SQL. The planner, however, cannot optimize functions. Functions are opaque to the planner and can produce undeterminable variations, so that a plan cannot be accurately determined.
SQL functions are also not optimized. However, it is possible that in the future, SQL functions may be brought inline into the calling SQL statement and then optimized with the rest of the statement. This is not implemented at this time.
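To make the contrast concrete, here is a minimal sketch. The table and object names are illustrative, not from the thread; the pattern assumes a table like the tblJobIncharge used above.

```sql
-- A view: its definition is plain SQL, so the planner can merge an
-- outer WHERE clause into the underlying scan.
CREATE VIEW recent_jobs AS
   SELECT jassignedid, JobComments
   FROM tblJobIncharge
   WHERE projectid >= 50;

-- The qualification on jassignedid can be pushed down into the view.
SELECT * FROM recent_jobs WHERE jassignedid = 7;

-- An equivalent function returning a set of rows: the body is opaque
-- to the planner, so the function is executed as-is and the outer
-- qualification is applied only afterwards.
CREATE FUNCTION recent_jobs_fn() RETURNS SETOF tblJobIncharge AS '
   SELECT * FROM tblJobIncharge WHERE projectid >= 50;
' LANGUAGE 'sql';

SELECT * FROM recent_jobs_fn() AS t WHERE t.jassignedid = 7;
```

Both queries return the same rows; the difference is only in how much the planner can see and optimize.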
When using CURSORs you need to understand where the CURSOR lands after doing a FETCH. The FETCH syntax is:

   FETCH [ direction ] [ count ] { FROM | IN } cursor

The primary rule is that the cursor lands on the last row returned. If the cursor is at ROW[n], then a relative fetch begins on the next row, ROW[n+1]. The exception to this rule is when you perform a FETCH ALL. That statement leaves the cursor one past the LAST row, prepared to do a reverse fetch.
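A minimal sketch of where the cursor lands, assuming a hypothetical table t whose column i holds the values 1 through 10 (the table and cursor names are illustrative):

```sql
BEGIN;
DECLARE cur CURSOR FOR SELECT i FROM t ORDER BY i;

FETCH 3 FROM cur;           -- returns rows 1,2,3; cursor rests on row 3
FETCH 2 FROM cur;           -- begins on the next row: returns rows 4,5
FETCH ALL FROM cur;         -- returns rows 6..10; cursor is one past row 10
FETCH BACKWARD 2 FROM cur;  -- reverse fetch: returns rows 10,9

COMMIT;
```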
An array column in a table sometimes needs constraints. In general, it is not possible to create a constraint on individual elements of an array except by using a constraint function. In the case in this thread, the elements of the array letters must be in the id column of table lettervals. This was the initial try. It will NOT work:

   create table foo ( letters char(1)[3] REFERENCES lettervals(id) ...

This doesn't work because it is trying to reference the full array value against a single letter value. A full array and a single letter are not the same type.

Nigel Andrews suggested the following trigger function. This function loops over the array and checks whether each array element is in the lettervals table. If it is not, an error is raised, aborting the INSERT.

   CREATE FUNCTION letter_fk () RETURNS TRIGGER AS '
   DECLARE
      ind integer;
   BEGIN
      FOR ind IN 1 .. array_upper(NEW.letters, 1) LOOP
         PERFORM 1 FROM lettervals WHERE id = NEW.letters[ind];
         IF NOT FOUND THEN
            RAISE EXCEPTION ''My foreign key constraint violation'';
         END IF;
      END LOOP;
      RETURN NEW;
   END;
   ' LANGUAGE 'plpgsql';

   CREATE TRIGGER lettercheck BEFORE INSERT ON foo
      FOR EACH ROW EXECUTE PROCEDURE letter_fk();

Joe Conway reminds us that he is working on a fairly major upgrade to array support for 7.4 which will move some of the contrib array functionality into the backend. We have seen some of his functions in past issues of General Bits dealing with arrays and look forward to this addition in 7.4.
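To see the trigger in action, here is a quick sketch; the sample data is illustrative, assuming lettervals holds the ids 'a', 'b' and 'c':

```sql
CREATE TABLE lettervals (id char(1) PRIMARY KEY);
INSERT INTO lettervals VALUES ('a');
INSERT INTO lettervals VALUES ('b');
INSERT INTO lettervals VALUES ('c');

CREATE TABLE foo (letters char(1)[3]);

-- Every element is in lettervals, so the trigger lets this row through.
INSERT INTO foo VALUES ('{a,b,c}');

-- 'z' is not in lettervals, so the trigger raises the exception
-- and the INSERT is aborted.
INSERT INTO foo VALUES ('{a,b,z}');
```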
When doing bulk loading, sometimes the data is bad and the entire transaction aborts, leaving you with all of the unprocessed data to go through. This can be quite frustrating. There is no facility to ignore constraints on the table. One alternative is to use a script and insert each row separately with autocommit on. This will let the good data in, and you can store the "bad" rows for clean-up later.

Another suggestion for handling this came from Jan Wieck. He suggested loading into a "staging table" which has the same characteristics as the target table, but without the constraints and references. With the staging table loaded, you can query for data which would not match the constraints you want enforced, and you can correct the data or delete the rows. Once all of the data in the staging table is correct, a single statement will put the data in place:

   INSERT INTO realtable SELECT * FROM stagingtable;

This is a good method because the data gets into the database quickly. You can use the existing table definition to construct your test queries. And you can also decide how to handle bad data rows--either by cleaning them up or by deleting them.
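One way to build the staging table is from the target table's own definition: CREATE TABLE ... AS with a false qualification copies the column names and types but none of the constraints. A minimal sketch, assuming a hypothetical realtable with a NOT NULL column named somecolumn and a data file path that is purely illustrative:

```sql
-- Copy realtable's columns, but not its constraints, into an empty staging table.
CREATE TABLE stagingtable AS SELECT * FROM realtable WHERE false;

-- Bulk load into the unconstrained staging table; bad rows cannot abort anything yet.
COPY stagingtable FROM '/tmp/bulkdata.txt';

-- Find rows that would violate, for example, the NOT NULL constraint on realtable.
SELECT * FROM stagingtable WHERE somecolumn IS NULL;

-- Correct or delete the offenders, then move the clean data over in one statement.
DELETE FROM stagingtable WHERE somecolumn IS NULL;
INSERT INTO realtable SELECT * FROM stagingtable;
```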
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009 |