|
|||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||
If you've been putting off upgrading your PostgreSQL installation, now is the time to do it. Especially if you are running a 7.4 installation, it is high time to move forward to 8.0 or 8.1. Here are some helpful things to remember to do when upgrading. As with any software upgrades, upgrading PostgreSQL will require planning, execution and testing. Do not skip any of those steps. Planning:
Execution:
Testing:
Contrary to common sense, but in keeping with the SQL Standards, the NOT IN operator acts in surprising ways when NULLs are involved. The IN and the NOT IN operators are not inverses of each other when NULLs are in the comparison list. When the left operand exists in the list, the result is true or false regardless of the existence of NULLs. The left operand is 1 in this case; the right hand operand is the list. In these examples, the left operand exists in the list.
When the left operand is not in the list, then your results will not just be true or false--they may also be NULL. The results will be true or false if and only if there are no NULLs in the list. However, if there is a NULL in the list, then the result will be NULL.
If the left operand is NULL, all of the IN results are NULL as expected.
This inconsistency can cause particular headaches with large lists, especially those created from a select clause. If you are unsure about whether the list created from a select clause may have a NULL value, you are probably already in trouble. In this example, we have two tables: CREATE TABLE numeral (dec_num integer, en_str text); CREATE TABLE roman (dec_num integer, rm_str text); INSERT INTO numeral VALUES (1, 'one'); INSERT INTO numeral VALUES (2, 'two'); INSERT INTO numeral VALUES (3, 'three'); INSERT INTO numeral VALUES (4, 'four'); INSERT INTO roman VALUES (1, 'I'); INSERT INTO roman VALUES (2, 'II'); INSERT INTO roman VALUES (3, 'III'); INSERT INTO roman VALUES (4, 'IV'); INSERT INTO roman VALUES (NULL, 'V');These two queries using NOT IN return NULL because of the NULL in r.dec_num. In the first case it is in the left operand; the second because there is a NULL in the list. It is surprising because it returns no results at all! SELECT r.dec_num, r.rm_str FROM roman r WHERE r.dec_num NOT IN (SELECT dec_num FROM numeral); dec_num | rm_str ---------+-------- (0 rows) SELECT n.dec_num, n.en_str FROM numeral n WHERE n.dec_num NOT IN (SELECT dec_num FROM roman ); dec_num | en_str ---------+-------- (0 rows)If the NOT IN in the above queries were changed to IN then you would see the rows 1 through 4 returned. There are several ways to deal with these issues. The first and formost ways to deal with data integrity in SQL is to use referential integrity and primary keys. For example, both tables should have been defined with PRIMARY KEYS. This would insure UNIQUE and NOT NULL columns in the first place. Also, with the UNIQUE on the numeral table then you could have added a REFERENCES numeral (dec_num) to the table definition of roman. The other way is to use JOIN instead of IN--its NULL behavior should be more intuitive. A JOIN on dec_num would bring in the result fastest.
SELECT dec_num, en_str, rm_str FROM numeral n JOIN roman USING (dec_num); dec_num | en_str | rm_str ---------+--------+-------- 1 | one | I 2 | two | II 3 | three | III 4 | four | IV (4 rows) A full outer join shows all of the rows. NULL values are in the columns where the JOIN did not succeed. SELECT n.dec_num AS n_dec_num, r.dec_num AS r_dec_num, n.en_str, r.rm_str FROM numeral n FULL OUTER JOIN roman r USING (dec_num); n_dec_num | r_dec_num | en_str | rm_str -----------+-----------+--------+-------- 1 | 1 | one | I 2 | 2 | two | II 3 | 3 | three | III 4 | 4 | four | IV 5 | | six | <-- in numeral only | | | V <-- in roman only (6 rows) To get all of the rows where numeral.dec_num is not in roman.dec_num, you can do a FULL OUTER JOIN and qualify the selection to choose those rows where there is a null in roman.dec_num. SELECT n.dec_num, r.dec_num, en_str, rm_str FROM numeral n FULL OUTER JOIN roman r USING (dec_num) WHERE n.dec_num IS NULL; dec_num | dec_num | en_str | rm_str ---------+---------+--------+-------- | | | V (1 row)Tom Lane says,"The above behavior is correct per spec. Feel free to argue its consistency with the SQL committee ;-)".
|
|||||||||||||||||||||||||||||||||||||||||
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009 |