|
||||||||||||
|
||||||||||||
|
||||||||||||
|
||||||||||||
|
||||||||||||
A revision table usually has an item and a start date and and end date. An end_date of NULL usually means that the item is the most current version of the item. DROP TABLE items_118; CREATE TABLE items_118 ( item_id SERIAL, start_date DATE, end_date DATE, PRIMARY KEY (item_id, start_date) ); Selecting out the latest version is easy but we have to account for both open and close ended date intervals. Choose the ones with an open end date or a where today is between the start and end date. SELECT item_id, max(start_date), end_date FROM items_118 WHERE end_date IS NULL OR (end_date IS NOT NULL AND end_date >= current_date ) GROUP BY item_id, end_date ORDER BY item_id; Unfortunately this table definition alone does not prevent input of overlapping date intervals for any given item. We find items in the data with values like these: =# select * from items_118; item_id | start_date | end_date ---------+------------+------------ 1 | 2005-01-01 | 2005-03-01 1 | 2005-02-01 | 2005-03-15 1 | 2005-03-01 | 2 | 2005-04-01 | 2005-06-01 2 | 2005-05-01 | 3 | 2005-07-01 | 4 | 2005-08-01 | 5 | 2005-08-01 | 2005-12-01 (8 rows) Notice the overlaps in items 1 and 2. Oops. The first thing we need to do is to be able to see the overlaps clearly and then we need to fix them. To see which item overlaps which item, we want to draw together the item (first) and the item it overlaps with (second). This means a self join on the item_id. To identify the overlapping item we want the one where the first start date is before the second's start date and the second's end date is null. Or if the second's end date is not null, then if our end date is after its start date. Pay attention to this query. We will use it later. SELECT i.item_id, i.start_date, i.end_date, i2.start_date as overlap_start_date, i2.end_date as overlap_end_date FROM items_118 i JOIN items_118 i2 USING (item_id) WHERE i.start_date < i2.start_date AND (i.end_date IS NULL OR i.end_date > i2.start_date); If our data set were just this small we could do a few updates by hand and be done with it. However, if you are dealing with a great many records, it is worthwhile to write the not so trivial update statement. Using the query we created above, we select out the old and new start and end dates. If the item_id and start_date matches our overlapped record, we update the record's end_date with the new start date minus one day. UPDATE items_118 SET end_date = new_start - interval '1 day' FROM ( SELECT i.item_id, i.start_date AS old_start, i.end_date AS old_end, i2.start_date AS new_start, i2.end_date as new_end FROM items_118 i JOIN items_118 i2 USING (item_id) WHERE i.start_date < i2.start_date AND (i.end_date IS NULL OR i.end_date > i2.start_date) ) foo WHERE items_118.item_id = foo.item_id AND items_118.start_date = old_start;And then we double check that was right by looking at the overlapping records query. When you have a problem like this, it is correctable. But why not prevent it in the first place. An insert trigger can go back and fix an existing overlapping record right away, preventing the overlapping from occurring at all. The trigger function does much the same thing as the UPDATE statement did. It uses the same query we used before to bring up any overlapping record. If it find one, it updates the older record before committing the update on the new one.
CREATE OR REPLACE FUNCTION no_overlap() RETURNS TRIGGER AS $$ DECLARE r record; BEGIN SELECT INTO r i.item_id, i.start_date AS old_start, i.end_date AS old_end, i2.start_date AS new_start, i2.end_date AS new_end FROM items_118 i JOIN items_118 i2 USING (item_id) WHERE i.item_id = NEW.item_id AND i.start_date < i2.start_date AND (i.end_date IS NULL OR i.end_date > i2.start_date); IF FOUND AND r.old_end IS NULL OR r.old_end > r.new_start THEN UPDATE items_118 SET end_date = r.new_start - interval '1 day' WHERE r.item_id=items_118.item_id AND r.old_start = items_118.start_date; END IF; RETURN NEW; END; $$ LANGUAGE 'plpgsql'; CREATE TRIGGER nooverlaps AFTER INSERT ON items_118 FOR EACH ROW EXECUTE PROCEDURE no_overlap();Now any further updates of overlapping date intervals will be fixed and NULL ended end_dates will be set to the correct end date when the next revision is inserted.. The entire code for this article is available.
For scheduling applications is is sometimes necessary to generate a set of dates and times for repeating appointments. For example, you may want to schedule something for every two weeks, or every six months or every two days.
The interval arithmetic is not that obvious, but once you know the
trick it is simple.
To start off, we wrote a function that generates dates between
a start date and an end date. We threw in an increment so that we
could have every two days or every seven days. It is the basic
-- Generate dates for recurring intervals CREATE OR REPLACE FUNCTION every_day(start_date date,end_date date,incr integer) RETURNS SETOF date AS $$ DECLARE N integer=0; next_date date = start_date; BEGIN WHILE end_date > next_date LOOP RETURN NEXT next_date; next_date = start_date + (N + incr) * interval '1 day'; N = N + incr; END LOOP; RETURN; END; $$ LANGUAGE 'plpgsql'; -- every 2 days from today for 10 days. =# select * from every_day( current_date, current_date + 10, 2 ); every_day ------------ 2005-08-28 2005-08-30 2005-09-01 2005-09-03 2005-09-05 (5 rows) That is helpful. We could write an every_hour(), every_year(), every_month() function. But that would be tedious. Suppose we passed in the interval unit to a generic every_what() function. The only trick in this is to convert the interval unit into a interval type of one unit. We can do this by declaring a variable of type interval and constructing its assignment from the unit parameter.
CREATE OR REPLACE FUNCTION every_what( start_date date,end_date date,incr integer, unit text) RETURNS SETOF date AS $$ DECLARE N integer=0; next_date date = start_date; int_type interval = '1 ' || unit; BEGIN WHILE end_date > next_date LOOP RETURN NEXT next_date; next_date = start_date + (N + incr) * int_type; N = N + incr; END LOOP; RETURN; END; $$ LANGUAGE 'plpgsql'; -- select this day for each month for five months =# select * from every_what( current_date, (current_date + interval '5 months')::date, 1, 'months' ); every_what ------------ 2005-08-28 2005-09-28 2005-10-28 2005-11-28 2005-12-28 (5 rows) The functions here are written for date. If you write them for timestamp, you can cast the result to date, thus making the function even more versatile. This is left as an exercise for the reader.
|
||||||||||||
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in! Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009 |