Varlena, LLC | PostgreSQL General Bits Newsletter

PostgreSQL Training,
Consulting & Support

By A. Elein Mustain

29-Mar-2004 Issue: 68

Archives |

General Tidbits |

Google General Bits |

Prev |

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

What's a Varlena?
Variable Length Arrays	27-Mar-2004

My domain name is varlena.com and my consulting company is named Varlena, LLC. I've been asked many times about the significance of the name. For those who do not know about varlenas and those who want to know more about them, this article explains all.

varlena (vahr-lee-nah)

A varlena is a variable length array. It is a key C data structure used in the code for the Postgres database system. It is seen in Postgres, Postgres95, PostgreSQL, Illustra and Informix Universal Server Databases. The Illustra Database and Informix Universal Server were based on Postgres.

I first heard of varlenas while working at Illustra in 1993. I worked closely with varlenas and data type conversions as part of my work with gateways and the DataBlade (C) API. Rumor also had it that mer's cat had been named varlena after that euphoniously named data structure.

The name varlena originated at the University of California, Berkeley with the Postgres project. In PostgreSQL code, the earliest reference is in pgsql/src/include/postgres.h CVS version 1.1 imported October 31, 1996. There are some google references to varlena in other code occasionally, however, I suspect some Berkeley graduates were involved in that code. There is also a reference to a character named Varlena in EverQuest (tm) and I have the same suspicion about Berkeley graduates. There is also the occassional use of Varlena as a proper name. In fact people sometimes assume it is my name, but it is not. (My name is hard enough for most people to pronounce. (Elein ee-leen, if you want to know :-)

A varlena is the basis of variable length data throughout Postgres and is particularly key to the concept of user defined data types. The varlena structure is defined to hold any kind of data. It is used as the basis for all of the variable length data types. The ability to map the bytes following the length word to any C data structure enables user defined data types, numerics, arrays, text types, bytea types, etc. to be stored and carried throughout the system in a uniform way.

The data structure is defined in ...pgsql/src/include/c.h as:

	struct varlena
	{
	   int32    vl_len;
	   char     vl_dat[1];
	};

The first four bytes store the length and the fifth byte is the start of the variable length array. The data portion is allocated when the data item is created. It looks like this when it has a value:

In C functions, the following macros enable access to the individual pieces of the varlena:

	VARSIZE(varlenaptr);  # Size of entire structure
	VARDATA(varlenaptr);  # Pointer to data element
	VARHDRSZ(varlenaptr); # Size of length portion of structure

With the advent of TOAST, a varlena is wrapped in a varattrib data structure. This structure enables TOASTing information to be associated with a varlena. That information determines whether it data is TOASTable and whether it is TOASTED and provides access information if it is TOASTed.

Because any data can be stored in the data area of a varlena, a varlena notoriously is not null terminated. Some of the data types that are stored in varlenas data areas, however, do store null terminated strings in them. For example, cstrings do just that. But null termination is just an attribute of that particular data type held by the varlena. Don't count on it being there in general.

Contributors: elein at varlena.com

Rules and views and cascading deletes
[GENERAL] bug in delete rule ? 25-Mar-2004

Delete rules that delete from multiple tables where those tables are the basis of the view creates a catch-22 situation.

Rules, as you know, are also known as rewrite rules. When you create a rule on delete from myview ... that code is literally substituted with the rule logic.

For example, here is a view joins two tables on the id column:

	CREATE VIEW all_centers AS
	   SELECT t0.center_id, t0.institution, t0.category,
	      t0.street_number, t0.street_name, t0.postcode,
	      t0.district_name, t0.city_name, t0.province_name,
	      t0.country, 
	      t1.update_date, t1.number_of_beds, t1.chief,
	      t1.phone_center, t1.fax_center
	   FROM
	      center t0, center_var t1
	   WHERE
	      t0.center_id = t1.center_id;

A rule is created on this view such that a delete on the view will delete the rows in both center and center_var. NOTE: This does not work as expected.

	CREATE RULE all_centers_del AS ON DELETE TO all_centers
	   DO INSTEAD (
	      DELETE FROM center_var WHERE center_id = OLD.center_id;
	      DELETE FROM center WHERE center_id = OLD.center_id;
	);

Whenever a row is deleted from the view,

	DELETE FROM all_centers WHERE center_id = 5

the meaning of following code replaces it:

	DELETE FROM center_var 
	WHERE center_id = 
	   (select from all_centers where center_id = 5);

	DELETE FROM center 
	WHERE center_id = 
	   (select center_id from all_centers where center_id = 5);

This delete rule fails on the second delete because when the row is deleted from the first table it no longer exists in the view. By the time it gets to the second statement deleting from the table center there is no matching row. One would think that the view's row would remain in existance for the duration of the rule, but it does not.

The solution for this dilemma depends on the relationships between the tables in the view. If there exists a foreign key relationship between the tables, then it is best to ensure that the foreign keys cascade on delete. Then when the master is deleted, the detail is also deleted. But if you use this, be very clear that the behavior reflects the relationship between the tables accurately.

This is what the foreign key relationship would look like if center is the master table and var_center is the detail table:

	CREATE TABLE center (
	   id	INTEGER PRIMARY KEY,
		...,
	)
	CREATE TABLE var_center (
	   id	INTEGER,
		...,
		PRIMARY KEY (id, var_id)
		FOREIGN KEY(id) REFERENCES center
	      ON DELETE CASCADE
	)

With the DELETE CASCADE defined thus, then the delete rule on the view all_centers would change. The second delete, the one from center_vars, would be removed.

The original view with the cascade delete, does beg the question of the relationship between the tables. If there is a master detail relationship and cascade delete is defined, then all of the other details for that master will be deleted. Is this what is wanted by the original deletion rule? Or is it possible that there are two different tables with the same single unique primary key? If there are two tables with a single unique primary key, then why is the data in two separate tables? The answers to these questions depend on the intent of the schema design.

Contributors: ferriol sylvain.ferriol at imag.fr, Tom Lane tgl at sss.pgh.pa.us, günter strubinsky strubinsky at acm.org elein at varlena.com

Creating Deferred Constraints
[SQL] special integrity constraints 22-Mar-2004

Formally, only Foreign Key Constraints are allowed to be deferrable. But you may have a situation with multiple statements in a transaction where the sum effect of the statements can meet a complex constraint but individually they cannot. This requires that the constraint be deferred.

Informally, it is possible to defer constraints other than foreign key constraints using an undocumented syntax. WARNING! It is likely that this work around will exist until deferred general constraints are supported, but that is not guaranteed.

For example, the user interface may allow you to set up a student field trip. In one transaction, you may want to add students and chaperones. When the data entry is ready to be committed the ratio of students to chaperones should be 5:1. (This example is somewhat contrived, but bear with me for the sake of seeing how deferred constraints are created.)

First you'll want a trigger function to do the ratio check, avoiding division by zero errors as we go.

create or replace function field_trip()
returns trigger as
'
DECLARE
   students integer;
   chaps integer;
BEGIN
   select into students count(*) from fieldtrip_students;
   select into chaps count(*) from fieldtrip_chaperones;

   if (chaps = 0 and students <> 0) or 
	   (chaps <> 0 and students/chaps < 5) then
    raise exception
      ''There are % students and % chaperones.
        There must be 1 chaperone for each 5 students.'',
        students, chaps;
   else
      return NEW;
   end if;
   return NEW;
END;
' language 'plpgsql';

Next we will want to put the constraint on the insert of students and the deletion of chaperones.

	CREATE CONSTRAINT TRIGGER sc
	AFTER INSERT 
	ON fieldtrip_students
	INITIALLY DEFERRED
	FOR EACH ROW EXECUTE PROCEDURE field_trip();

	CREATE CONSTRAINT TRIGGER sc
	AFTER DELETE
	ON fieldtrip_chaperones
	INITIALLY DEFERRED
	FOR EACH ROW EXECUTE PROCEDURE field_trip();

It is important to note that the trigger will not actually be called sc. To see what it is named, use \d in psql. You want to know what it is named so that you can drop the trigger if necessary. The double quotes around the trigger name are required.

	=# \d fieldtrip_students;
	Table "public.fieldtrip_students"
	 Column | Type | Modifiers 
	--------+------+-----------
	 sname  | text | not null
	Indexes:
	    "fieldtrip_students_pkey" primary key, btree (sname)
	Triggers:
	    "RI_ConstraintTrigger_3336248" AFTER INSERT 
	     ON fieldtrip_students 
	     DEFERRABLE INITIALLY DEFERRED 
	     FOR EACH ROW EXECUTE PROCEDURE field_trip()

	=# drop trigger "RI_ConstraintTrigger_3336249"
	   on fieldtrip_chaperones;
	DROP TRIGGER

To enter a set of students and chaperones, do it in a block transaction. When the commit is issued, the check with run and either rollback or commit your transaction.

The full code for this example is available for you to test and change.

Contributors: Erik Thiele erik at thiele-hydraulik.de, Achilleus Mantzios achill at matrix.gatewaynet.com, Stephan Szabo sszabo at megazone.bigpanda.com, Bruno Wolff III bruno at wolff.to, elein at varlena.com

Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!

Top