We talk a lot about the advantages of extensibility in PostgreSQL, but most of the time these advantages seem out of reach of novices and developers with no skill in C. Certainly you can do more with C functions and types than you can with simpler means, but there are a host of small, job-easing improvements you can make with nothing but SQL. This is one such.
Last week I found myself needing to concatenate ten TEXT fields in order to process them into a single full-text index, something like:
UPDATE companies_fti SET all_text = ( name || ' ' || summary || ' ' || description || ' ' || pr || ' ' || comments ) FROM companies WHERE companies.id = companies_fti.id;
Which is kind of annoying to type. Things got worse when I realized that almost all of those fields were nullable. They could have NULLs in them which would nullify the whole string, resulting in a big blank. Better use COALESCE:
UPDATE companies_fti SET all_text = ( name || COALESCE ( ' ' || summary, '') || COALESCE ( ' ' || description, '' ) || COALESCE( ' ' || pr, '') || COALESCE( ' ' || comments, '') ) FROM companies WHERE companies.id = companies_fti.id;
Pretty ugly, eh? The reality was worse; that's a simplified example with half the fields. If only I had an alternate version of the concatenation operator ( "||" ) which would allow me to add NULL text without nullifying the whole thing.
Wait a minute, this is PostgreSQL! I can have an alternate version of the concat operator. I flipped open my PostgreSQL documentation to the handy page on CREATE OPERATOR, and started.
First, I wanted to define a function which would take two text values, and return either one if one of them was null, or concatenate them with a space between if neither was null. If both were null, it should return null. Since it's an operator, I wanted to be as fast as possible, so I wrote it as a SQL function which could be "inlined" in later versions of PostgreSQL.
CREATE OR REPLACE FUNCTION nonull_append_strings ( text, text ) RETURNS text AS ' SELECT CASE WHEN $1 IS NULL THEN $2 WHEN $2 IS NULL THEN $1 ELSE $1 || '' '' || $2 END; ' LANGUAGE sql IMMUTABLE;
Then to create the operator. I decided on "||+" because it's easily memorable as "concatenate-plus". This is so easy you'll wonder why you didn't think of it: Then to create the operator. I decided on "||+" because it's easily memorable as "concatenate-plus". This is so easy you'll wonder why you didn't think of it:
CREATE OPERATOR ||+ ( LEFTARG = TEXT, RIGHTARG = TEXT, PROCEDURE = nonull_append_strings );
That's it! CREATE OPERATOR does, of course, accept a lot more parameters but none of them are needed for the simple result we want. All we really need is the left argument data type, the right argument data type, and the function to run. Now that query gets a lot neater:
UPDATE companies_fti SET all_text = ( name ||+ summary ||+ description ||+ pr ||+ comments ) FROM companies WHERE companies.id = companies_fti.id;
Elegant, isn't it? And easy, too. Have fun!
In Issue #83 we got PostGIS installed in a 7.4 database, loaded the PostGIS definitions:
psql pggis < postgis.sql psql pggis < spatial_ref_sys.sqland are ready to go. Much of what follows was gleaned from the Documentation from PostGIS itself.
PostGIS geometry types are stored differently that the regular built-in geometry types described in the PostgreSQL documentation. These types contain more meta data and need to be added in a special fashion. In this article, references to a geometry type will mean a reference to a PostGIS geometry type rather than the PostgreSQL geometric data types.
To add a geometry column, first create the table and then add the column using the AddGeometryColumn() function:
CREATE TABLE geo_objs ( id SERIAL, gid text ); SELECT ADDGeometryColumn('pggis','geo_objs','roads',-1,'LINESTRING',2);In this statement pggis is the database name, geo_objs is the table name and geo is the column name.
The -1 is the SRID--a (pseudo) foreign key reference into the SPATIAL_REF_SYSTEM to determine the coordinate system if any. If you don't know your coordinate system, don't have one and/or your geographer can't tell you, use -1.
LINESTRING is the gis type of the column. To specify a column of mixed gis types, use GEOMETRY instead of a specific type. Valid types are: POINT, LINESTRING, POLYGON, MULTPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION.
The 2 is the spatial dimension of the type--either 2 or 3. Although PostGIS is capable of handling schema, the AddGeometryColumn function and general usage has not changed to take advantage of this feature since schemas were introduced in PostgreSQL.
You can now insert your data using SQL. The function GeometryFromText() takes geometric data, including the SRID of the coordinate system (if any), and the name of the object. In the case of a LINESTRING, it takes two points.
INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (1,GeometryFromText('LINESTRING(191232 243118,191108 243242)',-1),'Jeff Rd'); INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (2,GeometryFromText('LINESTRING(189141 244158,189265 244817)',-1),'Geordie Rd'); INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (3,GeometryFromText('LINESTRING(192783 228138,192612 229814)',-1),'Paul St'); INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (4,GeometryFromText('LINESTRING(189412 252431,189631 259122)',-1),'Graeme Ave'); INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (5,GeometryFromText('LINESTRING(190131 224148,190871 228134)',-1),'Phil Tce'); INSERT INTO ROADS_GEOM (ID,GEOM,NAME ) VALUES (6,GeometryFromText('LINESTRING(198231 263418,198213 268322)',-1),'Dave Cres');Generally, spatial data is imported into the database from some well known data store. A very common data format is an ESRI shape file and a data loader for these is bundled with PostGIS. This command takes the shape file roads.shapes and loads it into the pggis database into table geo_objs. An optional SRID may be added with a -s SRID option.
shp2pgsql roads.shapes pggis geo_objs < roads.sqlOther options include dropping the table (-d), appending the table (-a) or creating a new table (-c). The output is an SQL file with then should be loaded into your database:
psql pggis < roads.sql;or do it all at once as:
shp2pgsql roads.shapes pggis geo_objs | psql pggis
The opposite of shp2pgsql is pgsql2shp which will dump geometric data stored with PostGIS into ESRI shape files. To dump what we just loaded use:
pgsql2shp pggis geo_objs < roads_out.shapesOptional connection information, dimension information and column names may be specified.
Paul Ramsey in his talk Tidbits Build your own MapQuest makes an excellent point:
Once you have your data loaded, then you can query on it using the spatial operators and functions in conjunction with ordinary SQL. This will be the topic of a further article in a future General Bits Issue.
I am pleased to report the basic instructions for the download, build and install of Slony-I release 1.0.2 were perfect.
Slony is fairly version independent, but is still required to be built for each PostgreSQL version and installation on each machine participating in the replication. The same technique would be applied if the installations were on different machines. The PostgreSQL version must be 7.3 or later.
On one machine, I run several versions of PostgreSQL each built with source. So I made two copies of Slony and built them with each source tree. This took less than a minute for both.
1. ./configure -with-pgsourcetree=/local/src/postgresql-version
Then comes the hard part, setting things up. This will be addressed in future issues of General Bits
(Rumor also has it that CVS HEAD for slony is making considerable strides toward ease of use, including the ability to build without the PostgreSQL source tree. Stay tuned!)
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009