varlena
varlena
PostgreSQL Training,
Consulting & Support
General Bits
By A. Elein Mustain

7-Mar-2005 Issue: 100

Archives | General Tidbits | Google General Bits | Docs | Castellano | PortuguÍs | Subscriptions | Notifications | | Prev

General Bits is a column loosely based on the PostgreSQL mailing list pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see www.PostgreSQL.org.

Template databases
What are template1 and template0? 5-Mar-2005

When you do an initdb two default databases are created: template0 and template1. Template0 is the pristine default database from which template1 is created. All other databases are created by duplicating template1 unless another database is specified in the createdb statement. You can access template1 but you cannot access template0.

When you have a common set of objects, such as languages, functions, operators, etc., you can create those objects in template1. All databases created after doing so will be created with those things in place. For example, doing createlang 'plperl' template1 will install plperl in template1. Then all databases created afterwards will have plperl available.

Beware of pilot errors. If you inadvertently load database specific functions or other schema information into template1 they will also be created in each new database.

There was a case when someone dropped and recreated a database and tried to reload the schema. There were many errors regarding duplicate objects. It turned out that, by mistake, the schema had first been loaded into template1. The newly created database already had the objects created and this caused the error messages.

Contributors: elein at varlena.com
Creating an operator
Operator Creation 05-Mar-2005

PostgreSQL enables you to create operators. An operator is a symbol that represents a function. Here are some examples of operators.

OperatorDescriptionExampleResult
+number addition number 5 + 1217
@ absolute value@-66
||text concatenate text'foo' || 'bar''foobar'
@point contained in or on circle '(2,2)'::point @ circle( '(1,4)',5.3 ) TRUE

Notice that the @ operator has different meanings in different contexts. The difference is that they are defined with different operand types.

Operators are defined with with a function name and the operand types and a few other options. There are usually two operand types, the left and the right, but there are also unary operators like '-' (negative) that operate on a single value.

In Issue number #84 we created a handy new operator ||+ to concatenate text values while replacing the NULLs with spaces. Both the left and right arguments are text types and the function that performs this operation is nonull_append_strings(text,text).

	CREATE OPERATOR ||+ (
		LEFTARG = TEXT,
		RIGHTARG = TEXT,
		PROCEDURE = nonull_append_strings
	);
The function looks like this:
	CREATE OR REPLACE FUNCTION nonull_append_strings (text, text )
	RETURNS text AS '
		SELECT CASE WHEN $1 IS NULL 
		THEN $2 WHEN $2 IS NULL 
		THEN $1
		ELSE $1 || '' '' || $2
	END;
	' LANGUAGE SQL IMMUTABLE;

The other parameters to the CREATE OPERATOR statement help the optimizer manage the expression better. You may define a COMMUTATOR operator which swaps the left and the right arguments and gives the same results. For example, > is the commutator for <. You may specify a NEGATOR operator as well. For example, the negator operator for = is <>.

You may also specify a left or a right sort operator named SORT1 and SORT2 respectively. These operators indicate an ordering. For numbers these are < and >. The less than and greater than operators, LTCMP and GTCMP respectively, may be specified as well. In most cases they are the same as the SORT1 and SORT2 operators.

MERGES, when set, will enable the optimizer to use a sorted (merge) join on the operators. It will use the SORT or CMP operators to perform the sort. MERGES is restricted to the case when both operands are the same type.

If you specify a RESTRICT procedure (notice this is a procedure and not an operator), it will return an estimation of the number of rows for which this operator will return true. This is only helpful for binary operators which return a boolean. The optimizer will use this information to determine the overall cost of the expression.

In general, people do not write RESTRICT procedures but use the ones provided by the system, if they specify them at all. The selectivity functions provided by the system are: eqsel, negsel, scalarltsel, scalargtsel. If your operator acts like an equal statement, use equal select, for example.

JOIN functions specified for the operator are similar to the RESTRICT in that they are functions, not usually written and helpful only for binary operators that return boolean. These are the system defined JOIN functions: eqjoinsel, negjoinsel, scalarltjoinsel, scalargtjoinsel, areajoinset (2D areas), positionjonsel (2D positions),

If operand is similar to equality and is a binary operator returning TRUE or FALSE, then you can probably specify HASHES. This will give the optimizer the option to hash the values when evaluating the expression.

Contributors: elein at varlena.com, Josh Berkus josh at agliodbs.com
DBI-Link: Joining other data sources to PostgreSQL
DBI-Link: Using Perl's DBI from within PostgreSQL 06-Mar-2005

DBI-Link is an interface which allows you to access data sources outside of PostgreSQL as if they were tables within PostgreSQL. The data sources available are all of the data sources which interface with the Perl module DBI.pm.

When DBI-Link is installed then you can select the data directly by passing a connection string and casting the returning row in detail. Alternatively, if you define a connection to a specific data source, then in a separate schema in your database, all of the foreign tables are defined as views automatically and can be used as if they were PostgreSQL tables. The target data sources can be selected, inserted into, updated and deleted as long as the DBI interface for that data source enables that capability.

Installing DBI-Link

Download DBI-Link from http://www.pgfoundry.org/projects/dbi-link into a directory within your installation that is readable by the postgresql superuser. The contrib directory is recommended in order to keep track of your extensions. Untar the files and install DBI-Link into the PostgreSQL database that you want to use it in.
	tar xjvf dbi-link-1.0.0.tar.bz2
	cd dbi-link-1.0.0
	psql -f dbi_link.sql db

This will create a schema in the database db called "dbi-link" which contains all of the functions and tables necessary to manage DBI-Link. To use DBI-Link add this schema to your search path. An easy way to add this schema to the search path is to now invoke:

	SELECT set_config(
		'search_path' -- what to set
		, 'dbi_link,' || current_setting('search_path') -- what to set it to
		, false -- for session (true for transaction only)
	);
You can use set_config() in any setting, but it is particularly useful with DBI-Link. It is a new function introduce in PostgreSQL 8.0.

Now you need to store the information regarding your foreign data source so it can be used by DBI-Link. This can be simple or elaborate depending on how elaborate the DBI interface is for the type of data source you are targeting. Read up on the Perl module DBI.pm for information regarding specific data sources.

The following function, when invoked does two things: it stores the data source connection information and, in a new schema that you name, it creates views which enable you to access the foreign data sources. The id of the connection is a serial which you can capture with currval('dbi_link.dbi_connection_data_source_id_seq') immediately after the function call to get the actual id of the connection.

SELECT make_accessor_functions(
  'dbi:Pg:dbname=pg_example;host=host.pg_example.com;port=5432' 
                                       -- connection info
, 'pg_example_user'                    -- user
, 'pg_example_passwd'                  -- password
, '{AutoCommit => 1, RaiseError => 1}' -- specific DBI configuration
, 'public'                             -- remote schema
, NULL                                 -- remote catalog
, 'pg_example'                         -- local new schema
);

In this function call we pass the connection string as defined by DBI, a user name and password, any DBI specific options, what schema and catalog to access in the foreign data source if any, and finally, what schema in which to put all of the views for the foreign data source.

Getting past the setup enables you do make a query like this:

	select p.user_name, sum(a.amount)
	from public.people p, pg_example.credit
	where p.user_name = a.user_name
	group by user_name;
Notice that the credit table is in some other data source, perhaps another database or a csv file--any data source which has a DBI interface. (The type of data source is specified in the connection string when you called make_accessor_functions()).

Contributors: david at fetter.org, elein at varlena.com


Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009

Top
Google
Search General Bits & varlena.com Search WWW