This is a step by step reading of instructions and applying them to replicate a small database called gb. The plan is to replicate from a PostgreSQL 7.4 installation to a PostgreSQL 8.0 installation in order to facilitate upgrading the database.
Slonik is the command line interface to define the replication system. Slonik scripts will be written to create, update and change the replication scheme for gb. Tools are becoming available to simplify the creation of replication systems with Slony-I, however, this paper will describe the underlying Slonik required.
About the Databasegb is a simple eight table database containing General Bits Issues and Articles. The database is normalized and all tables have natural primary keys.
Slonik is a command line interface for Slony-I. It can connect to the various databases involved in the replication scheme to perform specific actions. It is an independent helper of Slony-I and of PostgreSQL.
The first commands for most slonik scripts is the identity of a group of databases and servers and the connection parameters for accessing each database in the group. This is followed by the action you wish to process. Slonik scripts are usually written embedded in shell scripts like this example. The commands to identify the cluster and node connection info are covered in the next section.
#!/bin/sh slonik << _END_ cluster name = 'cluster_name'; node 1 admin conninfo = 'dbname=db host=localhost port=5432 user=slony'; node 2 admin conninfo = 'dbname=db host=localhost port=5432 user=slony'; ...additional nodes... ...slonik commands... _END_The full set of Slonik Commands are documented along with Slony-I.
Node NetworksA node is the combination of a database and one slon process "belonging" to that database. A cluster is a set of nodes cooperating in a replication scheme.
It is suggested that all nodes have a path to all other nodes. With only two nodes this is simple to describe. With more nodes you must be sure to include a path to all other nodes whether or not they will be the path you expect replication to take.
Our first Slonik script is to initialize the cluster, define each node and to define the paths from each node to every other node. Notice that each node has an identifying number. The cluster is defined on the first node with init cluster and each subsequent node is defined using store node. The user is the slony superuser, which in this case is 'postgres'.
The path is defined by designating one node as a server and the other as a client for messaging. The terminology does not relate to the replicator/replica relationship but rather references the possible network path. The connection information in each command belongs to the server node. The client's slony daemon will connect to the server node using that connection information.
#!/bin/sh # gb_init.sh slonik << _END_ # # Define cluster namespace and node connection information # cluster name = gb; node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres'; node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres'; echo 'Cluster defined, nodes identified'; # # Initialize the cluster and create the second node # init cluster (id=1, comment='Node 1'); store node (id=2, comment='Node 2'); echo 'Nodes defined'; # # create paths # store path (server=1, client=2, conninfo='dbname=gb host=localhost port=5434 user=postgres'); store path (server=2, client=1, conninfo='dbname=gb host=localhost port=5430 user=postgres'); _END_
Typos and syntax errors can be isolated by using Slonik's echo command.
Listening for EventsEvents will occur throughout the cluster and you must tell Slony-I what nodes listen to what nodes to receive these Events. The Events may be replication information or administrative information that requires propagation throughout the cluster.
In the simple case of two nodes, they listen to each other. In any case, all nodes should be able listen to all other nodes The paths were defined so that this would be possible.
Specifying the Origin says which node the Receiver is listening for. The Origin of an event may or may not provide the event to the Receiver, however, the default is to do so. It is possible for Node 3 to listen for events initiated on Node 1 and have those events provided by Node 2 (which, one assumes, is also listening for events from Node 1).
In our case, we are having both nodes listen for events on the other and having the events provided by the origin node.
#!/bin/sh # gb_listen.sh slonik << _END_ # # Define cluster namespace and node connection information # cluster name = gb; node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres'; node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres'; # # Listen # store listen (origin=1, receiver=2, provider=1); store listen (origin=2, receiver=1, provider=2); _END_
Starting the slon processesOnce the nodes are set up to listen to each other for events, slon can be started. A slon processes must be started for each database participating in the replication. Give slon a chance to start itself and its threads.
The output in our example is sent to two logs which can be tailed to watch the activity and look for errors.
slon is essentially an event and messaging system. The events involve the replication of data and administrative information to facilitate replication of data.
#!/bin/sh # gb_startslon.sh # # Start Slony for each node # slon gb "dbname=gb user=postgres port=5434 host=localhost" > slon34.out 2>&1 & slon gb "dbname=gb user=postgres port=5430 host=localhost" > slon30.out 2>&1 & sleep 40;
Creating SetsReplication in Slony-I is done by subscribing to Sets of tables. The Set usually should be comprised of the group of related tables for an application or an entire schema.
First the set must be defined and the origin of data designated for that set. Then the tables may be added to the set by designating the set, the origin of the set, an ID for the table, the fully qualified table name, and optionally an alternate key. The origin of the set should be entered as it was in the set creation (redundantly).
The ID number of the set should be set carefully. It also designates the order in which the tables will be locked. This means that master tables should have IDs lower that detail tables. The relationship hierarchy of your schema should help you determine the order of the numbers. If the ordering of the table ids is backward or incorrect, there may be problems with deadlocking the slon process or PostgreSQL.
In our example, the issues table is the topmost master followed by
articles. Each of the other tables are lookup tables for those.
so the tables are numbered accordingly.
A set should be created just once and not have any active subscribers. In order to add tables to replication, a new set must be created and the two sets may be later combined with the MERGE SET Slonik command.
#!/bin/sh #gb_createset.sh slonik << _END_ # # Define cluster namespace and node connection information # cluster name = gb; node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres'; node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres'; create set (id=1, origin=1, comment='gb tables'); echo 'Set created'; set add table (set id=1, origin=1, id=1, full qualified name = 'public.issues', comment='Issues table'); set add table (set id=1, origin=1, id=2, full qualified name = 'public.articles', comment='Articles table'); set add table (set id=1, origin=1, id=3, full qualified name = 'public.arttext', comment='Article Text table'); set add table (set id=1, origin=1, id=4, full qualified name = 'public.sqlbase', comment='Full SQL keywords'); set add table (set id=1, origin=1, id=5, full qualified name = 'public.whoall', comment='All contributors'); set add table (set id=1, origin=1, id=6, full qualified name = 'public.contrib', comment='Contributors by Article'); set add table (set id=1, origin=1, id=7, full qualified name = 'public.keywords', comment='Keywords by Article'); set add table (set id=1, origin=1, id=8, full qualified name = 'public.sqlkw', comment='Subset of SQL keywords'); echo 'set 1 of gb tables created'; _END_
Subscribing to SetsOnce the set is created, it may be subscribed to. To subscribe to a set, identify the set, the node which can provide the set, the receiver of the set and whether or not the receiver of this set should be able to forward this set to another node. In our case, the origin node of the set is the same as the provider of the set, but for cascading subscriptions this is not necessarily the case. And even though there are only two nodes in this replication system, we are saying that the receiving node may forward the set. This is for the case of switching masters and in case we add other nodes to the cluster. Here, Node 2 is subscribing to Set 1 Originating on Node 1 and provided by Node 1.
#!/bin/sh # gb_subscribeset.sh slonik << _END_ # # Define cluster namespace and node connection information # cluster name = gb; node 1 admin conninfo = 'dbname=gb host=localhost port=5434 user=postgres'; node 2 admin conninfo = 'dbname=gb host=localhost port=5430 user=postgres'; subscribe set (id=1, provider=1, receiver=2, forward=yes); echo 'set 1 of gb tables subscribed by node 2'; _END_
Of course all of these scripts are assumed to be typed correctly and run as intended. Yeah, right.
UndoingThe simplest way of undoing is to start fresh. There are subtler ways of correcting mistakes by updating the underlying tables, however, these are not recommended unless you have intimate knowledge of the underlying tables.
ReferencesThis article is based on the following resources:
The scripts for this article are also available individually.
Future Slony-I articles will address failover and promotion of masters and
set up for more than two nodes.
Comments and Corrections are welcome. Suggestions and contributions of items are also welcome. Send them in!
Copyright A. Elein Mustain 2003, 2004, 2005, 2006, 2007, 2008, 2009