Bucardo/pgbench example
From Bucardo
This page describes the steps needed to replicate a sample database, created by the pgbench utility, with Bucardo. This will demonstrate simply master to slave behavior, using the pushdelta and fullcopy sync types.
Install Bucardo
The first step is to install Bucardo. Detailed instructions can be found on the installation page, but the quick steps are:
Install Perl modules
Bucardo requires the following Perl modules to be installed:
- DBD::Pg
- DBIx::Safe
Download and unpack Bucardo
The latest version of Bucardo can be found at the download page. Alternatively, you can pull the development version from git by doing:
git clone git://bucardo.org/bucardo.git
Either way, you should end up in a bucardo directory, and ready for the next step.
make and install
Run the following commands:
perl Makefile.PL make
The following step is optional but recommended:
make test
Finally, install as a user with appropriate rights. One way to do this is:
sudo make install
You should now have a global bucardo_ctl file available. Test that you can run it and that you are using the correct version:
bucardo_ctl --version
Create and populate the database
Bucardo needs a central database. The install option of bucardo_ctl will create and install this database for you. All you need to provide is the location of a Postgres instance you want to use, and a valid PID directory. For this example, we'll use the default values of no host, port 5432, and a user named 'Postgres'. We'll use the /tmp/bucardo directory as our piddir value.
mkdir /tmp/bucardo bucardo_ctl install --piddir=/tmp/bucardo
You will need to enter a "P" to tell it to proceed. If all goes well, you should see a message like this:
$ bucardo_ctl install --piddir=/tmp/bucardo This will install the bucardo database into an existing Postgres cluster. Postgres must have been compiled with Perl support, and you must connect as a superuser We will create a new superuser named 'bucardo', and make it the owner of a new database named 'bucardo' Current connection settings: 1. Host: <none> 2. Port: 5432 3. User: postgres 4. PID directory: /tmp/bucardo Enter a number to change it, P to proceed, or Q to quit: p Postgres version is: 8.4 Attempting to create and populate the bucardo database and schema Database creation is complete Connecting to database 'bucardo' as user 'bucardo' Updated configuration setting "piddir" Installation is now complete. If you see any unexpected errors above, please report them to bucardo-general@bucardo.org You should probably check over the configuration variables next, by running: bucardo_ctl show all Change any setting by using: bucardo_ctl set foo=bar
That's it! Time to setup our test databases. NOTE: In this example the source/master and target/slave databases reside within a single instance of he Postgres server, and this installation step is normally only required for the source/master node. In the real world where source/master and target/slave are hosted in separate Postgres servers, each slave node will need to have the role 'bucardo' manually created before proceeding to the next step.
Setup the pgbench databases
The pgbench utility that comes with Postgres can be used to create some simple test tables in an existing database. Let's create two databases, test1 (the master), and test2 (the slave).
createdb test1 createdb test2
Next, we'll install the pgbench files on each.
pgbench -i test1 pgbench -i test2
Now that we have some data, let's get Bucardo to replicate it.
Add the databases
Bucardo needs to know about each database it needs to talk to. The bucardo_ctl program does this with the add db option.
bucardo_ctl add db test1 bucardo_ctl add db test2
We've kept it simple for this example, but you generally will end up replicating databases with the same name, and thus should add an extra internal database name. Since we did not provide one, they default to the actual database names.
Add the tables
Bucardo also needs to know about any tables that it may be called on to replicate. Adding tables by the add table command does not actually start replicating them. In this case, we're going to use the handy add all tables feature. Tables are grouped together inside of Bucardo into herds, so we'll also place the newly added tables into a named herd. Finally, the history table has no primary key or unique index, so we cannot replicate it by using the pushdelta method, so we're going to exclude it from the alpha herd, using the -T switch, and add it in the next setup with the -t switch.
$ bucardo_ctl add all tables db=test1 -T history --herd=alpha --verbose New tables: public.accounts public.branches public.tellers New tables added: 3 Already added: 0 $ bucardo_ctl add all tables db=test1 -t history --herd=beta --verbose New tables: public.history New tables added: 1 Already added: 0
Add the syncs
A sync is a named replication event. Each sync has a source herd; because we created two herds above, we'll go ahead and create two syncs as well. One will be a pushdelta sync, the other will be a fullcopy sync.
$ bucardo_ctl add sync benchdelta source=alpha targetdb=test2 type=pushdelta Added sync "benchdelta"
$ bucardo_ctl add sync benchcopy source=beta targetdb=test2 type=fullcopy Added sync "benchcopy"
We are ready to kick off Bucardo at this point. Before we do, let's use the list options to bucardo_ctl to check everything out.
$ bucardo_ctl list herds Herd: alpha Members: public.branches, public.tellers, public.accounts Used in syncs: benchdelta Herd: beta Members: public.history Used in syncs: benchcopy
$ bucardo_ctl list syncs Sync: benchcopy (fullcopy ) beta => test2 (Active) Sync: benchdelta (pushdelta) alpha => test2 (Active)
$ bucardo_ctl list dbs Database: test1 Status: active Conn: psql -p 5432 -U bucardo -d test1 Database: test2 Status: active Conn: psql -p 5432 -U bucardo -d test2
$ bucardo_ctl list tables Table: public.accounts DB: test1 PK: aid (int4) Table: public.branches DB: test1 PK: bid (int4) Table: public.history DB: test1 PK: none Table: public.tellers DB: test1 PK: tid (int4)
Start Bucardo
The final step is to fire it up:
bucardo_ctl start
After a few seconds, the prompt will return. There will be a log file in the current directory called log.bucardo that you can look through. To disable the logfile and just rely on syslog use the --debugfile=0 argument. You can also verify that the Bucardo daemons are running by doing a:
ps -Afw | grep -i Bucardo
Test Replication
To verify that things are working properly, let's get some baseline counts:
$ psql -d test1 -At -c 'select count(*) from tellers' 10
$ psql -d test2 -At -c 'select count(*) from tellers' 10
$ psql -d test1 -c 'select * from tellers where tid = 1'
tid | bid | tbalance | filler
-----+-----+----------+--------
1 | 1 | 0 |
(1 row)
$ psql -d test2 -c 'select * from tellers where tid = 1'
tid | bid | tbalance | filler
-----+-----+----------+--------
1 | 1 | 0 |
(1 row)
Now let's make changes to that record, and verify that it gets propagated to the slave (test2)
$ psql -d test1 -c 'update tellers set bid=999 where tid = 1' UPDATE 1
$ psql -d test2 -c 'select * from tellers where tid = 1'
tid | bid | tbalance | filler
-----+-----+----------+--------
1 | 999 | 0 |
How about the history table, which has not primary key? We cannot track row by row changes, and don't want to copy the whole thing every time the table changes, so we've got to kick that sync manually when we want to change it:
$ psql -d -At test1 -c 'select count(*) from history' 0 $ psql -d -At test2 -c 'select count(*) from history' 0
$ pgbench -t3 test1
$ psql -At -d test1 -c 'select count(*) from history' 3 $ psql -At -d test2 -c 'select count(*) from history' 3
$ bucardo_ctl kick benchcopy
$ psql -At -d test1 -c 'select count(*) from history' 3 $ psql -At -d test2 -c 'select count(*) from history' 3
This ends the demonstration. Feel free to play around more. To stop Bucardo when done, just issue:
bucardo_ctl stop
As you experiment, you might also want to look at the syncs in more detail with:
bucardo_ctl status bucardo_ctl status benchdelta bucardo_ctl status benchcopy