[Bucardo-general] Replacing onetimecopy with cloning for b5
Greg Sabino Mullane
greg at endpoint.com
Tue Oct 15 02:57:26 UTC 2013
I've been pondering the "onetimecopy" functionality for Bucardo 5,
and wanted to run my current idea past the list. I'm thinking of renaming
it to minimize confusion - my working name is "clone", but I'm open to
anything except "onetimecopy". :)
Rather than making this a special type of sync, or even associating it
directly with a sync, I would like clone to be able to copy data from
one (or more) databases to one (or more) databases at any time, allowing
one to perform the equivalent of a onetimecopy at any point in time, even
without having syncs defined. (Recap: onetimecopy does a complete copy of
the contents of tables, in other words, with no WHERE clause, and usually
with a TRUNCATE on the recipient end.)
So the syntax would be to simply call a clone, and specify the source
and targets. For an existing sync, you can simply use the sync name:
$ bucardo clone sync=foobar
You can also specify a time for this to take effect. The default is
"right away".
$ bucardo clone sync=foobar start="3 hours"
Rather than a sync, you can specify a relgroup and dbgroup:
$ bucardo clone relgroup=foobar dbgroup=baz
Or you can specify the tables manually:
$ bucardo clone tables=foo,bar,myschema.bazzp dbgroup=baz
Or specify everything manually:
$ bucardo clone tables=foo,bar dbs=alpha:source,beta:target
The nice thing about the above is that the only previous information
Bucardo needs is how to connect to those databases!
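To make the proposed argument forms concrete, here is a minimal sketch of how the key=value pairs could be split apart. This is not Bucardo code (Bucardo itself is written in Perl); the function name parse_clone_args and the returned dict layout are assumptions made purely for illustration.

```python
def parse_clone_args(args):
    """Parse hypothetical 'key=value' clone arguments into a dict.

    'tables' becomes a list of table names; 'dbs' becomes a mapping of
    database name -> role ('source' or 'target'). Other keys stay strings.
    """
    opts = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        opts[key] = value

    if "tables" in opts:
        opts["tables"] = opts["tables"].split(",")

    if "dbs" in opts:
        roles = {}
        for item in opts["dbs"].split(","):
            name, _, role = item.partition(":")
            if role not in ("source", "target"):
                raise ValueError(f"expected name:source or name:target, got {item!r}")
            roles[name] = role
        opts["dbs"] = roles

    return opts


# The "specify everything manually" form from above
print(parse_clone_args(["tables=foo,bar", "dbs=alpha:source,beta:target"]))
```

Note that the shell strips the quotes in start="3 hours", so the program would see the single argument start=3 hours, which this sketch keeps as a plain string.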
Under the hood, bucardo will create relgroups and dbgroups on
the fly as needed. Each clone event will be entered into a new
table, looking like this:
CREATE TABLE bucardo.clone (
  id         INTEGER     NOT NULL DEFAULT nextval('clone_id_seq'),
    CONSTRAINT clone_id_pk PRIMARY KEY (id),
  herd       TEXT            NULL,
    CONSTRAINT clone_herd_fk FOREIGN KEY (herd)
      REFERENCES bucardo.herd(name) ON UPDATE CASCADE ON DELETE CASCADE,
  dbgroup    TEXT        NOT NULL,
    CONSTRAINT clone_dbgroup_fk FOREIGN KEY (dbgroup)
      REFERENCES bucardo.dbgroup(name) ON UPDATE CASCADE ON DELETE CASCADE,
  startwhen  TIMESTAMPTZ NOT NULL DEFAULT now(),
  started    TIMESTAMPTZ     NULL,
  ended      TIMESTAMPTZ     NULL,
  cdate      TIMESTAMPTZ NOT NULL DEFAULT now()
);
The bucardo program will send a notification to the MCP, which will
know to check this table for a new entry. (It also checks at a regular
interval for queued entries.) If it finds one, it kicks off a new CTL/KID
to handle it. This KID is much simpler than others: it basically does a
truncate on all the targets, followed by a simple COPY from the source
to the targets.
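As a rough illustration of the check described above, the sketch below picks out clone entries that are due to run: those with no "started" value whose "startwhen" has passed. It works on plain dicts shaped like rows of the proposed table. The function name due_clones and the ordering by startwhen are assumptions, not part of the proposal.

```python
from datetime import datetime, timedelta, timezone


def due_clones(rows, now):
    """Return ids of clone rows that are due: not yet started, startwhen <= now."""
    due = [r for r in rows if r["started"] is None and r["startwhen"] <= now]
    # Assumption: handle the oldest scheduled entry first, ties broken by id
    due.sort(key=lambda r: (r["startwhen"], r["id"]))
    return [r["id"] for r in due]


now = datetime(2013, 10, 15, 3, 0, tzinfo=timezone.utc)
rows = [
    # Scheduled five minutes ago, not started: due
    {"id": 1, "startwhen": now - timedelta(minutes=5), "started": None},
    # Scheduled with start="3 hours": not due yet
    {"id": 2, "startwhen": now + timedelta(hours=3), "started": None},
    # Already picked up by a KID: skipped
    {"id": 3, "startwhen": now - timedelta(hours=1), "started": now - timedelta(minutes=50)},
]
print(due_clones(rows, now))  # prints [1]
```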
We need to keep the existing onetimecopy behavior regarding pre-populated
tables in place. Right now, you can specify that onetimecopy only
copy to tables that are empty. This is nice when you want to repopulate
1 of 3 targets in a sync, for example. However, the new syntax would
also allow this to take place without that flag.
For more than one source, we'll simply assume that the sources have no
overlapping rows: the sources will copy to each other without truncating,
and only the targets get truncated. So for example, with databases
A:source, B:source, and C:target, a clone command will do the following:
C: TRUNCATE TABLE foo;
A: COPY foo TO STDOUT => B: COPY foo FROM STDIN
A: COPY foo TO STDOUT => C: COPY foo FROM STDIN
B: COPY foo TO STDOUT => A: COPY foo FROM STDIN
B: COPY foo TO STDOUT => C: COPY foo FROM STDIN
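The ordering in the A/B/C example above can be sketched as a small planning function: truncate every target, then have each source copy to every other database. The function name clone_plan and the tuple layout are hypothetical, and the sketch only builds a list of steps rather than touching any database.

```python
def clone_plan(dbs):
    """Build an ordered list of steps for one table.

    dbs maps database name -> 'source' or 'target'.
    Steps are ('truncate', db) or ('copy', from_db, to_db).
    """
    sources = [name for name, role in dbs.items() if role == "source"]
    targets = [name for name, role in dbs.items() if role == "target"]
    if not sources:
        raise ValueError("at least one source database is required")

    # Only targets are truncated; sources keep their existing rows
    steps = [("truncate", t) for t in targets]

    # Each source copies to every other source and to every target
    for src in sources:
        for dest in sources + targets:
            if dest != src:
                steps.append(("copy", src, dest))
    return steps


for step in clone_plan({"A": "source", "B": "source", "C": "target"}):
    print(step)
```

With a single source this reduces to the simple case: truncate each target, then copy from the source to each target.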
Thoughts?
--
Greg Sabino Mullane greg at endpoint.com
End Point Corporation
PGP Key: 0x14964AC8