[Bucardo-general] Replacing onetimecopy with cloning for b5

Tue Oct 15 02:57:26 UTC 2013

I've been pondering the "onetimecopy" functionality for Bucardo 5, 
and wanted to run my current idea past the list. I'm thinking of renaming 
it to minimize confusion - the working name is "clone", but I'm open to 
anything except "onetimecopy". :)

Rather than making this a special type of sync, or even associating it 
directly with a sync, I would like clone to be able to copy data from 
one (or more) databases to one (or more) databases at any time. That lets 
you perform the equivalent of a onetimecopy whenever you like, even 
without having syncs defined. (Recap: onetimecopy does a complete copy of 
the contents of tables, in other words a copy with no WHERE clause, usually 
preceded by a TRUNCATE on the recipient end.)

So the syntax would be to simply call a clone, and specify the source 
and targets. For an existing sync, you can simply use the sync name:

$ bucardo clone sync=foobar

You can also specify a time for this to take effect. The default is 
"right away".

$ bucardo clone sync=foobar start="3 hours"

Rather than a sync, you can specify a relgroup and dbgroup:

$ bucardo clone relgroup=foobar dbgroup=baz

Or you can specify the tables manually:

$ bucardo clone tables=foo,bar,myschema.bazzp dbgroup=baz

Or specify everything manually:

$ bucardo clone tables=foo,bar dbs=alpha:source,beta:target

The nice thing about the above is that the only previous information 
Bucardo needs is how to connect to those databases!
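
To make the argument forms above concrete, here is a minimal Python sketch of how the key=value pairs could be broken down. It is only an illustration: the bucardo program itself is written in Perl, and the function name `parse_clone_args` and the shape of the returned dict are assumptions, not part of the proposal.

```python
def parse_clone_args(args):
    """Turn ['tables=foo,bar', 'dbs=alpha:source,beta:target'] into a dict.

    Hypothetical helper for illustration only; the real bucardo program
    may parse its arguments quite differently.
    """
    opts = {}
    for arg in args:
        # Split on the first "=" only, so values may contain spaces or "=".
        key, sep, value = arg.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {arg!r}")
        opts[key] = value

    # dbs=alpha:source,beta:target becomes {'alpha': 'source', 'beta': 'target'}
    dbs = {}
    if "dbs" in opts:
        for item in opts["dbs"].split(","):
            name, _, role = item.partition(":")
            if role not in ("source", "target"):
                raise ValueError(f"database {name!r} needs a :source or :target role")
            dbs[name] = role

    return {
        "sync": opts.get("sync"),
        "relgroup": opts.get("relgroup"),
        "dbgroup": opts.get("dbgroup"),
        "tables": opts["tables"].split(",") if "tables" in opts else [],
        "dbs": dbs,
        "start": opts.get("start"),  # None stands for the default, "right away"
    }
```

With this sketch, the last command above would yield two tables (foo and bar) and a dbs mapping of alpha to "source" and beta to "target".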

Under the hood, bucardo will create relgroups and dbgroups on 
the fly as needed. Each clone event will be entered into a new 
table, looking like this:

CREATE TABLE bucardo.clone (
  id        INTEGER     NOT NULL DEFAULT nextval('clone_id_seq'),
              CONSTRAINT clone_id_pk PRIMARY KEY (id),
  herd      TEXT            NULL,
              CONSTRAINT clone_herd_fk FOREIGN KEY (herd) REFERENCES bucardo.herd(name) ON UPDATE CASCADE ON DELETE CASCADE,
  dbgroup   TEXT        NOT NULL,
              CONSTRAINT clone_dbgroup_fk FOREIGN KEY (dbgroup) REFERENCES bucardo.dbgroup(name) ON UPDATE CASCADE ON DELETE CASCADE,
  startwhen TIMESTAMPTZ NOT NULL DEFAULT now(),
  started   TIMESTAMPTZ     NULL,
  ended     TIMESTAMPTZ     NULL,
  cdate     TIMESTAMPTZ NOT NULL DEFAULT now()
);

The bucardo program will send a notification to the MCP, which will 
know to check this table for a new entry. (It also checks at a regular 
interval for queued entries.) If it finds one, it kicks off a new CTL/KID 
to handle it. This KID is much simpler than others: it basically does a 
truncate on all the targets, followed by a simple COPY from the source 
to the targets.
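
As a rough illustration of the check the MCP might make against the clone table, the Python sketch below picks out entries that are ready to run. The rule it uses (started is NULL and startwhen has passed) is an assumption drawn from the column names, not something the proposal spells out, and `due_clones` is a hypothetical name. Rows are plain dicts standing in for table rows.

```python
from datetime import datetime, timedelta, timezone


def due_clones(rows, now):
    """Return the ids of clone entries that are ready to be handed to a KID.

    Assumption: an entry is due when it has not started yet (started is None)
    and its startwhen is not in the future. Oldest startwhen comes first.
    """
    pending = [
        row for row in rows
        if row["started"] is None and row["startwhen"] <= now
    ]
    pending.sort(key=lambda row: row["startwhen"])
    return [row["id"] for row in pending]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    rows = [
        {"id": 1, "startwhen": now - timedelta(minutes=5), "started": None},
        {"id": 2, "startwhen": now + timedelta(hours=3), "started": None},
    ]
    print(due_clones(rows, now))
```

An entry created with start="3 hours" would simply stay in the table until a later periodic check finds it due.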

We need to keep the existing onetimecopy behavior regarding pre-populated 
tables in place. Right now, you can specify that onetimecopy only 
copy to tables that are empty. This is nice when you want to repopulate 
1 of 3 targets in a sync, for example. However, the new syntax would 
also allow this to take place without that flag, since you can name 
just the one target you want repopulated.
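
The "empty tables only" behavior can be sketched in a few lines of Python. This is an illustration under assumptions: `targets_to_populate` is a hypothetical name, and the row counts would in practice come from querying each target.

```python
def targets_to_populate(row_counts, empty_only):
    """Pick which target databases should receive the copy.

    row_counts maps target database name -> number of rows currently in
    the table there (hypothetical input). With empty_only set, targets
    that already hold rows are skipped entirely.
    """
    if not empty_only:
        return sorted(row_counts)
    return sorted(db for db, count in row_counts.items() if count == 0)
```

For the "1 of 3 targets" case, only the target whose table is empty would be returned when empty_only is set.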

For more than one source, we'll simply assume that the sources' rows do 
not overlap: the sources will copy to each other, but will not be 
truncated. So for example, with databases A:source, B:source, and 
C:target, a clone command will do the following:

C: TRUNCATE TABLE foo;
A: COPY foo TO STDOUT => B: COPY foo FROM STDIN
A: COPY foo TO STDOUT => C: COPY foo FROM STDIN
B: COPY foo TO STDOUT => A: COPY foo FROM STDIN
B: COPY foo TO STDOUT => C: COPY foo FROM STDIN
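
The step list above can be generated mechanically. Here is a minimal Python sketch, assuming the roles arrive as a name-to-role mapping; `clone_plan` is a hypothetical name, and the copy pairs are written with PostgreSQL's COPY ... TO STDOUT and COPY ... FROM STDIN forms. It only lists the steps and does not connect to any database.

```python
def clone_plan(table, dbs):
    """List the steps of a clone for one table.

    dbs maps database name -> 'source' or 'target' (insertion order is
    kept). Targets are truncated; sources are never truncated. Each
    source then copies to every other database.
    """
    sources = [name for name, role in dbs.items() if role == "source"]
    targets = [name for name, role in dbs.items() if role == "target"]
    if not sources:
        raise ValueError("at least one source database is required")

    steps = [f"{target}: TRUNCATE TABLE {table};" for target in targets]
    for src in sources:
        for dest in sources + targets:
            if dest != src:
                steps.append(
                    f"{src}: COPY {table} TO STDOUT => "
                    f"{dest}: COPY {table} FROM STDIN"
                )
    return steps


if __name__ == "__main__":
    for step in clone_plan("foo", {"A": "source", "B": "source", "C": "target"}):
        print(step)
```

For A:source, B:source, and C:target this produces one TRUNCATE on C followed by the four copies, in the same order as the list above. The sketch says nothing about transaction snapshots, which would matter if the copies run one after another.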

Thoughts?

-- 
Greg Sabino Mullane greg at endpoint.com
End Point Corporation
PGP Key: 0x14964AC8