[Bucardo-general] Bucardo vs. rubyrep

Michelle Sullivan michelle at sorbs.net
Wed Sep 8 11:36:20 UTC 2010


Udo Rader wrote:
> Hi,
>
> don't be afraid: I am not going to ask you the evil question which one
> is better ;-)
>
> Instead I have a couple of specific questions:
>
> Our situation is this: we have been replicating two postgres 8.4
> instances for the past 2 months using rubyrep. The reasons to go with
> rubyrep were that it claims to be "battle tested", installation and
> configuration were indeed extremely simple, and the level of
> documentation seemed good enough.
>
> Now, after being in production for two months, we have found numerous
> issues with rubyrep (which I am not going to whine about on *this*
> mailing list :-) and those issues seem quite important to us, so
> we are considering a change towards bucardo.
>
> But beforehand, I have a couple of questions :-)
>
> #1 "huge updates"
> ------
> How does bucardo deal with "simple" UPDATE statements that change a
> huge number of records?
>
> Say I have a table "foobar" with >500K rows. What if I do this:
>
> UPDATE foobar SET last_verification=NOW() WHERE 1=1
>
> Will bucardo generate >500K entries in the "things2replicate" table or
> will it handle it "more intelligently"?
>
> UPDATEs like these have caused most of our headache, because first they
> completely stalled the replication process and then finally killed it
> (because the amount of memory required to deal with such a huge number
> of pending changes just does not seem to exist).
>   

I think this is an issue with all PostgreSQL trigger-based replication
systems (there is no real way around it in the current design - I patched
my copy to only do 200k transactions in one hit, but that wouldn't solve
your issue).

That said, Bucardo stores its 'things2replicate' in a bucardo_delta
table that holds only the primary keys of the rows that need
replicating. When using pushdelta replication (master-slave) rather than
swap (master-master), the memory requirement for my DB replicating 6m
rows in one chunk, with composite and compound primary keys, is only
1.9G of RAM, as the actual row copy is done using DELETE .. COPY.
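
To make that concrete, the delta table is roughly this shape (a
simplified sketch from memory, not the exact DDL Bucardo installs;
composite keys add further rowid columns, so check the bucardo schema on
your own master for the real thing):

  CREATE TABLE bucardo.bucardo_delta (
      tablename OID         NOT NULL,               -- oid of the replicated table
      rowid     TEXT        NOT NULL,               -- primary key of the changed row
      txntime   TIMESTAMPTZ NOT NULL DEFAULT now()  -- when the change was logged
  );

So Udo's 500K-row UPDATE lands as 500K narrow (oid, pk, timestamp) rows
here, not 500K copies of the row data - still one entry per changed row,
but a small one.
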
> #2 "senseless updates"
> ------
> Occasionally, some applications will "update" a record even though the
> data has not changed.
>
> IIRC, beginning with 8.2, a postgres trigger can determine whether the
> "new" row differs from the "old" row using a construct like this:
>
> IF NEW IS DISTINCT FROM OLD THEN ...
>
> Does bucardo utilize this functionality (thus reducing the amount of
> replicated data)?
>   

No it doesn't, thank $deity. If it did, I'd currently be copying half a
terabyte of data over a 100k high-latency link.
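
For anyone who does want that behaviour, a generic BEFORE trigger can
suppress no-op updates before any replication trigger ever fires. This
is a sketch, nothing Bucardo-specific; the function and trigger names
are made up, and foobar is just the example table from above:

  CREATE OR REPLACE FUNCTION suppress_noop_update() RETURNS trigger AS $$
  BEGIN
      -- Returning NULL from a BEFORE trigger cancels the row operation,
      -- so an UPDATE that changes nothing never reaches the delta table.
      IF NEW IS NOT DISTINCT FROM OLD THEN
          RETURN NULL;
      END IF;
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER foobar_suppress_noop
      BEFORE UPDATE ON foobar
      FOR EACH ROW EXECUTE PROCEDURE suppress_noop_update();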

> #3 connectivity problems over WAN
> ------
> Every now and then we see rubyrep die because of connectivity problems.
> How "stable" is bucardo when it comes to network problems between two
> WAN connected servers?
>
> Will it automagically resume the replication if the nodes have been
> disconnected for say one hour or so?
>   

Depends on your config.  Mine is set to auto-retry only 3 times at
1-minute intervals (Australia -> USA East Coast), and in 3 months only
one network outage has required manual intervention; that was when the
global routing 'had issues' for 20 minutes for one of my peers.

> #4 memory consumption
> ------
> Is there a way to roughly calculate the memory requirements for bucardo,
> especially for a situation where the nodes have been disconnected for
> some time and need to resynchronize?
>   

I already gave some ideas above.  Using the swap method (master-master),
the server has to grab the rows and compare/insert/update them, which
seems to consume far more memory: the same 6m rows grow the process
to 20G.
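
To illustrate the difference (a rough sketch of each sync type's shape,
not Bucardo's actual statements; assume foobar has a single integer key
column id, and the literal pk list stands in for whatever was gathered
from bucardo_delta on the master):

  -- pushdelta (master-slave), as run on the slave for each batch:
  DELETE FROM foobar WHERE id IN (101, 102, 103);  -- pks from the master's delta table
  COPY foobar FROM STDIN;  -- replacement rows streamed from the master

  -- swap (master-master): no streaming is possible, because both
  -- versions of every changed row must be fetched into the Bucardo
  -- process, compared, and the winner written back - so memory grows
  -- with the number of rows changed on both ends.
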
> #5 commercial support
> ------
> Well, the level of support for rubyrep is just ... "limited". So just in
> case we need specific support, is there any kind of commercial support
> for bucardo?
>   

I can't answer that, but I can tell you I hack the code (a little), and
Greg, who is the main developer, is responsive within 24 hours or so
(i.e. sometimes seconds, other times 'tomorrow').  When we found a bug
in the composite primary key storage the other day, a patch was out
within 48 hours.

Michelle
