[Bucardo-general] Conflict strategy 'bucardo_latest' and high latency
Hans van der Riet
hans at electude.nl
Sat Mar 1 00:19:54 UTC 2014
I have been performance testing 4.99.11 over high latency connections
between several servers. The sync contains over 70 tables. On the nodes
a simple test script continuously modifies random rows in randomly
selected tables. The performance is surprisingly good, until a conflict
arises. The default 'bucardo_latest' conflict strategy is slow and takes
a few minutes to pick a winner. Typically this results in more conflicts
in subsequent sync runs, until Bucardo can't serialize due to concurrent
updates on one of the nodes. All sync runs from then on fail
(serialization errors) until the test scripts are turned off.
Digging into this, I found that the conflict strategy works quite
differently from what I expected: it does not pick a winner for each
conflicting row; instead it picks one winning database for all conflicts,
based on which database has the latest update. This can produce some
quite unexpected results. For example, when you run:
psql -h node1 -c "update t set node='node1' where id=1" test
psql -h node2 -c "update t set node='node2' where id=1" test
psql -h node3 -c "update t set node='node3' where id=2" test
and these updates are processed in one sync run, the result looks like
this, once the conflict has been resolved:
 id | node
----+-----------
  1 | old value
  2 | node3
Since node3 made the last update, it is the winner for all conflicting
rows. Row id 1 is conflicting (node1 and node2), so the old value from
node3 is restored. In the (theoretical) scenario of the performance test
above, this produces very undesirable results, since syncing will fail
for a long time with lots of conflicts.
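The difference between the two behaviours can be sketched in a few lines of Python. This is only an in-memory model of the example above, not Bucardo's code: the dictionaries, the integer timestamps and the function names are all made up for illustration, and row 2 is assumed to hold 'old value' on node1 and node2.

```python
from collections import defaultdict

# Hypothetical model of the three-node example: current row values per node,
# and a delta log per node (row id -> time of the last local change).
rows = {
    "node1": {1: "node1", 2: "old value"},
    "node2": {1: "node2", 2: "old value"},
    "node3": {1: "old value", 2: "node3"},
}
deltas = {
    "node1": {1: 1},   # node1 updated id 1 first
    "node2": {1: 2},   # node2 updated id 1 second
    "node3": {2: 3},   # node3 updated id 2 last
}


def changed_by(deltas):
    """Map each changed row id to the list of nodes that changed it."""
    changers = defaultdict(list)
    for node, log in deltas.items():
        for row_id in log:
            changers[row_id].append(node)
    return changers


def resolve_database_level(rows, deltas):
    """One winning database (the one with the latest change overall)
    supplies every conflicting row, as observed in the test above."""
    winner = max(deltas, key=lambda n: max(deltas[n].values(), default=0))
    result = {}
    for row_id, nodes in changed_by(deltas).items():
        source = winner if len(nodes) > 1 else nodes[0]
        result[row_id] = rows[source][row_id]
    return result


def resolve_per_row(rows, deltas):
    """The behaviour I expected: each conflicting row is won by the node
    that changed that particular row last."""
    result = {}
    for row_id, nodes in changed_by(deltas).items():
        source = max(nodes, key=lambda n: deltas[n][row_id])
        result[row_id] = rows[source][row_id]
    return result
```

In this model, resolve_database_level(rows, deltas) restores the old value for id 1, while resolve_per_row(rows, deltas) keeps the update from node2.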
The reason it takes so long to pick a winner is that Bucardo queries all
delta tables individually on all nodes. When latency is high this will
take a while.
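A rough back-of-the-envelope model of that cost, assuming the queries run one after another and each costs one network round trip. The node count and round-trip time below are made-up example values, not measurements from my test setup.

```python
def lookup_time_ms(n_nodes, queries_per_node, rtt_ms):
    """Total wait if every query is a separate, sequential round trip."""
    return n_nodes * queries_per_node * rtt_ms


# Hypothetical figures: 3 nodes, 70 delta tables, 200 ms round trip.
per_table = lookup_time_ms(n_nodes=3, queries_per_node=70, rtt_ms=200)
per_node = lookup_time_ms(n_nodes=3, queries_per_node=1, rtt_ms=200)
print(per_table, per_node)
```

The point is only that the wait grows with nodes times tables, so anything that reduces the number of queries per node pays off directly on a slow link.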
Remarks / questions:
1. Maybe some side effects of bucardo_latest can be avoided, e.g. by
ignoring databases that are not part of the conflict. The documentation /
man page suggests behaviour different from what I observed.
2. Performance of the current bucardo_latest strategy can be improved
dramatically if each node is queried only once, using a UNION across
all delta tables to find MAX(txntime).
3. Are the bucardo_source, bucardo_target, bucardo_skip and
bucardo_random strategies still working in 4.99?
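To illustrate remark 2, here is a minimal sketch of how such a single query per node could be assembled. The helper name and the delta table names in the usage line are hypothetical; only the txntime column comes from what I saw in the delta tables. The names are interpolated without quoting, so this assumes they come from a trusted source.

```python
def build_latest_txntime_query(delta_tables):
    """Build one SQL statement that returns the newest txntime across
    all given delta tables, so a node is queried only once."""
    if not delta_tables:
        raise ValueError("need at least one delta table")
    # One small aggregate per delta table ...
    selects = [
        f"  SELECT MAX(txntime) AS txntime FROM {table}"
        for table in delta_tables
    ]
    # ... combined with UNION ALL, and an outer MAX over the results.
    inner = "\n  UNION ALL\n".join(selects)
    return f"SELECT MAX(txntime) AS latest FROM (\n{inner}\n) AS per_table"


# Hypothetical delta table names, for illustration only.
query = build_latest_txntime_query(
    ["bucardo.delta_public_t", "bucardo.delta_public_u"]
)
print(query)
```

UNION ALL is used rather than plain UNION because duplicates do not matter for a MAX and it avoids an unnecessary de-duplication step.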
Hans van der Riet