[Bucardo-general] support for loss of connectivity to slave server/slave db going down
Greg Sabino Mullane
greg at endpoint.com
Wed Feb 3 16:06:40 UTC 2010
On Wed, Jan 20, 2010 at 10:50:06AM -0800, Omar Mehmood wrote:
> In the test scenario, when I restart the slave server and the bucardo
> processes on each master, approximately 300k un-replicated rows
> (approximately 150k from each server) are correctly replicated over,
> but the count doesn't match up exactly-- there are a small number of
> rows (2395 in this example) that didn't get replicated.
This is very hard to debug from here. Can you simplify it to a test
case? Is there anything about those rows that looks different? Do their
times (bucardo_delta.txntime) correspond to anything?
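One rough way to check the timing question, as a sketch only: export the
txntime values of the missing rows and count them per minute, so that any
cluster around the slave shutdown or the Bucardo restart stands out. The
function name and the sample values below are made up for illustration;
how you pull the values out of bucardo_delta is up to you.

```python
from collections import Counter
from datetime import datetime


def bucket_by_minute(txntimes):
    """Count timestamps per minute so clusters of rows stand out."""
    # Truncate each timestamp to the minute, then tally the buckets.
    counts = Counter(t.replace(second=0, microsecond=0) for t in txntimes)
    # Return (minute, count) pairs in chronological order.
    return sorted(counts.items())


# Hypothetical sample values, not taken from the real data in this thread.
sample = [
    datetime(2010, 1, 20, 12, 0, 5),
    datetime(2010, 1, 20, 12, 0, 40),
    datetime(2010, 1, 20, 12, 1, 10),
]

for minute, n in bucket_by_minute(sample):
    print(minute.isoformat(), n)
```

If most of the missing rows land in one or two buckets, compare those
minutes against the timestamps in the Bucardo log.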
> > Bucardo is set up to restart itself by default, so if it's not, that's a
> > bug we need to address.
> In the test scenario, if I shut down the slave server DBMS, the
> Bucardo processes die (the Bucardo setups are located on the master servers).
> [Wed Jan 20 18:28:21 2010] MCP Warning: Killed (line 890): Ping failed for remote database server
> [Wed Jan 20 18:28:21 2010] MCP Database problem, will respawn after a short sleep: 15
> [Wed Jan 20 18:28:36 2010] MCP Respawn attempt: /usr/local/bin/bucardo_ctl start "Attempting automatic respawn after MCP death"
This looks normal. Does Bucardo fail to restart after that line?
Greg Sabino Mullane greg at endpoint.com
End Point Corporation
PGP Key: 0x14964AC8