[Bucardo-general] Replication stopped with Bucardo - no idea why

Torbjørn Kristoffersen tk at mezzanineware.com
Thu Sep 12 19:35:42 UTC 2013


We have about 50 tables replicated with Bucardo, with two masters running
PostgreSQL 9.3.

I don't understand what's going on now, and I can't see anything in the
logs that differs from what they showed when everything was working.

Apparently some table received a large batch of changes, perhaps 50000 new
or updated records, and since then Bucardo doesn't seem to replicate
anymore.

The logs look the same as they've always looked, e.g. as shown below.

Where should I start looking? I have the 'standard conflict' resolution set
to 'latest', and that has worked just fine. The only things that happened
recently were the 50000 updates and one case where we stopped Bucardo
temporarily to add a column (not a PK) to a single table on both databases,
then restarted Bucardo again.

This is a production system (we did our due diligence with Bucardo and it
passed our testing phase), so it's quite critical at this point that we get
this working again.

PS: before someone asks, the "Could not add to q ... Sending manual
notification" message is, as far as I know, "normal behavior" in Bucardo; it
occurred constantly from the start, even while Bucardo was replicating.

[Thu Sep 12 21:24:11 2013]  KID Target delta count for
public.file_area_records: 234
[Thu Sep 12 21:24:11 2013]  KID Source delta count for
public.file_area_tags: 0
[Thu Sep 12 21:24:11 2013]  MCP Got notice "bucardo_kick_sync_vrmsync" from
24999 on database saturn_system
[Thu Sep 12 21:24:11 2013]  MCP Sent a kick request to controller 15877 for
sync "vrmsync"
[Thu Sep 12 21:24:11 2013]  KID Target delta count for
public.file_area_tags: 0
[Thu Sep 12 21:24:11 2013]  KID Source delta count for
public.global_settings: 0
[Thu Sep 12 21:24:11 2013]  CTL Got notice "bucardo_ctl_kick_vrmsync" from
15862
[Thu Sep 12 21:24:11 2013]  CTL Could not add to q
sync=vrmsync,source=dacc_system,target=saturn_system,count=1. Sending
manual notification
[Thu Sep 12 21:24:11 2013]  KID Target delta count for
public.global_settings: 0
[Thu Sep 12 21:24:12 2013]  KID Source delta count for public.permissions: 0
[Thu Sep 12 21:24:12 2013]  KID Target delta count for public.permissions: 0
[Thu Sep 12 21:24:12 2013]  KID Source delta count for
public.permissions_user_access: 0
[Thu Sep 12 21:24:13 2013]  KID Target delta count for
public.permissions_user_access: 0
[Thu Sep 12 21:24:13 2013]  KID Source delta count for public.persons: 9683
[Thu Sep 12 21:24:38 2013]  CTL Got notice "bucardo_ctl_kick_vrmsync" from
15862
[Thu Sep 12 21:24:38 2013]  CTL Could not add to q
sync=vrmsync,source=dacc_system,target=saturn_system,count=1. Sending
manual notification
[Thu Sep 12 21:24:39 2013]  MCP Got notice "bucardo_kick_sync_vrmsync" from
26073 on database saturn_system
[Thu Sep 12 21:24:39 2013]  MCP Sent a kick request to controller 15877 for
sync "vrmsync"
[Thu Sep 12 21:24:40 2013]  CTL Got notice "bucardo_ctl_kick_vrmsync" from
15862
[Thu Sep 12 21:24:40 2013]  CTL Could not add to q
sync=vrmsync,source=dacc_system,target=saturn_system,count=1. Sending
manual notification
[Thu Sep 12 21:24:41 2013]  MCP Got notice "bucardo_kick_sync_vrmsync" from
24912 on database saturn_system
[Thu Sep 12 21:24:41 2013]  MCP Sent a kick request to controller 15877 for
sync "vrmsync"
[Thu Sep 12 21:24:41 2013]  CTL Got notice "bucardo_ctl_kick_vrmsync" from
15862
[Thu Sep 12 21:24:41 2013]  CTL Could not add to q
sync=vrmsync,source=dacc_system,target=saturn_system,count=1. Sending
manual notification
[Thu Sep 12 21:24:43 2013]  MCP Got notice "bucardo_kick_sync_vrmsync" from
24837 on database saturn_system
[Thu Sep 12 21:24:43 2013]  MCP Sent a kick request to controller 15877 for
sync "vrmsync"
[Thu Sep 12 21:24:43 2013]  CTL Got notice "bucardo_ctl_kick_vrmsync" from
15862
[Thu Sep 12 21:24:43 2013]  CTL Could not add to q
sync=vrmsync,source=dacc_system,target=saturn_system,count=1. Sending
manual notification

Also saw this one:

[Thu Sep 12 21:20:55 2013]  KID Target delta count for
public.activities_persons_users: 382900

That's quite a high delta, but not unusual for that specific table.
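
In case it helps anyone narrow this down, here is a rough sketch of how the
"delta count" lines could be tallied to see which tables carry the biggest
backlog. It assumes the real log file has one entry per line (the wrapping
above comes from the mail client). The function name largest_deltas and the
threshold parameter are made up for illustration; nothing here is part of
Bucardo itself.

```python
import re
from collections import defaultdict

# Matches log entries such as:
# [Thu Sep 12 21:24:13 2013]  KID Source delta count for public.persons: 9683
DELTA_RE = re.compile(r"KID (Source|Target) delta count for\s+(\S+):\s+(\d+)")


def largest_deltas(lines, threshold=1000):
    """Return {(side, table): highest delta count seen}, keeping only
    counts at or above `threshold` (a hypothetical cut-off)."""
    peaks = defaultdict(int)
    for line in lines:
        match = DELTA_RE.search(line)
        if match is None:
            continue  # not a delta-count entry (MCP/CTL notices etc.)
        side, table, count = match.group(1), match.group(2), int(match.group(3))
        if count >= threshold:
            key = (side, table)
            peaks[key] = max(peaks[key], count)
    return dict(peaks)
```

Passing an open file handle for the Bucardo log as `lines` should work, since
the function only iterates over lines. It reports what the log says and does
not explain why replication stalled.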


Regards,

Torbjorn