[Bucardo-general] Activate/deactivate weirdness

Chris Keane chris.keane at zoomius.com
Wed Apr 29 20:48:00 UTC 2015


OK, we normally have 4 syncs per client running, 3 of which are more active
and so will usually stall when the remote network hiccups, their IP address
changes, etc.
To reactivate stalled syncs, this is the process we're using now:

1. Bucardo must be running
2. bucardo deactivate all (or bucardo deactivate sync1 sync2 sync3)
3. bucardo activate all  (allow validation to complete on all syncs,
however a kid is never started for them)
4. bucardo stop
5. bucardo start

On the bucardo start, the syncs that were (re)activated in step 3 will
validate again, but this time a kid will also start for them. I'm not sure
why a kid is never started for previously-stalled syncs during step 3; I
assume some kind of state in memory needs to be cleared, but I haven't dug
into it.
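
A minimal sketch of steps 2 to 5 above as a script, assuming the bucardo
command is on the PATH and Bucardo is already running (step 1). The function
names, the settle_seconds pause, and the dry_run flag are made up for
illustration and are not part of Bucardo. When sync names are given, the
sketch activates those same names rather than "all".

```python
import subprocess
import time


def reactivation_commands(syncs=None):
    """Return the bucardo commands for steps 2-5, in order."""
    # With no sync names given, fall back to "all" as in the steps above.
    targets = list(syncs) if syncs else ["all"]
    return [
        ["bucardo", "deactivate", *targets],
        ["bucardo", "activate", *targets],
        ["bucardo", "stop"],
        ["bucardo", "start"],
    ]


def reactivate(syncs=None, settle_seconds=30, dry_run=True):
    """Print the sequence, or run it when dry_run is False."""
    for cmd in reactivation_commands(syncs):
        if dry_run:
            print(" ".join(cmd))
            continue
        subprocess.run(cmd, check=True)
        # Assumed pause so validation or shutdown can finish before the
        # next step; the right value depends on the installation.
        time.sleep(settle_seconds)


if __name__ == "__main__":
    reactivate(["sync1", "sync2", "sync3"])
```

Run as-is it only prints the four commands; pass dry_run=False to execute
them.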

We encountered another issue this last weekend in a test system using the
very latest cloned HEAD.... rows started going missing out of the database.
We have a foreign key constraint on a table with ON DELETE CASCADE
(specifically, the Interchange orderline table has a foreign key constraint
orderline.order_number > transactions.order_number). Orderline rows started
disappearing before our eyes. I strongly suspect that
session_replication_role either wasn't being set to replica or for some
reason wasn't working, and the ON DELETE CASCADE was deleting rows if the
corresponding row in transactions hadn't arrived yet. Certainly, after I
altered orderline to drop the FK constraint, rows stopped disappearing. It
gave me a few WTF moments!
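
A rough sketch for finding which tables are exposed to this kind of loss: it
lists every foreign key declared ON DELETE CASCADE by reading the PostgreSQL
pg_constraint catalog. It does not check session_replication_role, so it
cannot confirm the suspicion above. The function name is hypothetical, and
conn is assumed to be any DB-API connection to the target database (for
example one from psycopg2.connect).

```python
CASCADE_FK_SQL = """
SELECT conname,
       conrelid::regclass::text  AS child_table,
       confrelid::regclass::text AS parent_table
FROM pg_constraint
WHERE contype = 'f'
  AND confdeltype = 'c'
ORDER BY 2, 1
"""


def cascade_foreign_keys(conn):
    """Return (constraint, child_table, parent_table) rows for every
    foreign key with ON DELETE CASCADE in the connected database."""
    cur = conn.cursor()
    try:
        cur.execute(CASCADE_FK_SQL)
        return cur.fetchall()
    finally:
        cur.close()
```

Any child table it reports can lose rows when a parent row is deleted on the
target while the FK triggers are still firing.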

Chris.


On Fri, Apr 3, 2015 at 6:07 PM, Chris Keane <chris.keane at zoomius.com> wrote:

>
> Hahahaha! Haha! Hahaha!
> If I could build a replicatable test case I'd likely have poked through the
> code to find the problem myself and be suggesting a fix instead of showing
> all the bumps on my head.
>
> I'll be less thrashed in a few weeks and I'll see if I can figure it out
> then.
>
> But in summary:
>
> Database goes away inelegantly (like a network connection drops out or
> similar)
> Sync is marked as stalled
> Kicking the sync reports "Cannot kick an inactive sync"
> bucardo activate sync
> Kicking the sync reports "Cannot kick an inactive sync"
> bucardo deactivate sync
> bucardo activate sync (sometimes then validates the sync according to the log
> but never progresses)
> Kicking the sync reports "Cannot kick an inactive sync"
> Stop bucardo
> bucardo activate sync
> start bucardo
> sometimes will pick up the sync, validate and start it, and sometimes
> ignore it.
>
> Chris.
>
>
>
> On Fri, Apr 3, 2015 at 5:59 PM, Greg Sabino Mullane <greg at endpoint.com>
> wrote:
>
>> On Sat, Mar 28, 2015 at 03:07:59PM -0700, Chris Keane wrote:
>> ...
>> > Eventually the sync will reactivate but I'm completely stumped since some
>> > combinations work sometimes and not other times. In fact I'm sitting here
>> > looking at three syncs that I just can't get restarted! Well, after poking
>> > them continually for the last several hours while I was writing this they
>> > just restarted.
>> >
>> > Any clues on the magic invocation I'm overlooking?
>>
>> Not offhand. The reactivation stuff is fairly new. Any chance you can
>> develop a simple replicatable test case?
>>
>> --
>> Greg Sabino Mullane greg at endpoint.com
>> End Point Corporation
>> PGP Key: 0x14964AC8
>>
>
>
>
> --
> *Chris Keane* * Track Intelligence Inc *  +1 (650) 703 5523 (cell)
>



-- 
*Chris Keane* * Track Intelligence Inc *  +1 (650) 703 5523 (cell)