[Bucardo-general] Another bug?
Michelle Sullivan
michelle at sorbs.net
Tue Oct 22 14:00:17 UTC 2013
Michelle Sullivan wrote:
> (31523) [Tue Oct 15 09:41:25 2013] KID Totals: deletes=69 inserts=339 conflicts=1
> (31523) [Tue Oct 15 09:41:26 2013] KID Expected one row from end_syncrun, but got 4
> (31523) [Tue Oct 15 09:41:26 2013] KID Unable to correctly update syncrun table! (count was 4)
> (31523) [Tue Oct 15 09:41:30 2013] KID Expected one row from end_syncrun, but got 4
> (31523) [Tue Oct 15 09:41:35 2013] KID Expected one row from end_syncrun, but got 4
> (31523) [Tue Oct 15 09:41:41 2013] KID Expected one row from end_syncrun, but got 4
> (31523) [Tue Oct 15 09:41:45 2013] KID Delta count for sorbs_corkscrew.public.audit : 1
> (31523) [Tue Oct 15 09:41:46 2013] KID Totals: deletes=3 inserts=3 conflicts=0
> (31523) [Tue Oct 15 09:41:47 2013] KID Expected one row from end_syncrun, but got 4
> (31523) [Tue Oct 15 09:41:47 2013] KID Unable to correctly update syncrun table! (count was 4)
> (31523) [Tue Oct 15 09:41:51 2013] KID Expected one row from end_syncrun, but got 4
>
>
This might just fix it. In addition, the lower part of the patch stops
the kid from exiting if pg_cancel fails (usually a serialization error),
which causes the majority of orphaned entries on my systems:
--- Bucardo.pm.orig 2013-10-14 10:44:09.000000000 +0000
+++ Bucardo.pm 2013-10-22 13:58:57.000000000 +0000
@@ -1900,6 +1900,18 @@
## At this point, the PID file does not exist or the kid is not responding
if ($resurrect) {
## XXX Try harder to kill it?
+
+ ## First clear out any old entries in the syncrun table
+ $sth = $sth{ctl_syncrun_end_now};
+ $count = $sth->execute("Old entry died (CTL $$)", $syncname);
+ if (1 == $count) {
+ $info = $sth->fetchall_arrayref()->[0][0];
+ $self->glog("Old syncrun entry removed during resurrection, start time was $info", LOG_NORMAL);
+ }
+ else {
+ $sth->finish();
+ }
+
$self->glog("Resurrecting kid $syncname, resurrect was $resurrect", LOG_DEBUG);
$self->{kidpid} = $self->create_newkid($sync);
@@ -4823,8 +4835,10 @@
## Roll everyone back
for my $dbname (@dbs_dbi) {
my $dbh = $sync->{db}{$dbname}{dbh};
- $dbh->pg_cancel if $dbh->{pg_async_status} > 0;
- $dbh->rollback;
+ ## Wrapped in an eval as a failure to serialise can cause an abort() and the KID will die.
+ eval { $dbh->pg_cancel if $dbh->{pg_async_status} > 0; };
+ ## Separate eval{} for the rollback as we are probably still connected to the transaction.
+ eval { $dbh->rollback; };
}
# End the syncrun.
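
For anyone wanting to try the idea in the second hunk outside Bucardo, here is a minimal sketch of the pattern. It assumes a hypothetical MockDbh class in place of a real DBD::Pg handle; MockDbh and rollback_all are illustrative names, not Bucardo code. Each call gets its own eval, so a failed pg_cancel neither prevents the rollback nor stops the loop:

```perl
use strict;
use warnings;

## Hypothetical stand-in for a database handle; not part of Bucardo or DBD::Pg.
package MockDbh;

sub new {
    my ($class, %args) = @_;
    my %self = (%args, rolled_back => 0);
    return bless \%self, $class;
}

## Simulate a cancel that dies, e.g. on a serialization failure.
sub pg_cancel {
    my $self = shift;
    die "could not serialize access\n" if $self->{cancel_fails};
    return 1;
}

sub rollback {
    my $self = shift;
    $self->{rolled_back} = 1;
    return 1;
}

package main;

## Cancel and roll back every handle. Each step has its own eval,
## so an exception in one step is recorded and the loop carries on.
sub rollback_all {
    my @handles = @_;
    my @errors;
    for my $dbh (@handles) {
        eval { $dbh->pg_cancel if $dbh->{pg_async_status} > 0; 1 }
            or push @errors, "cancel: $@";
        eval { $dbh->rollback; 1 }
            or push @errors, "rollback: $@";
    }
    return @errors;
}

my @dbhs = (
    MockDbh->new(pg_async_status => 1, cancel_fails => 1),
    MockDbh->new(pg_async_status => 0),
);
my @errors = rollback_all(@dbhs);
```

Unlike the patch, which discards whatever the eval catches, the sketch collects the error strings so they could be logged afterwards.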
--
Michelle Sullivan
http://www.mhix.org/