[Bucardo-general] master slave not syncing after db upgrade. (triggers disabled)
jtkells
jtkells at verizon.net
Mon Aug 13 13:53:29 UTC 2012
On Sat, 11 Aug 2012 18:18:25 -0500
Rosser Schwarz <rosser.schwarz at gmail.com> wrote:
> Can you connect to your Bucardo database and say "SELECT
> validate_all_syncs();" and report what happens?
>
> rls
>
> On Mon, Aug 6, 2012 at 3:57 PM, jtkells <jtkells at verizon.net> wrote:
> > On Mon, 6 Aug 2012 09:27:32 -0400
> > jtkells <jtkells at verizon.net> wrote:
> >
> >> Hi,
> >>
> >> I'm having a bit of trouble here getting a master to slave
> >> replication environment working after a database schema upgrade.
> > >> I am using Bucardo 4.4.8 on a PostgreSQL 8.4.8 database.
> >>
> >> I have been running this master slave configuration for a long
> > >> time. We recently updated our schema (adding a lot of new columns
> >> etc. to a lot of tables). To accommodate these changes I performed
> >> the following steps:
> >> 1) I stop bucardo
> >> 2) I remove all tables from bucardo
> >> 3) I remove the herd that these tables belonged to
> > >> On the database side (master):
> > >> I drop the schema and recreate the schema and all its tables (new
> > >> columns).
> > >> I load the tables through program code, which generates millions of
> > >> records in these 100 tables.
> > >> I do a pg_dump of this schema and copy it over to the slave
> > >> database.
> > >> On the slave database:
> > >> I drop all the replicated tables and run pg_restore.
> > >>
> > >> On both systems I analyze these tables.
> > >> On the master database:
> > >> 4) I add the tables back into bucardo
> > >> 5) I create the herd for them
> > >> 6) and I start bucardo
> >> Bucardo goes through checks and generates the following record for
> >> each of the tables
> >> [Mon Aug 6 09:16:07 2012] CTL Herd member 19494511:
> >> ac_5300_18b_esri.fence
> >> [Mon Aug 6 09:16:07 2012] CTL Target oids: agis_slave:4480021
> >>
> > >> I update some columns in a table to test replication and nothing
> > >> happens. I have tried several commands to get bucardo to
> > >> start processing the new changes (reload, kick, etc.) but still
> > >> nothing. I suspect the "Latest bad reason: Controller cleaning
> > >> out unstarted q entry" is causing the problem, but I'm not sure
> > >> how to fix it. Should I have deleted the syncs?
> >>
> >>
> >> Name Type State PID Last_good Time I/U/D Last_bad Time
> >> ========+=====+=====+=====+=========+=====+=====+========+====
> >> agis_18b| P |idle |12596|4m39s |9s |0/0/0|25m49s |0s
> >>
> >>
> >> Sync: agis_18b (pushdelta) esri18b => agis_slave (Active)
> >>
> >>
> >> postgres at arp-db:~$ bucardo_ctl status agis_18b
> >> Days back: 3 User: bucardo Database: bucardo
> >> ======================================================================
> >> Sync name: agis_18b
> >> Current state: idle (PID = 12596)
> >> Type: pushdelta
> >> Source herd/database: esri18b / agis_master
> >> Target database: agis_slave
> >> Tables in sync: 100
> >> Last good: 5m 25s (time to run: 9s)
> >> Last good time: Aug 06, 2012 09:16:17 Target: agis_slave
> >> Ins/Upd/Del: 0 / 0 / 0
> >> Last bad: 26m 35s (time to run: 0s)
> >> Last bad time: Aug 06, 2012 08:55:07 Target: agis_slave
> >> Latest bad reason: Controller cleaning out unstarted q entry
> >> PID file: /tmp/bucardo.ctl.sync.agis_18b.pid
> >> PID file created: Mon Aug 6 09:16:07 2012
> >> Status: active
> >> Limitdbs: 0
> >> Priority: 0
> >> Checktime: none
> >> Overdue time: 00:00:00
> >> Expired time: 00:00:00
> >> Stayalive: yes Kidsalive: yes
> >> Rebuild index: 0 Do_listen: no
> >> Ping: yes Makedelta: no
> >> Onetimecopy: 0
> >>
> >>
> >>
> >> Thanking you in advance
> >
> >
> > > On further investigation, I updated some records and saw that no
> > > entries were created in the q table. There are triggers on the
> > > tables, but looking at them in the pg_trigger table I find that
> > > they are disabled (tgenabled = FALSE in pg_trigger).
> > What process did I miss in bucardo that caused this (I dropped the
> > tables and herd)? If I didn't miss anything is it safe to enable
> > these triggers at the PostgreSQL level and is there anything else I
> > need to do? Also, was there anything else that I should have done
> > when I was removing the tables and herds in the first place?
> >
> > Thanking you in advance
>
>
>
Rosser,
bucardo=# select bucardo.validate_all_syncs();
validate_all_syncs
--------------------
1
I tried the bucardo_ctl validate sync command back when I was having the
problem, and it reported no issues.
I had stated that the triggers weren't enabled, but I was wrong. The
tgenabled column in pg_trigger showed O, which I assumed meant false,
but I later realized that O stands for origin and the triggers were
enabled. So for now I'm not sure why replication stalled or why it
started again at a later point in time.
I will be repeating this step within the next few days and will be able
to watch the process and outcome more closely.
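
For anyone else tripped up by tgenabled: it is a one-character code, not a boolean. Below is a minimal Python sketch that decodes the codes as the PostgreSQL catalog documentation describes them. The helper names (describe_tgenabled, fires_in_origin_session) are made up for illustration, and the catalog query is only included as a string showing where the values would come from; nothing here connects to a database or is part of Bucardo.

```python
# Meanings of the pg_trigger.tgenabled codes, as described in the
# PostgreSQL system catalog documentation.
TGENABLED_MEANING = {
    "O": "fires in origin and local modes (the normal enabled state)",
    "D": "disabled",
    "R": "fires in replica mode only",
    "A": "fires always",
}

# Catalog query that could supply the values; shown for reference only,
# it is not executed by this sketch.
TRIGGER_QUERY = """
SELECT tgrelid::regclass AS table_name, tgname, tgenabled
FROM pg_trigger
ORDER BY 1, 2;
"""


def describe_tgenabled(code):
    """Return a human-readable meaning for a tgenabled code.

    The catalog stores upper-case letters; input is normalised so a
    hand-typed 'o' is accepted as well (a convenience of this sketch).
    """
    key = code.strip().upper()
    if key not in TGENABLED_MEANING:
        raise ValueError("unknown tgenabled code: %r" % (code,))
    return TGENABLED_MEANING[key]


def fires_in_origin_session(code):
    """True if the trigger fires in an ordinary session, that is, one
    where session_replication_role is left at its default of 'origin'."""
    describe_tgenabled(code)  # validates the code, raises on bad input
    return code.strip().upper() in ("O", "A")
```

With this, describe_tgenabled("O") reports the normal enabled state, which matches what I eventually worked out by hand. Only D means the trigger is actually turned off.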