[Bucardo-general] master slave not syncing after db upgrade. (triggers disabled)
Michelle Sullivan
michelle at sorbs.net
Mon Aug 13 18:28:22 UTC 2012
jtkells wrote:
> On Sat, 11 Aug 2012 18:18:25 -0500
> Rosser Schwarz <rosser.schwarz at gmail.com> wrote:
>
>
>> Can you connect to your Bucardo database and say "SELECT
>> validate_all_syncs();" and report what happens?
>>
>> rls
>>
>> On Mon, Aug 6, 2012 at 3:57 PM, jtkells <jtkells at verizon.net> wrote:
>>
>>> On Mon, 6 Aug 2012 09:27:32 -0400
>>> jtkells <jtkells at verizon.net> wrote:
>>>
>>>
>>>> Hi,
>>>>
>>>> I'm having a bit of trouble here getting a master to slave
>>>> replication environment working after a database schema upgrade.
>>>> I am using Bucardo 4.4.8 on a PostgreSQL 8.4.8 database.
>>>>
>>>> I have been running this master slave configuration for a long
>>>> time. We recently updated our schema (adding a lot of new columns
>>>> etc. to a lot of tables). To accommodate these changes I performed
>>>> the following steps:
>>>> 1) I stop bucardo
>>>> 2) I remove all tables from bucardo
>>>> 3) I remove the herd that these tables belonged to
>>>> On the database side (Master)
>>>> I drop the schema and recreate the schema and all its tables (new
>>>> columns)
>>>> I load the tables through program code which generates millions of
>>>> records to these tables (100).
>>>> I do a pg_dump of this schema and copy it over to the slave
>>>> database.
>>>> On the slave database:
>>>> I drop all the replicated tables and run pg_restore.
>>>>
>>>> On both systems I analyze these tables.
>>>> On the master database:
>>>> 4) I add the tables back into bucardo
>>>> 5) I create the herd for them
>>>> 6) and I start bucardo
>>>> Bucardo goes through checks and generates the following record for
>>>> each of the tables
>>>> [Mon Aug 6 09:16:07 2012] CTL Herd member 19494511:
>>>> ac_5300_18b_esri.fence
>>>> [Mon Aug 6 09:16:07 2012] CTL Target oids: agis_slave:4480021
>>>>
>>>> I update some columns in a table to test replication and nothing
>>>> happens. I have tried to do several commands to get bucardo to
>>>> start processing the new changes (reload, kick etc.) but still
>>>> nothing. I suspect the "Latest bad reason: Controller cleaning
>>>> out unstarted q entry " is causing the problem but not sure how
>>>> to fix this. Should I have deleted the syncs?
>>>>
>>>>
>>>> Name     Type  State PID   Last_good Time  I/U/D Last_bad Time
>>>> ========+=====+=====+=====+=========+=====+=====+========+====
>>>> agis_18b| P   |idle |12596|4m39s    |9s   |0/0/0|25m49s  |0s
>>>>
>>>>
>>>> Sync: agis_18b (pushdelta) esri18b => agis_slave (Active)
>>>>
>>>>
>>>> postgres at arp-db:~$ bucardo_ctl status agis_18b
>>>> Days back: 3 User: bucardo Database: bucardo
>>>> ======================================================================
>>>> Sync name: agis_18b
>>>> Current state: idle (PID = 12596)
>>>> Type: pushdelta
>>>> Source herd/database: esri18b / agis_master
>>>> Target database: agis_slave
>>>> Tables in sync: 100
>>>> Last good: 5m 25s (time to run: 9s)
>>>> Last good time: Aug 06, 2012 09:16:17 Target: agis_slave
>>>> Ins/Upd/Del: 0 / 0 / 0
>>>> Last bad: 26m 35s (time to run: 0s)
>>>> Last bad time: Aug 06, 2012 08:55:07 Target: agis_slave
>>>> Latest bad reason: Controller cleaning out unstarted q entry
>>>> PID file: /tmp/bucardo.ctl.sync.agis_18b.pid
>>>> PID file created: Mon Aug 6 09:16:07 2012
>>>> Status: active
>>>> Limitdbs: 0
>>>> Priority: 0
>>>> Checktime: none
>>>> Overdue time: 00:00:00
>>>> Expired time: 00:00:00
>>>> Stayalive: yes Kidsalive: yes
>>>> Rebuild index: 0 Do_listen: no
>>>> Ping: yes Makedelta: no
>>>> Onetimecopy: 0
>>>>
>>>>
>>>>
>>>> Thanking you in advance
>>>>
>>> On further investigation, I updated some records and saw that no
>>> entries in the q table were created. There are triggers on the
>>> tables but looking at the triggers in pg_trigger table I find that
>>> the triggers are disabled (tgenabled = FALSE in pg_trigger table).
>>> What process did I miss in bucardo that caused this (I dropped the
>>> tables and herd)? If I didn't miss anything is it safe to enable
>>> these triggers at the PostgreSQL level and is there anything else I
>>> need to do? Also, was there anything else that I should have done
>>> when I was removing the tables and herds in the first place?
>>>
>>> Thanking you in advance
>>> _______________________________________________
>>> Bucardo-general mailing list
>>> Bucardo-general at bucardo.org
>>> https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>>>
>>
>>
>
> Rosser,
>
> bucardo=# select bucardo.validate_all_syncs();
> validate_all_syncs
> --------------------
> 1
>
> I tried the bucardo_ctl validate sync command back when I was having the
> problem and it reported no issues.
> I had stated that the triggers weren't enabled but I was wrong in
> stating that. The column tgenabled in pg_trigger showed o which I
> assumed to be false but later realized that o was origin and they were
> enabled. So for now I'm not sure why it stalled and how it started
> at a later point in time.
>
> I will be repeating this step within the next few days and will have
> more control on watching the process and outcome.
>
FYI, this is the same problem I have (I did report it earlier, but
hadn't looked closely enough at the PG tables to know it was disabled in
the DB).
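
For anyone else checking this, here is a minimal sketch (plain Python, not
part of Bucardo) of how the tgenabled codes in pg_trigger can be read. The
letter meanings follow the PostgreSQL catalog documentation; the names
describe_tgenabled and fires_on_origin are made up for illustration, and
the query string is only an example using the table named in the log above.

```python
# Example catalog query (an illustration only, it is not executed here).
# It lists each trigger on one table together with its tgenabled code.
EXAMPLE_QUERY = (
    "SELECT tgname, tgenabled FROM pg_trigger "
    "WHERE tgrelid = 'ac_5300_18b_esri.fence'::regclass"
)

# Meanings of the single-character pg_trigger.tgenabled column.
TGENABLED_MEANINGS = {
    "O": "fires in origin and local modes",
    "D": "disabled",
    "R": "fires in replica mode",
    "A": "fires always",
}


def describe_tgenabled(code):
    """Return a human-readable meaning for a tgenabled code (hypothetical helper)."""
    key = code.strip().upper()
    if key not in TGENABLED_MEANINGS:
        raise ValueError("unknown tgenabled code: %r" % code)
    return TGENABLED_MEANINGS[key]


def fires_on_origin(code):
    """True if the trigger fires for an ordinary session (origin mode)."""
    describe_tgenabled(code)  # validates the code, raises on unknown input
    return code.strip().upper() in ("O", "A")


if __name__ == "__main__":
    for c in ("O", "D", "R", "A"):
        print(c, describe_tgenabled(c), fires_on_origin(c))
```

So a value of O means the trigger is enabled for normal sessions, and only
D means it is actually disabled.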
Regards,
Michelle
--
Michelle Sullivan
http://www.mhix.org/