[Bucardo-general] master slave not syncing after db upgrade. (triggers disabled)

Mon Aug 13 18:28:22 UTC 2012

jtkells wrote:
> On Sat, 11 Aug 2012 18:18:25 -0500
> Rosser Schwarz <rosser.schwarz at gmail.com> wrote:
>
>   
>> Can you connect to your Bucardo database and say "SELECT
>> validate_all_syncs();" and report what happens?
>>
>> rls
>>
>> On Mon, Aug 6, 2012 at 3:57 PM, jtkells <jtkells at verizon.net> wrote:
>>     
>>> On Mon, 6 Aug 2012 09:27:32 -0400
>>> jtkells <jtkells at verizon.net> wrote:
>>>
>>>       
>>>> Hi,
>>>>
>>>> I'm having a bit of trouble here getting a master to slave
>>>> replication environment working after a database schema upgrade.
>>>> I am using Bucardo 4.4.8 on a PostgreSQL 8.4.8 database.
>>>>
>>>> I have been running this master slave configuration for a long
>>>> time. We recently updated our schema (adding a lot of new columns
>>>> etc. to a lot of tables). To accommodate these changes I performed
>>>> the following steps:
>>>> 1) I stop bucardo
>>>> 2) I remove all tables from bucardo
>>>> 3) I remove the herd that these tables belonged to
>>>> On the database side (Master)
>>>> I drop the schema and recreate the schema and all its tables (new
>>>> columns)
>>>> I load the tables through program code, which generates millions of
>>>> records across these tables (100 of them).
>>>> I do a pg_dump of this schema and copy it over to the slave
>>>> database.
>>>> On the slave database:
>>>> I drop all the replicated tables and run pg_restore.
>>>>
>>>> On both systems I analyze these tables.
>>>> On the master database I
>>>> 4) I add the tables back into bucardo
>>>> 5) I create the herd for them
>>>> 6) and I start bucardo
>>>> Bucardo goes through checks and generates the following record for
>>>> each of the tables
>>>> [Mon Aug  6 09:16:07 2012]  CTL   Herd member 19494511:
>>>> ac_5300_18b_esri.fence
>>>> [Mon Aug  6 09:16:07 2012]  CTL     Target oids: agis_slave:4480021
>>>>
>>>> I update some columns in a table to test replication and nothing
>>>> happens. I have tried several commands to get bucardo to start
>>>> processing the new changes (reload, kick, etc.) but still nothing.
>>>> I suspect the "Latest bad reason: Controller cleaning out
>>>> unstarted q entry" is causing the problem, but I am not sure how
>>>> to fix this. Should I have deleted the syncs?
>>>>
>>>>
>>>> Name     Type  State PID   Last_good Time  I/U/D Last_bad Time
>>>> ========+=====+=====+=====+=========+=====+=====+========+====
>>>> agis_18b| P   |idle |12596|4m39s    |9s   |0/0/0|25m49s  |0s
>>>>
>>>>
>>>> Sync: agis_18b  (pushdelta)  esri18b =>  agis_slave  (Active)
>>>>
>>>>
>>>> postgres at arp-db:~$ bucardo_ctl status agis_18b
>>>> Days back: 3  User: bucardo  Database: bucardo
>>>> ======================================================================
>>>> Sync name:            agis_18b
>>>> Current state:        idle (PID = 12596)
>>>> Type:                 pushdelta
>>>> Source herd/database: esri18b / agis_master
>>>> Target database:      agis_slave
>>>> Tables in sync:       100
>>>> Last good:            5m 25s (time to run: 9s)
>>>> Last good time:       Aug 06, 2012 09:16:17  Target: agis_slave
>>>> Ins/Upd/Del:          0 / 0 / 0
>>>> Last bad:             26m 35s (time to run: 0s)
>>>> Last bad time:        Aug 06, 2012 08:55:07  Target: agis_slave
>>>> Latest bad reason: Controller cleaning out unstarted q entry
>>>> PID file:             /tmp/bucardo.ctl.sync.agis_18b.pid
>>>> PID file created:     Mon Aug  6 09:16:07 2012
>>>> Status:               active
>>>> Limitdbs:             0
>>>> Priority:             0
>>>> Checktime:            none
>>>> Overdue time:         00:00:00
>>>> Expired time:         00:00:00
>>>> Stayalive:            yes      Kidsalive: yes
>>>> Rebuild index:        0        Do_listen: no
>>>> Ping:                 yes      Makedelta: no
>>>> Onetimecopy:          0
>>>>
>>>>
>>>>
>>>> Thanking you in advance
>>>>         
>>> On further investigation, I updated some records and saw that no
>>> entries were created in the q table.  There are triggers on the
>>> tables, but looking at them in the pg_trigger table I find that
>>> they are disabled (tgenabled = FALSE in pg_trigger).
>>> What process did I miss in bucardo that caused this (I dropped the
>>> tables and herd)?  If I didn't miss anything is it safe to enable
>>> these triggers at the PostgreSQL level and is there anything else I
>>> need to do?  Also, was there anything else that I should have done
>>> when I was removing the tables and herds in the first place?
>>>
>>> Thanking you in advance
>>> _______________________________________________
>>> Bucardo-general mailing list
>>> Bucardo-general at bucardo.org
>>> https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>>>       
>>
>>     
>
> Rosser,
>
> bucardo=# select bucardo.validate_all_syncs();
>  validate_all_syncs 
> --------------------
>                   1
>
> I tried the bucardo_ctl validate sync command back when I was having the
> problem and it reported no issues.
> I had stated that the triggers weren't enabled, but I was wrong.
> The column tgenabled in pg_trigger showed o, which I assumed to be
> false, but I later realized that o means origin and the triggers were
> enabled. So for now I'm not sure why it stalled or how it started
> at a later point in time.
>
> I will be repeating this step within the next few days and will have
> more control on watching the process and outcome.  
>   
FYI this is the same problem I have (I did report it earlier - but
hadn't looked really closely at the PG tables to know it was disabled in
the DB)
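
For anyone else checking this: tgenabled in pg_trigger is a
single-character code, not a boolean (PostgreSQL 8.3 and later), which is
what caused the confusion quoted above. Below is a minimal Python sketch
for decoding the value. The helper names (describe_tgenabled,
is_trigger_disabled) are made up for illustration, and the query string is
only an example that uses the table name from the log excerpt; it is not
something Bucardo itself provides.

```python
# Meanings of the pg_trigger.tgenabled codes (PostgreSQL 8.3 and later).
TGENABLED_MEANINGS = {
    "O": "enabled (fires in origin and local modes)",
    "D": "disabled",
    "R": "enabled in replica mode only",
    "A": "enabled always",
}

# Example query to run in psql; the table name is taken from the log
# excerpt earlier in the thread and should be replaced with your own.
TRIGGER_QUERY = """
SELECT tgname, tgenabled
  FROM pg_trigger
 WHERE tgrelid = 'ac_5300_18b_esri.fence'::regclass;
"""


def describe_tgenabled(code):
    """Return a human-readable meaning for a tgenabled value."""
    # Normalize case and whitespace so a hand-typed 'o' is still recognized.
    normalized = code.strip().upper()
    return TGENABLED_MEANINGS.get(normalized, "unknown code: %r" % normalized)


def is_trigger_disabled(code):
    """Only 'D' means the trigger is actually disabled."""
    return code.strip().upper() == "D"


if __name__ == "__main__":
    for value in ("O", "D", "R", "A"):
        print(value, "=", describe_tgenabled(value))
```

If the query returns D for the Bucardo triggers, they really are disabled;
O means they are enabled and the cause of the stall lies elsewhere.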

Regards,

Michelle

-- 
Michelle Sullivan
http://www.mhix.org/