[Bucardo-general] Eek - can't figure out how to fix this one....

Karl Denninger karl at denninger.net
Sun Aug 1 20:26:11 UTC 2010


Michelle Sullivan wrote:
> Karl Denninger wrote:
>   
>> Michelle Sullivan wrote:
>>     
>>> Karl Denninger wrote:
>>>   
>>>       
>>>> Ah, it appears that when the installation script runs, it creates the
>>>> user and sets a non-default search path of its own.
>>>>
>>>> Hmmm... that sounds like a bug, as it's definitely undocumented - and
>>>> furthermore, if the remote and local hosts are different (and thus you
>>>> created the "bucardo" user on the remote using the usual "createuser"
>>>> command), you suddenly have a problem just like this.
>>>>
>>>> Will investigate; it does appear that there may be fruitful results
>>>> found here....
>>>>     
>>>>         
>>> As for me, I used pgsql on the masters and slave, so I would not run
>>> into this problem - which would explain why I haven't seen it ;-)
>>>
>>> Michelle
>>>   
>>>       
>> That appears to have been the problem - sticking a specific "set
>> search_path" in the role for the bucardo account on the slave fixed it.
>>
>> Incidentally, my interest in this is due to SLONY blowing chunks on me
>> after a few years of successful use - it apparently LOST a handful of
>> syncs (!!), resulting in an out-of-sync database table - a large
>> delete then failed quite a long time later and hosed me, as there was
>> no way to get that table back in sync - Slony cannot be told to
>> re-copy only ONE table out of a set, and the (valid) DELETEs on the
>> master could never complete on the slave.  Permanent toilet-stoppage
>> there.
>>     
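For reference, the role-level fix described above can be expressed as a single ALTER ROLE on the slave. The schema names here are an assumption (the Bucardo installer typically creates a "bucardo" schema); adjust to match what the installer actually set up:

```sql
-- Pin a search_path on the bucardo role so it matches what the
-- installation script configures (schema names are assumptions):
ALTER ROLE bucardo SET search_path = bucardo, public;
```

This setting takes effect on the role's next connection, so existing sessions need to reconnect to pick it up.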
>
> I couldn't get slony working on anything after my DB exceeded 20G..!
>   
I haven't run into THAT yet.  And I have multiple databases under
replication with data sets in total well over a terabyte, but thus far
none of them individually are over 200GB or so.

A 200GB full resync, however, is very un-funny, especially when the slave
that's "out" is your HA spare and sits across a WAN link.  Even with a
100Mbps pipe that you can stuff pretty hard, it takes a damn long time -
many hours.
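A back-of-the-envelope check on "many hours" (idealized: no protocol overhead, link fully saturated):

```sql
-- 200 GB = 200 * 8 * 1000 Mb; at 100 Mbps that is 16,000 seconds,
-- i.e. roughly 4.4 hours as an absolute best case:
SELECT 200 * 8 * 1000 / 100 / 3600.0 AS hours;
```

Real-world throughput over a busy WAN link is lower, so the actual wall-clock time is correspondingly longer.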
>>  Slony had some sort
>> of problem with memory management with 2.0.3 - I was on 2.0.2 with
>> Postgres 8.4.4 for a good long while, but 2.0.4 refuses to resync the
>> suspect table, as did 2.0.2, so I'm good and solidly hosed right now.
>>     
> New install of 2.0.4 failed to even start for me.
>   
2.0.4 works fine for me other than this memory problem.  But that's
fatal in my application since I never know when it's going to bite me.

>> It also appears that Bucardo does not lock the slave tables with a
>> trigger to prevent modifications on the slave nodes.
>>     
>
> That is documented and quite deliberate (I believe)... Bucardo can do
> multi (2) master mode - so you wouldn't lock both masters.  However,
> master to slave would indeed give rise to problems if one wrote to the
> slave in a way that broke a referential integrity check for a subsequent
> write by the master.  That said, as has been explained to me, Bucardo
> uses delete/insert for operation, so if there is data written to a slave
> it is overwritten by master updates should there be a clash.
>   
Yeah, I just have to be really careful application-wise.  I have
High-Availability/Clustering in my application code - Slony was my "belt
and suspenders" against a coding error.  Bucardo won't be.
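The delete/insert style of conflict handling Michelle describes above can be sketched like this (table and column names are invented for illustration; this is not Bucardo's actual code path):

```sql
-- Applying a changed row on the target the delete/insert way:
-- any conflicting local edit on the slave is simply replaced
-- by the master's copy of the row.
BEGIN;
DELETE FROM customers WHERE id = 42;
INSERT INTO customers (id, name) VALUES (42, 'master copy');
COMMIT;
```

The upshot is that stray writes on a slave don't persist past the next sync of that row, but they are silently lost rather than flagged.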
> I have run into a number of issues - however Bucardo's self recovery of
> out of sync tables is impressive.  Seems that if you can run multiple
> syncs then you can have a very nice system, however if like me you have
> 58 tables of which 54 are linked to each other with foreign key
> constraints (particularly to an audit table) then you have to sync the
> DB with one sync and the time that takes is quite large (was 20 minutes
> per sync - but I am inserting a number of rows ranging from 1m to 10m
> per day against multiple tables in that sync.)
>
> Michelle
>   
Essentially ALL of the tables in the databases under replication here
are subject to foreign key constraints.  That limits screwup damage, but
if the replication system hoses you, you're REALLY hosed, as you get
deadlocked immediately.  In theory, provided the replication system keeps
the ordering of transactions, all should be ok - but if that breaks for
any reason.....
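A minimal illustration of why transaction ordering matters under foreign keys (table names are hypothetical):

```sql
-- Parent/child pair with a foreign key between them:
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (id int PRIMARY KEY,
                     parent_id int REFERENCES parent (id));

-- Replayed in the original order, both rows apply cleanly:
INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (10, 1);

-- Replayed out of order on a replica, the child row referencing a
-- not-yet-replicated parent is rejected outright:
-- INSERT INTO child VALUES (20, 2);  -- ERROR: foreign key violation
```

Once a replica rejects one such row, everything queued behind it that depends on the same tables is stuck too, which is the immediate-deadlock scenario described above.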

-- Karl
