[Bucardo-general] Word of warning - when queries get too backed up...

Michelle Sullivan michelle at sorbs.net
Tue Jul 27 11:42:30 UTC 2010


Noticed a few 'out of swap' messages in the syslog today...

This was showing in log.bucardo:

[Tue Jul 27 06:22:51 2010] CTL Rows updated child 84516 to aborted in q: 1
[Tue Jul 27 06:22:51 2010] CTL Warning! Kid 84516 seems to have died. Sync "sorbs2_swap"
[Tue Jul 27 06:22:52 2010] CTL Could not add to q sync=sorbs2_swap,source=aumaster,target=usmaster,count=1. Sending manual notification
[Tue Jul 27 06:22:52 2010] CTL Creating a kid
[Tue Jul 27 06:22:52 2010] CTL Created new kid 95328 for sync "sorbs2_swap" to database "usmaster"
[Tue Jul 27 06:22:54 2010] KID New kid, syncs "aumaster" to "usmaster" for sync "sorbs2_swap" alive=1 Parent=33125 Type=swap PID=95328
[Tue Jul 27 06:23:02 2010] CTL Timed out - force a sync for "sorbs2_swap"
[Tue Jul 27 06:23:02 2010] CTL Could not add to q sync=sorbs2_swap,source=aumaster,target=usmaster,count=1. Sending manual notification
[Tue Jul 27 06:23:09 2010] KID Bucardo database backend PID is 95330
[Tue Jul 27 06:23:15 2010] KID Source database backend PID is 48273
[Tue Jul 27 06:23:15 2010] KID Target database backend PID is 57539
[Tue Jul 27 06:23:17 2010] CTL Timed out - force a sync for "sorbs2_swap"
[Tue Jul 27 06:23:17 2010] CTL Could not add to q sync=sorbs2_swap,source=aumaster,target=usmaster,count=1. Sending manual notification
[Tue Jul 27 06:23:27 2010] KID Got a notice for sorbs2_swap: aumaster -> usmaster
[Tue Jul 27 06:23:32 2010] CTL Timed out - force a sync for "sorbs2_swap"
[Tue Jul 27 06:23:47 2010] CTL Timed out - force a sync for "sorbs2_swap"

The query running was:

SELECT    DISTINCT d.rowid AS "BUCARDO_ID",
          t.rawid,msgid,msgdeliverer,msgsender,msgchksum,msgreceived,msgbody
FROM      bucardo.bucardo_delta d
LEFT JOIN public.rawevidence t ON (t.rawid::int8 = d.rowid::int8)
WHERE     d.tablename = 9830363::oid
AND       NOT EXISTS (
              SELECT 1
              FROM   bucardo.bucardo_track bt
              WHERE  d.txntime = bt.txntime
              AND    bt.targetdb = 'usmaster'::text
              AND    bt.tablename = 9830363::oid
          )


... which, with 4G RAM and 2G swap, ran the perl process out of
memory... :-/ (msgbody is a bytea of unlimited size)

I've added an extra 32G of swap to see if it'll go through and catch up,
but I suspect it might be hitting the max data seg size - though it is
FreeBSD 7.x amd64 so it shouldn't... well, not until it hits
33,554,432k (i.e. 32G).

Might want to see if the data can be chunked - or written to disk if
chunking is really impossible because of constraints. Maybe
automatically write any table data to a file if it has columns of a
type that can be large (e.g. text and bytea)? Or maybe run a query for
the primary keys first, then make a guesstimate of whether to write to a
temp file on disk or to memory? The latter will perform a lot better in
the case of a large data chunk, as perl has this annoying habit of
realloc'ing small chunks at a time when it hits a large hash size.
(Speaking from experience on that one: pre-allocating chunks of memory
will help, but perl is not efficient at handling large data sets in a
single var in memory - large being several 100k.)
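
To make the "memory or temp file" idea a bit more concrete, here is a
rough sketch. It is in Python rather than perl, it is not Bucardo code,
and the names (buffer_rows, read_rows, SPILL_THRESHOLD) are made up for
illustration. It assumes each row arrives as a byte string; rows are
kept in memory until the buffer grows past a threshold, after which the
standard library's SpooledTemporaryFile moves everything to a temp file
on disk.

```python
import tempfile

# Hypothetical threshold: keep up to 1 MiB in memory before spilling to disk.
SPILL_THRESHOLD = 1024 * 1024


def buffer_rows(rows, max_in_memory=SPILL_THRESHOLD):
    """Store rows (byte strings) in a buffer that spills to disk when large.

    Each row is written with an 8-byte length prefix so that rows of
    arbitrary size (e.g. a big bytea value) can be read back one at a time.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for row in rows:
        buf.write(len(row).to_bytes(8, "big"))
        buf.write(row)
    buf.seek(0)  # rewind so the caller can start reading
    return buf


def read_rows(buf):
    """Yield the rows back one at a time, never holding more than one."""
    while True:
        header = buf.read(8)
        if not header:
            break  # end of buffer
        size = int.from_bytes(header, "big")
        yield buf.read(size)


if __name__ == "__main__":
    rows = [b"small row", b"x" * (2 * 1024 * 1024)]  # second row is 2 MiB
    with buffer_rows(rows) as buf:
        for row in read_rows(buf):
            print(len(row))
```

With something like this no up-front size guess is needed, since the
switch to disk happens as the data arrives. A primary-key query could
still be used to decide how many rows go into each chunk.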

Shells


