[Bucardo-general] Performance option...

Michelle Sullivan michelle at sorbs.net
Sun Aug 22 18:46:41 UTC 2010


Greg Sabino Mullane wrote:
> On Sat, Aug 07, 2010 at 02:34:04PM +0200, Michelle Sullivan wrote:
>   
>> Ok 2 days after resolving my swap issues it's still trying to process
>> the sync... still over 1 day of swapping (3G RAM on the machine, 32G
>> swap, and the process is just over 20G)..
>>
>> May I make a suggestion - have an option to 'Tie::Hash' to a file and
>> avoid the huge amount of swapping... maybe a manual config option that
>> will warn of the delays it can cause, or perhaps a quick count of the
>> delta table and if there is over 100k entries (configurable)
>> automatically use it...?
>>     
>
> Sounds okay to me, although I'd have to poke around in the code to 
> see how much we really can offload like that, I seem to recall there 
> might be some issues with the stuff passed to the customcodes.
>
>   


Poked! ;-)

I haven't patched because I haven't spent enough time on it yet..

However the current code for me at line 3992 reads:

                ## Note: column order important for splice and defined
calls later
                $SQL{delta} = qq{
                SELECT    DISTINCT $drow,
                          BUCARDO_PK $aliaslist
                FROM      bucardo.bucardo_delta d
                LEFT JOIN $S.$T t ON BUCARDO_JOIN
                WHERE     d.tablename = \$1::oid
                AND       NOT EXISTS (
                                SELECT 1
                                FROM   bucardo.bucardo_track bt
                                WHERE  d.txntime = bt.txntime
                                AND    bt.targetdb = '\$2'::text
                                AND    bt.tablename = \$1::oid
                          )
                };

And produces the following for me:

         :                 SELECT    DISTINCT d.rowid AS "BUCARDO_ID",
         :                           t.rawid
,msgid,msgdeliverer,msgsender,msgchksum,msgreceived,msgbody
         :                 FROM      bucardo.bucardo_delta d
         :                 LEFT JOIN public.rawevidence t ON
(t.rawid::int8 = d.rowid::int8)
         :                 WHERE     d.tablename = 9830363::oid
         :                 AND       NOT EXISTS (
         :                                 SELECT 1
         :                                 FROM   bucardo.bucardo_track bt
         :                                 WHERE  d.txntime = bt.txntime
         :                                 AND    bt.targetdb =
'usmaster'::text
         :                                 AND    bt.tablename =
9830363::oid
         :                           )

and results in the following plan:

                                                             QUERY
PLAN                                                             
-------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=47071254.31..47191235.13 rows=5332481 width=646)
   ->  Sort  (cost=47071254.31..47084585.51 rows=5332481 width=646)
         Sort Key: d.rowid, t.rawid, t.msgid, t.msgdeliverer,
t.msgsender, t.msgchksum, t.msgreceived, t.msgbody
         ->  Hash Left Join  (cost=31728837.52..43413429.19 rows=5332481
width=646)
               Hash Cond: ((d.rowid)::bigint = t.rawid)
               ->  Hash Anti Join  (cost=89012.19..1661423.01
rows=5332481 width=8)
                     Hash Cond: (d.txntime = bt.txntime)
                     ->  Bitmap Heap Scan on bucardo_delta d 
(cost=88998.43..1114157.90 rows=5333946 width=16)
                           Recheck Cond: (tablename = 9830363::oid)
                           ->  Bitmap Index Scan on
bucardo_delta_public_rawevidence_txn  (cost=0.00..87664.95 rows=5333946
width=0)
                     ->  Hash  (cost=13.65..13.65 rows=1 width=8)
                           ->  Bitmap Heap Scan on bucardo_track bt 
(cost=9.55..13.65 rows=1 width=8)
                                 Recheck Cond: ((tablename =
9830363::oid) AND (targetdb = 'usmaster'::text))
                                 ->  Bitmap Index Scan on
bucardo_track_target  (cost=0.00..9.55 rows=1 width=0)
                                       Index Cond: ((tablename =
9830363::oid) AND (targetdb = 'usmaster'::text))
               ->  Hash  (cost=14328846.10..14328846.10 rows=94309651
width=638)
                     ->  Seq Scan on rawevidence t 
(cost=0.00..14328846.10 rows=94309651 width=638)
(17 rows)


rawevidence is around 100G in size and it's on the end of a U320 scsi
raid... so sequence scans take a _LONG_ time not to mention the bucado
process grows to 22G in size ;-)

however... if we say select 100 transactions at a time it's a *LOT*
faster and uses SIGNIFICANLTY less memory...

To get the next 100 transactions I can use the following:

   SELECT    DISTINCT d.rowid AS "BUCARDO_ID",
             t.rawid
,msgid,msgdeliverer,msgsender,msgchksum,msgreceived,msgbody
   FROM     
                (
                        SELECT *
                        FROM
                                bucardo.bucardo_delta dq 
                        WHERE dq.txntime IN ( SELECT DISTINCT txntime
FROM bucardo.bucardo_delta ORDER BY txntime ASC LIMIT 100)
                ) AS d
   LEFT JOIN public.rawevidence t ON (t.rawid::int8 = d.rowid::int8)
   WHERE     d.tablename = 9830363::oid
   AND       NOT EXISTS ( 
                   SELECT 1               
                   FROM   bucardo.bucardo_track bt
                   WHERE  d.txntime = bt.txntime 
                   AND    bt.targetdb = 'usmaster'::text
                   AND    bt.tablename = 9830363::oid
             )                      
 which results in the following EXPLAIN ANALYSEd plan...

                                                                                            
QUERY
PLAN                                                                                             

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=49638.62..49655.09 rows=732 width=646) (actual
time=581.484..812.658 rows=100 loops=1)
   ->  Sort  (cost=49638.62..49640.45 rows=732 width=646) (actual
time=581.475..582.608 rows=100 loops=1)
         Sort Key: dq.rowid, t.rawid, t.msgid, t.msgdeliverer,
t.msgsender, t.msgchksum, t.msgreceived, t.msgbody
         Sort Method:  quicksort  Memory: 184kB
         ->  Nested Loop Left Join  (cost=13.68..49603.79 rows=732
width=646) (actual time=276.941..580.238 rows=100 loops=1)
               ->  Merge Anti Join  (cost=13.67..35848.37 rows=732
width=8) (actual time=228.245..313.818 rows=100 loops=1)
                     Merge Cond: (dq.txntime = bt.txntime)
                     ->  Nested Loop  (cost=0.00..35832.77 rows=732
width=16) (actual time=147.860..232.276 rows=100 loops=1)
                           ->  Limit  (cost=0.00..32449.49 rows=100
width=8) (actual time=121.297..201.556 rows=100 loops=1)
                                 ->  Unique  (cost=0.00..236359778.56
rows=728393 width=8) (actual time=121.283..200.725 rows=100 loops=1)
                                       ->  Index Scan using
bucardo_delta_txntime on bucardo_delta  (cost=0.00..236201313.14
rows=63386168 width=8) (actual time=121.270..193.033 rows=1764 loops=1)
                           ->  Index Scan using
bucardo_delta_public_rawevidence_txn on bucardo_delta dq 
(cost=0.00..33.02 rows=7 width=16) (actual time=0.283..0.288 rows=1
loops=100)
                                 Index Cond: (dq.txntime =
bucardo_delta.txntime)
                     ->  Sort  (cost=13.66..13.67 rows=1 width=8)
(actual time=80.364..80.364 rows=0 loops=1)
                           Sort Key: bt.txntime
                           Sort Method:  quicksort  Memory: 17kB
                           ->  Bitmap Heap Scan on bucardo_track bt 
(cost=9.55..13.65 rows=1 width=8) (actual time=80.310..80.310 rows=0
loops=1)
                                 Recheck Cond: ((tablename =
9830363::oid) AND (targetdb = 'usmaster'::text))
                                 ->  Bitmap Index Scan on
bucardo_track_target  (cost=0.00..9.55 rows=1 width=0) (actual
time=80.290..80.290 rows=0 loops=1)
                                       Index Cond: ((tablename =
9830363::oid) AND (targetdb = 'usmaster'::text))
               ->  Index Scan using rawrvidence_id_key on rawevidence t 
(cost=0.00..18.68 rows=1 width=638) (actual time=2.523..2.641 rows=1
loops=100)
                     Index Cond: (t.rawid = (dq.rowid)::bigint)
 Total runtime: 813.551 ms
(23 rows)

increasing the limit to 10000 we get:

                                                                                              
QUERY
PLAN                                                                                              

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=3629002.56..3629035.50 rows=1464 width=646) (actual
time=23684.650..26793.872 rows=9980 loops=1)
   ->  Sort  (cost=3629002.56..3629006.22 rows=1464 width=646) (actual
time=23684.638..23849.531 rows=9980 loops=1)
         Sort Key: dq.rowid, t.rawid, t.msgid, t.msgdeliverer,
t.msgsender, t.msgchksum, t.msgreceived, t.msgbody
         Sort Method:  external sort  Disk: 13128kB
         ->  Nested Loop Left Join  (cost=0.00..3628925.59 rows=1464
width=646) (actual time=231.287..22502.182 rows=9980 loops=1)
               ->  Nested Loop Anti Join  (cost=0.00..3601414.76
rows=1464 width=8) (actual time=201.236..5545.415 rows=9980 loops=1)
                     ->  Nested Loop  (cost=0.00..3581477.64 rows=1465
width=16) (actual time=137.619..5186.774 rows=9980 loops=1)
                           ->  Limit  (cost=0.00..3244948.85 rows=10000
width=8) (actual time=102.368..4538.531 rows=10000 loops=1)
                                 ->  Unique  (cost=0.00..236359802.56
rows=728393 width=8) (actual time=102.356..4447.767 rows=10000 loops=1)
                                       ->  Index Scan using
bucardo_delta_txntime on bucardo_delta  (cost=0.00..236201337.14
rows=63386168 width=8) (actual time=102.335..3556.971 rows=176013 loops=1)
                           ->  Index Scan using
bucardo_delta_public_rawevidence_txn on bucardo_delta dq 
(cost=0.00..32.84 rows=7 width=16) (actual time=0.031..0.042 rows=1
loops=10000)
                                 Index Cond: (dq.txntime =
bucardo_delta.txntime)
                     ->  Index Scan using bucardo_track_target on
bucardo_track bt  (cost=0.00..13.61 rows=1 width=8) (actual
time=0.020..0.020 rows=0 loops=9980)
                           Index Cond: ((bt.tablename = 9830363::oid)
AND (dq.txntime = bt.txntime) AND (bt.targetdb = 'usmaster'::text))
               ->  Index Scan using rawrvidence_id_key on rawevidence t 
(cost=0.00..18.68 rows=1 width=638) (actual time=1.666..1.675 rows=1
loops=9980)
                     Index Cond: (t.rawid = (dq.rowid)::bigint)
 Total runtime: 26864.025 ms
(17 rows)


Would a patch like this work?

Thanks,

Shells




More information about the Bucardo-general mailing list