[Bucardo-general] Bad Plan for Bucardo Query

David E. Wheeler david at justatheory.com
Fri Sep 20 20:26:07 UTC 2019


Fellow Bucardans,

We've run into an issue where replication can seriously lag. One minute we're seeing < 1s of lag, the next it's up to 20s. This causes problems for tables with non-primary-key unique constraints: customers create usernames and submit them twice in less than 20s, and because each submission has an equal chance of hitting either of our two masters, the long lag can lead to unique constraint violations that kill syncs.

I've worked out a method to resolve unique conflicts in custom code, but I would also like to see if we can fix the lag. The problem is that the planner suddenly switches from one plan to another, and the new, bad plan kills performance. The code in question is in this function:

  https://github.com/bucardo/bucardo/blob/acc21e807ea135b08b4502e5973c31d403b97f8a/bucardo.schema#L1499

We managed to replicate the issue with this query:

    SELECT 1
      FROM bucardo.delta_config_users d
     WHERE NOT EXISTS (
            SELECT 1
              FROM bucardo.track_config_users t
             WHERE d.txntime = t.txntime
               AND (t.target = 'dbgroup configuration_dbs' OR t.target ~ '^T:'))
     LIMIT 1;

When it's working fine, the plan looks like this:

    Limit  (cost=11158.29..15550.30 rows=1 width=0) (actual time=129.648..129.648 rows=0 loops=1)
      ->  Merge Anti Join  (cost=11158.29..15550.30 rows=1 width=0) (actual time=129.647..129.647 rows=0 loops=1)
            Merge Cond: (d.txntime = t.txntime)
            ->  Index Only Scan using dex1_config_users on delta_config_users d  (cost=0.29..2697.87 rows=98307 width=8) (actual time=0.074..10.932 rows=98380 loops=1)
                  Heap Fetches: 4322
            ->  Sort  (cost=11157.96..11316.00 rows=63218 width=8) (actual time=102.265..104.908 rows=67198 loops=1)
                  Sort Key: t.txntime
                  Sort Method: quicksort  Memory: 6222kB
                  ->  Seq Scan on track_config_users t  (cost=0.00..6116.94 rows=63218 width=8) (actual time=0.053..89.108 rows=67198 loops=1)
                        Filter: ((target = 'dbgroup configuration_dbs'::text) OR (target ~ '^T:'::text))
                        Rows Removed by Filter: 188175

  https://explain.depesz.com/s/X5i4

But then it suddenly shifts to:

    Limit  (cost=0.00..2575.59 rows=1 width=0) (actual time=42928.330..42928.330 rows=0 loops=1)
      ->  Nested Loop Anti Join  (cost=0.00..10845806.73 rows=4211 width=0) (actual time=42928.327..42928.327 rows=0 loops=1)
            Join Filter: (d.txntime = t.txntime)
            Rows Removed by Join Filter: 429203342
            ->  Seq Scan on delta_config_users d  (cost=0.00..491.82 rows=31682 width=8) (actual time=0.003..4.954 rows=31750 loops=1)
            ->  Materialize  (cost=0.00..3196.22 rows=23733 width=8) (actual time=0.000..0.485 rows=13519 loops=31750)
                  ->  Seq Scan on track_config_users t  (cost=0.00..3077.55 rows=23733 width=8) (actual time=0.019..29.834 rows=25109 loops=1)
                        Filter: ((target = 'dbgroup configuration_dbs'::text) OR (target ~ '^T:'::text))
                        Rows Removed by Filter: 65218

  https://explain.depesz.com/s/X5i4

The difference appears to be a switch from a Merge Anti Join to a Nested Loop Anti Join. The nested loop looks cheap to the planner under the LIMIT, but since no rows ever match, it ends up discarding over 429 million row pairs via the join filter. We vacuumed both delta_config_users and track_config_users, which for some reason fixed the problem, but only for a while.
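
A quick way to check whether stale statistics line up with the plan flips is the standard pg_stat_user_tables view (nothing Bucardo-specific here):

    SELECT relname, n_live_tup, n_dead_tup,
           last_autovacuum, last_autoanalyze
      FROM pg_stat_user_tables
     WHERE schemaname = 'bucardo'
       AND relname IN ('delta_config_users', 'track_config_users');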

Anyone run into something like this before? Can we adjust the table statistics to fix the problem?
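
In case it helps frame the question, this is the sort of adjustment I have in mind; the statistics target and scale factors below are guesses on my part, not tested values:

    -- Collect finer-grained statistics on the join column.
    ALTER TABLE bucardo.delta_config_users
        ALTER COLUMN txntime SET STATISTICS 1000;
    ALTER TABLE bucardo.track_config_users
        ALTER COLUMN txntime SET STATISTICS 1000;

    -- Make autovacuum/autoanalyze much more aggressive on these two hot tables.
    ALTER TABLE bucardo.delta_config_users
        SET (autovacuum_vacuum_scale_factor  = 0.01,
             autovacuum_analyze_scale_factor = 0.01);
    ALTER TABLE bucardo.track_config_users
        SET (autovacuum_vacuum_scale_factor  = 0.01,
             autovacuum_analyze_scale_factor = 0.01);

    ANALYZE bucardo.delta_config_users;
    ANALYZE bucardo.track_config_users;

The hope is that fresher, finer-grained statistics on txntime would keep the row estimates from drifting far enough to flip the plan.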

Oh, and why are there so many records in the deltas table? I thought Bucardo cleaned records out once replication completed.
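
A rough way to count deltas that appear to have been replicated already (they have a matching track record) yet are still sitting in the delta table would be something like this; it's just my reading of the two tables, so correct me if the join is wrong:

    SELECT count(*)
      FROM bucardo.delta_config_users d
     WHERE EXISTS (
           SELECT 1
             FROM bucardo.track_config_users t
            WHERE t.txntime = d.txntime);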

Thanks,

David


