[Bucardo-general] Swap replication latency

Tue May 10 11:41:19 UTC 2011

Hi Greg,

On 05/09/2011 11:28 PM, Greg Sabino Mullane wrote:

> The swap syncs in Bucardo 4 are not very optimized, but they should
> be able to keep up with that volume. You can try adjusting some of
> the 'sleep' values in the config downwards to minimize how long
> things take.

Tried that, to no apparent effect.

> You could also consider turning off 'ping' for the sync, and thus
> removing the NOTIFY triggers, and have the sync use checktime
> instead.

That I have not tried, but is there any reason to think it would
result in higher throughput, assuming that the change operations are
fairly uniformly distributed over time (they more or less are)? I
would think this would be advantageous only if there are occasional
batched operations.

> However, it's more likely there is some other problem - having a look
> at the log.bucardo file would be the best way for us to diagnose from
> afar, but some general tips that may apply to your situation or not:

I will attempt to provide that soon-ish.

> * Keep bucardo_delta and bucardo_track small by aggressive purging
> via cron and the bucardo_purge_delta function. Same for the q table.
> If they ever get big, make sure you VACUUM them as well - preferably
> VACUUM FULL then REINDEX if needed. The bucardo_delta table should
> only have thousands of rows, even on a busy server.

I purge both with a '1 minute'::interval. My 'q' table has never
gotten larger than a handful of rows (about 6), but as soon as
throughput starts slipping, my 'bucardo_delta' table gets quite large:
tens of thousands to hundreds of thousands of rows, and in one case
where it went unnoticed for a long time, 13 million! Few or none of
these rows get purged even with a '1 minute' interval, so something is
getting very acutely backed up.
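
To put a number on "backed up", here is a rough sketch of how I could
measure it: take two row counts of bucardo_delta some time apart and
compute the net growth rate. The two SQL strings are what I assume the
count and purge calls look like (schema-qualified names are my
assumption); Sample, growth_rate and is_backing_up are names made up
for illustration, not anything Bucardo provides.

```python
from dataclasses import dataclass

# Assumed queries, to be run by hand or from cron against the source
# database. The 'bucardo.' schema qualification is an assumption.
COUNT_SQL = "SELECT count(*) FROM bucardo.bucardo_delta"
PURGE_SQL = "SELECT bucardo.bucardo_purge_delta('1 minute'::interval)"


@dataclass
class Sample:
    seconds: float  # when the count was taken, in seconds
    rows: int       # result of COUNT_SQL at that moment


def growth_rate(first: Sample, second: Sample) -> float:
    """Net rows added per second between two samples.

    A negative value means the table is draining.
    """
    elapsed = second.seconds - first.seconds
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (second.rows - first.rows) / elapsed


def is_backing_up(first: Sample, second: Sample, threshold: float = 0.0) -> bool:
    """True if the table grows faster than `threshold` rows per second."""
    return growth_rate(first, second) > threshold
```

If the rate stays positive across several purge runs, the purge is not
keeping up with (or is not eligible to remove) the incoming rows.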

> * Minimize the number of tables in a sync. It's better in Bucardo 4
> to have many syncs with few tables rather than few syncs with many
> tables each.

I'll definitely experiment with that approach. However, in general,
we have about 50 tables, about 45 of which are very static and hardly
ever change. The sync run time for those 45 is never above 5 seconds.
But as soon as I add one--just one--of the higher-volume tables, the
run time balloons. Would it really be faster to have, say, the one
high-volume table that I'm currently using as an anchor for my
experiments in one sync, and the static tables in another?
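
Just to make the layout I'm asking about concrete, a minimal sketch of
the split: every table above some change-rate threshold gets a sync of
its own, and everything else shares one sync. The function name, the
sync names, the table names and the threshold are all hypothetical;
this only plans the grouping and does not touch Bucardo itself.

```python
from typing import Dict, List


def split_syncs(changes_per_hour: Dict[str, int],
                busy_threshold: int) -> Dict[str, List[str]]:
    """Group tables into syncs: one sync per busy table, one shared
    sync for all the static tables."""
    syncs: Dict[str, List[str]] = {"static": []}
    for table, rate in sorted(changes_per_hour.items()):
        if rate >= busy_threshold:
            # A high-volume table gets a dedicated sync.
            syncs["busy_" + table] = [table]
        else:
            syncs["static"].append(table)
    return syncs


# Hypothetical table names and rates, for illustration only.
plan = split_syncs({"cdr": 50000, "rates": 2, "users": 5},
                   busy_threshold=1000)
```

With those made-up inputs, "cdr" would land in its own sync and the
other two tables would share the "static" one.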

> * If using a pre-9 version of Postgres, make sure you are aggressively
> vacuuming the pg_listener table.
>
> * If using a pre-8.3 version of Postgres, also vacuum the pg_class
> table aggressively (and think hard about upgrading!)

I'm running 9.0.4.

Thanks for the insights!

-- 
Alex Balashov - Principal
Evariste Systems LLC
260 Peachtree Street NW
Suite 2200
Atlanta, GA 30303
Tel: +1-678-954-0670
Fax: +1-404-961-1892
Web: http://www.evaristesys.com/