[Bucardo-general] New to Bucardo - Issues with Large Tables

Michelle Sullivan michelle at sorbs.net
Sat Jan 22 05:58:50 UTC 2022


If you can dump/restore the tables, it is often quicker than onetimecopy. Onetimecopy updates the indexes as it inserts the rows (in COPYed chunks), hence the speed difference. For tables of this size I personally don't use the onetimecopy approach, as it runs into locking and timeout issues.

IIRC, if onetimecopy is your only option, check the Bucardo config for timeout settings, and check the logs to confirm it really is a timeout and not a deadlock, or a stateful device (a firewall) in between dropping the connection due to inactivity (e.g. while indexes are building).
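
As a rough sketch of the dump/restore route (the host names, database name, and job count here are placeholder assumptions, not anything from this thread):

    # Parallel dump in directory format (-Fd is required for -j)
    pg_dump -Fd -j 8 -f /tmp/bigdb.dump \
        "host=source.example.com dbname=bigdb"

    # Parallel restore: table data is COPYed in first and the indexes
    # are built afterwards, which is why this tends to beat inserting
    # row chunks into live indexes
    pg_restore -j 8 \
        -d "host=target.example.com dbname=bigdb" /tmp/bigdb.dump

One common pattern is to bulk-load like this first and then let Bucardo replicate only the changes that accrued during the copy.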
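
If you are stuck with onetimecopy, a few places to look (a sketch only; the values are illustrative, and on Aurora/RDS the server-side settings live in the DB parameter group rather than going through ALTER SYSTEM):

    # Bucardo side: list the current config and scan for timeout and
    # keepalive knobs
    bucardo show all | grep -i -e time -e keepalive

    # PostgreSQL side, on both source and target: 0 means "no limit";
    # a non-zero statement_timeout can kill a multi-hour COPY
    psql -c "SHOW statement_timeout;"
    psql -c "SHOW idle_in_transaction_session_timeout;"

    # Server-side TCP keepalives (in seconds; example values) help keep
    # the connection alive through stateful firewalls while indexes build
    psql -c "ALTER SYSTEM SET tcp_keepalives_idle = 60;"
    psql -c "ALTER SYSTEM SET tcp_keepalives_interval = 60;"
    psql -c "SELECT pg_reload_conf();"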

Regards,

Michelle Sullivan
http://www.mhix.org/
You’re very neat about the feet.

> On 22 Jan 2022, at 10:13, Wayne Taylor <wtaylor at g2.com> wrote:
> 
> Hi Everyone,
> 
> I am new to the community and love the tooling. I have some questions regarding issues we've faced with larger databases.
> 
> As some background:
> We are in the process of migrating from Heroku to AWS Aurora RDS. We initially tried DMS but ran into issues. We then took the WAL-E approach; whilst this worked, it introduced some unnecessary risks for us, such as a two-step migration. Hence our introduction to Bucardo.
> 
> We have successfully migrated several databases, but recently hit issues with larger ones, e.g. one database in particular has 5 tables consuming 750 GB in total, with the biggest table over 350 GB. Whilst the tasks run, we find that after 24 hrs (not sure why it takes so long) the tasks time out and restart.
> 
> We run Bucardo on an EC2 instance with Postgres installed (just an empty shell database).
> 
> Mitigations we are taking right now:
> We have some apps that are fully backed by Kafka, so we can take them offline and recover later. For now, a pg_dump and pg_restore with parallel jobs allows us to achieve a full restore in approx 7 hrs.
> 
> My hopes:
> Determine a way to get Bucardo to run the migration for larger tables without hitting timeouts - ideally parallelised to reduce the run time, or with settings that prevent the timeout from occurring.
> 
> Thank you, community - happy to provide more background.
> 
> Best,
> Wayne
> 
> _______________________________________________
> Bucardo-general mailing list
> Bucardo-general at bucardo.org
> https://bucardo.org/mailman/listinfo/bucardo-general

