[Bucardo-general] bucardo error with bulk copy

Tue Nov 13 04:41:03 UTC 2018

Hello David,

Thank you so much for your reply. Below are the tcp_keepalive settings for
the servers involved.

*Instance where Bucardo is running:-*

tcp_keepalive_time=60

tcp_keepalive_probes=3

tcp_keepalive_intvl=10

With this, TCP keepalive starts after 60 seconds of idle time and sends
3 probes, with a 10-second gap between each.

*Destination Database (RDS Instance):-*

tcp_keepalive_time=200

tcp_keepalive_probes=5

tcp_keepalive_intvl=200

*My source DB has this:-*

net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 1500
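
As a rough check of what these values imply, here is a minimal sketch in
Python. It assumes the usual Linux keepalive semantics for all three hosts
(the first probe is sent after tcp_keepalive_time seconds of idle time, and
the peer is declared dead after tcp_keepalive_probes unanswered probes sent
tcp_keepalive_intvl seconds apart). The function name and host labels are
only illustrative, and the RDS values may be interpreted differently by
PostgreSQL itself.

```python
def keepalive_timings(time_s, probes, intvl_s):
    """Return (idle seconds before the first probe,
    approximate seconds of silence before the peer is declared dead)."""
    return time_s, time_s + probes * intvl_s


# (tcp_keepalive_time, tcp_keepalive_probes, tcp_keepalive_intvl) as listed above.
# The host labels are just names chosen for this sketch.
hosts = {
    "bucardo": (60, 3, 10),
    "destination (RDS)": (200, 5, 200),
    "source": (1500, 9, 75),
}

for name, (time_s, probes, intvl_s) in hosts.items():
    first_probe, dead_after = keepalive_timings(time_s, probes, intvl_s)
    print(f"{name}: first probe after {first_probe}s idle, "
          f"dead peer detected after about {dead_after}s")
```

Under these assumptions the Bucardo host starts probing after 60 seconds of
idle time, while the source DB would stay silent for 1500 seconds before its
first probe.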

On Tue, Nov 13, 2018 at 2:57 AM David Christensen <david at endpoint.com>
wrote:

> > Dear Team,
> >
> > I am stuck with an error in Bucardo and require some help to proceed
> > further.
> >
> > I have a Postgres database which I am trying to migrate. To keep the
> > downtime to a minimum, I am using the bulk copy mode, with "onetimecopy"
> > set to 2, so that once the bulk copy is completed, it moves on to
> > syncing the delta.
> >
> > My problem is that a few of the tables in my DB are big. I tried
> > migrating the database without the big tables and it worked fine.
> > However, when I include the bigger tables, Bucardo tries to migrate them
> > as well (one of these tables alone takes 6 hours for Bucardo to migrate).
> >
> > I see from the logs that it successfully copied the data from that
> > table, and then Bucardo fails with the error below.
> >
> >  Kid has died, error is: DBD::Pg::st execute failed: SSL SYSCALL error:
> > EOF detected at Bucardo.pm line 4985. Line: 5453 Main DB state: 08000
> > Error: 7 DB dest_db state: ? Error: none DB source_db state: ? Error: none
> >
> > I am running Bucardo on an AWS memory-optimized R4 24X large instance,
> > and the whole Bucardo process is not even using 5% of the system, so I
> > know it's not a memory issue.
> >
> > Any directions on this will be really appreciated.
>
> Hi Vineeth,
>
> What are your tcp_keepalive settings for this server?  It’s possible it’s
> silently breaking the connection due to perceived inactivity whilst
> building indexes or some other long-running process that isn’t transmitting
> data while the connection is open.  I’ve seen adjusting these so the
> network knows the connection is alive help with some issues like this.
>
> Best,
>
> David
> --
> David Christensen
> End Point Corporation
> david at endpoint.com
> 785-727-1171