[Bucardo-general] Test Failures: Serialized Isolation

David E. Wheeler david at justatheory.com
Wed Oct 24 16:05:49 UTC 2012


On Oct 23, 2012, at 8:37 PM, Greg Sabino Mullane <greg at endpoint.com> wrote:

> So, because we are talking very asynchronously, I want to put the 
> proposal out there clearly for everyone. Because a serialization 
> error is a known (and frankly expected) event on busy systems, we 
> should treat that as very different from all other errors that a 
> KID may encounter. Specifically, we need to try again, without 
> reporting a serious problem back to the client via listen/notify.

+1 This seems like the sane thing to do.

> We should continue the sleep setting to be sure, but should it give up 
> after X tries? Slowly increment the sleep over time? I'm strongly 
> inclined to do neither of those, but thought I should throw it 
> out there.

I would:

* Retry immediately with no sleep
* If that fails, sleep for 1s and retry again
* Keep repeating every 1s
* Die after 10 or 50 tries, or some such limit.
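
A minimal sketch of that retry policy, written in Python rather than Bucardo's Perl purely for illustration. Every name here (SerializationFailure, run_with_retry, max_tries, delay) is hypothetical and not part of Bucardo; the limit of 10 is just one of the values floated above.

```python
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a database error with SQLSTATE 40001."""


def run_with_retry(sync, max_tries=10, delay=1.0, sleep=time.sleep):
    """Call sync() until it succeeds or max_tries attempts have failed.

    The first retry happens immediately; every later retry waits
    `delay` seconds first. Only serialization failures are retried;
    any other exception propagates at once.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return sync()
        except SerializationFailure:
            if attempt == max_tries:
                raise  # out of tries: let the caller report a real failure
            if attempt > 1:
                sleep(delay)  # no sleep before the first retry
```

The sleep function is passed in so that a test can record the delays instead of actually waiting.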

> In addition to trying again (whether via cleanup and a goto KID, 
> or asking the controller to start up a new kid), we should have a 
> new notify that is fired to let listeners know that yeah, the sync 
> failed, but it's only a serialization error and we will try again. 
> The payload should tell how long we are sleeping, and perhaps some 
> other information (e.g. which table it was on when this occurred).
> By "listener" I basically mean the bucardo program.

+1 This makes sense. Also, the --retry option should no longer be needed, right?

> Of course, it would be nice to find a good way to cause serialization 
> errors on demand for the test suite; I seem to recall trying to do 
> so once and failing, but I'm sure it is possible somehow.

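We could add a test to run only on 8.4+ that mimics it using PL/pgSQL (RAISE ... USING needs 8.4; the DO block shown here needs 9.0, so on 8.4 the RAISE would have to live in a function):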

david=# \set VERBOSITY verbose
david=# DO $$BEGIN RAISE EXCEPTION 'Serialization error'
        USING ERRCODE = 'serialization_failure'; END $$;
ERROR:  40001: Serialization error
LOCATION:  exec_stmt_raise, pl_exec.c:2840
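
The point of raising it this way is that the error carries SQLSTATE 40001, the same code a real serialization failure reports, so anything that keys on the SQLSTATE rather than the message text cannot tell the two apart. A rough sketch of that check, again in Python for illustration only; the classify function and its return values are hypothetical, not Bucardo code:

```python
SERIALIZATION_FAILURE = "40001"  # SQLSTATE for serialization_failure


def classify(sqlstate):
    """Return 'retry' for a serialization failure, 'fatal' for anything else."""
    return "retry" if sqlstate == SERIALIZATION_FAILURE else "fatal"
```

With this, classify("40001") yields "retry", while any other code, such as 23505 (unique_violation), yields "fatal".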

Best,

David


