[Bucardo-general] MCP PID and automatic syncs

Goran Gugic goran.gugic at gmail.com
Thu Jun 18 09:58:20 UTC 2009


On Wed, Jun 17, 2009 at 9:00 PM, Greg Sabino Mullane <greg at endpoint.com> wrote:

>
> > 2) automatic syncs (swap)
> > I have also done some work on testing the connections in a way that
> > might be interesting - using Net::Ping, the syn variant, which checks if
> > the port at dbhost:dbport is open and can be set to timeout very quickly
> > compared to the timeout of the connect_database (which for me was at
> > around 3 minutes for hosts to which the route has been broken, which is
> > the most common scenario that I am trying to address).
>
> This might be better addressed by setting the environment variable
> PGCONNECT_TIMEOUT, which should in theory work with connect_database
> (via DBD::Pg, via libpq).


Thanks, I will try this.
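
For anyone following along, here is a minimal sketch of the PGCONNECT_TIMEOUT idea in plain Perl. The helper name with_connect_timeout is hypothetical, and the 5-second value and the dev1 host in the usage comment are only examples. For Bucardo itself the variable would presumably have to be set in the environment of the process that launches bucardo_ctl; I have not verified that.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code block with PGCONNECT_TIMEOUT set, and
# restore the previous value (or absence) of the variable afterwards.
# libpq reads this variable when it opens a connection.
sub with_connect_timeout {
    my ($seconds, $code) = @_;
    local $ENV{PGCONNECT_TIMEOUT} = $seconds;
    return $code->();
}

# Assumed usage (needs DBI and DBD::Pg, so it is left commented out):
# my $dbh = with_connect_timeout(5, sub {
#     DBI->connect('dbi:Pg:dbname=bucardo;host=dev1', '', '',
#                  { RaiseError => 1, AutoCommit => 0 });
# });
```

Using local keeps the setting scoped to the one connection attempt, so other code in the same process is not affected.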


> > My goal here is to have the syncs go up and down automatically depending
> > on the availability of the connections (so that one dropped connection
> > does not ruin a day for all the others)
>
> I'm working on something sort of similar, in that you can set a sync to
> tolerate a disconnected host, such that (for example) someone using a
> master to many slaves configuration would not have to do anything
> special if one of the slaves went down - Bucardo would mark that
> targetdb as inactive and keep going with the others. Then one could
> restart that sync once the targetdb was up again, perhaps with a
> onetimecopy=2 as needed.


That is great and I bet that it will work for what I am trying to do, but in
a proper way (my script is a bloat - an unnecessary layer, controlling the
bucardo_ctl which controls... etc...).

Just a note here - (as far as I understand the results of my own tests) for
(multimaster) swap syncs you don't have to do anything special to bring them
up either (due to deltas): just setting status active and telling MCP to
activate, then possibly kick, should work. Am I right?
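
To make that sequence concrete, here is a rough sketch of the "activate, then possibly kick" step. It assumes the status column has already been set to 'active' in the sync table, and that bucardo_ctl accepts "activate" and "kick" followed by a sync name, as I understand it from my tests. The function name reactivation_commands is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build the bucardo_ctl invocations for bringing
# syncs back up. Each command is an array ref suitable for system()
# in list form, one sync per call.
sub reactivation_commands {
    my ($kick, @syncs) = @_;
    my @cmds;
    for my $sync (@syncs) {
        push @cmds, [ 'bucardo_ctl', 'activate', $sync ];
        push @cmds, [ 'bucardo_ctl', 'kick', $sync ] if $kick;
    }
    return @cmds;
}

# Assumed usage (left commented out because it needs a running Bucardo):
# for my $cmd (reactivation_commands(1, 'sync1', 'sync2')) {
#     system(@$cmd) == 0 or warn "Command failed: @$cmd\n";
# }
```

Passing each sync name as its own argument avoids handing a space-joined string to system() as a single argument.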


>
> > This seems to work now and I am wondering - should I leave it to work
> > around bucardo_ctl or should I try to integrate with bucardo?
>
> Patches welcome: it's not clear how much overlap there is between my
> idea and yours. In general though, such logic should go into bucardo
> itself, and be controlled by config variables and/or a column in the
> sync table.


Yes, my thoughts exactly. Well, since you are working on it and I do have a
working hack that'll serve me for now, I prefer to wait :)

Here is the script I use now. As you'll see my Perl is weak and the thing
has not been cleaned up (first iteration of working code). Naturally, I
called it bucardo_super_ctl :)) (super stands for - superfluous).

Once again - this was done for multimaster swap syncs with a goal to work
nicely in the environment of unreliable connections. It has not been tested
with any other types of syncs, nor with any other kinds of problems.

PS I would like to discourage any casual reader from actually using this; it
is only posted here to convey the idea and the experience more precisely.

------------------------>
#!/usr/bin/perl -w -- -*-cperl-*-

use strict;
use DBI qw(:sql_types);
use Net::Ping;

use Proc::Daemon;
#use Proc::PID::File;

my $QUIET    = 0;

MAIN:
{
    # Daemonize
    Proc::Daemon::Init();

    # If already running, then exit
#    if (Proc::PID::File->running()) {
#        exit(0);
#    }

    # Main loop
    for (;;) {

        &main_worker;
        sleep(15);

    }
}

############################################
# actual fun
sub main_worker
{

my $dbh = DBI->connect( 'dbi:Pg:dbname=bucardo',
                        '',
                        '',
                        {
                          RaiseError => 1,
                          AutoCommit => 0
                        }
                      ) || die "Database connection not made: $DBI::errstr";

# @managed_dbs is not strictly necessary: @managed_syncs (see later) is enough
# to get dbhost and dbport by joining bucardo.db and bucardo.sync
my @managed_dbs = ( "dev1", "dev2", "dev3" );

my $names = "'" . join("', '", @managed_dbs) . "'";
my $sth = $dbh->prepare( "
SELECT dbhost, dbport, dbuser, dbpass, name
FROM bucardo.db
WHERE name IN ($names);
") || die "Cannot prepare statement: $DBI::errstr\n";

$sth->execute;
my $dbdata = $sth->fetchall_arrayref({ dbhost=>1, dbport=>1, dbuser=>1,
dbpass=>1, name=>1 });
$QUIET or print "PINGING...\n";

my $p = Net::Ping->new("syn");
foreach my $r (@{$dbdata}) {
    $| = 1;
    $QUIET or print "Trying $r->{'dbhost'}:$r->{'dbport'} ";
    $p->port_number($r->{'dbport'});
    $p->ping($r->{'dbhost'},3);
    if ( my( $host, $rtt, $ip) = $p->ack($r->{'dbhost'}) ) {
        $QUIET or print "OK rtt: $rtt, ip: $ip\n";
        $r->{'rtt'} = $rtt;
    } else {
        $QUIET or print "FAIL did not respond\n";
    }

}

#    now we know connections are up
#    read the sync table from the database and we will
#    know which should be up and which should be down
#   compose lists of syncs to inactivate and to reactivate

# the list of managed syncs should be a new field in bucardo.sync
my @managed_syncs = ( "sync1", "sync2", "sync3" );
my $syncs = "'" . join("', '", @managed_syncs) . "'";

$sth = $dbh->prepare( "
SELECT name, targetdb, status
FROM bucardo.sync
WHERE name IN ($syncs);
")    || die "Cannot prepare statement: $DBI::errstr\n";

$sth->execute;
my $syncdata = $sth->fetchall_arrayref({ name=>1, targetdb=>1, status=>1 });

# build an idx for the dbdata that gives ping bool for name
my %dbdata_idx = map { $_->{'name'} => defined( $_->{'rtt'} ) } @{$dbdata};

my @reactivate; # list of syncs that'll need to be set to 'active'
my @inactivate; # list of syncs that'll need to be set to 'inactive'
my $rows_affected;

foreach my $sync (@{$syncdata}) {

    if ($sync->{'status'} eq 'inactive') {
        $QUIET or print "$sync->{'name'} is marked inactive in db, ";
        if ( $dbdata_idx{$sync->{'targetdb'}} ) {
            $QUIET or print "but is reachable: needs to be reactivated.\n";
            push(@reactivate, $sync->{'name'});
        } else {
            $QUIET or print "and not reachable - no action needed.\n";
        }
    }

    if ($sync->{'status'} eq 'active') {
        $QUIET or print "$sync->{'name'} is marked active in db, ";
        unless ( $dbdata_idx{$sync->{'targetdb'}} ) {
            $QUIET or print "but did not reply: need to turn it off.\n";
            push(@inactivate, $sync->{'name'});
        } else {
            $QUIET or print "no action needed, supposed to be humming.\n";
        }
    }

}

# restart it only if there is something to do
# an improvement would be to stop only when a sync needs inactivating
# and not stop when syncs only need reactivating
if ( (0 + @inactivate + @reactivate) > 0 ) {

    # first stop it (I could check whether it is active first)
    system( "bucardo_ctl", "stop", "super controller stop" );

    # give it some time to process the stopfile
    # this is not going to be enough if it is timing out
    # i might need to wait for it and check it via ps
    # or bucardo_ctl ping result
    sleep(10);

    # reactivate those that need reactivating
    my $reactivate_names = "'" . join("', '", @reactivate) . "'";
    $sth = $dbh->prepare( "
    UPDATE bucardo.sync
    SET status = 'active'
    WHERE name IN ( $reactivate_names )
    ") || die "Cannot prepare statement: $DBI::errstr\n";
    $rows_affected = $sth->execute            or die $dbh->errstr;
    $QUIET or print "Reactivated $reactivate_names: $rows_affected\n";

    # inactivate those that need inactivating
    my $inactivate_names = "'" . join("', '", @inactivate) . "'";
    $sth = $dbh->prepare( "
    UPDATE bucardo.sync
    SET status = 'inactive'
    WHERE name IN ( $inactivate_names )
    ") || die "Cannot prepare statement: $DBI::errstr\n";
    $rows_affected = $sth->execute            or die $dbh->errstr;
    $QUIET or print "Inactivated $inactivate_names: $rows_affected\n";

    # make changes
    $dbh->commit;

    # if bucardo MCP was up and responsive I could probably just inactivate
    # and reactivate directly without bringing it down,
    # but currently if there is a problem with a connection
    # it is either respawning (is updating the sync table a problem in
    # this case?) or waiting for the db connection to time out, after which
    # it'll die, so there is no point in trying to communicate with it
    #
    # now I should also activate the reactivated ones
    #if ( @reactivate ) {
    #        system( "bucardo_ctl", "activate", join(" ", @reactivate) );
    #}

    # and deactivate the inactivated ones (just in case)
    #if ( @inactivate ) {
    #        system( "bucardo_ctl", "deactivate", join(" ", @inactivate) );
    #}

    system( "bucardo_ctl", "start", "super controller start" );

}

$sth->finish;
$dbh->disconnect;

}
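
The decision step in the middle of the script (which syncs to reactivate and which to inactivate) can be pulled out into a small pure function, roughly as sketched below. This is only an illustration of the same logic, not a tested replacement. The name plan_sync_changes and the shape of its arguments are my own assumptions, modelled on the hashes the script already builds.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which syncs need a status change.
#   $syncs     - array ref of hash refs with keys name, targetdb, status
#   $reachable - hash ref mapping db name => true if its port answered
# Returns two array refs: names to reactivate, names to inactivate.
sub plan_sync_changes {
    my ($syncs, $reachable) = @_;
    my (@reactivate, @inactivate);
    for my $sync (@$syncs) {
        my $up = $reachable->{ $sync->{targetdb} } ? 1 : 0;
        if ($sync->{status} eq 'inactive' and $up) {
            push @reactivate, $sync->{name};   # target is back: turn on
        }
        elsif ($sync->{status} eq 'active' and not $up) {
            push @inactivate, $sync->{name};   # target is gone: turn off
        }
    }
    return (\@reactivate, \@inactivate);
}

# Example with made-up data:
my ($on, $off) = plan_sync_changes(
    [
        { name => 'sync1', targetdb => 'dev1', status => 'inactive' },
        { name => 'sync2', targetdb => 'dev2', status => 'active'   },
        { name => 'sync3', targetdb => 'dev3', status => 'active'   },
    ],
    { dev1 => 1, dev2 => 0, dev3 => 1 },
);
print "reactivate: @$on\n";
print "inactivate: @$off\n";
```

Keeping this part free of database and network calls means it can be exercised with sample data before it is wired up to bucardo_ctl.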