[check_postgres] patch for slony_status check

Tue Feb 18 00:10:56 UTC 2014

The slony_status check appears to consider only the first row in the result
set.  If you have more than one node in replication, you probably want to
report on the most lagged node, so I think ordering by the lag time,
descending, is the best way to go.

Here's a little patch:

diff --git a/check_postgres.pl b/check_postgres.pl
index fae344f..73a99c1 100755
--- a/check_postgres.pl
+++ b/check_postgres.pl
@@ -7421,7 +7421,8 @@ q{SELECT
  COALESCE(n2.no_comment, '') AS com2
 FROM SCHEMA.sl_status
 JOIN SCHEMA.sl_node n1 ON (n1.no_id=st_origin)
-JOIN SCHEMA.sl_node n2 ON (n2.no_id=st_received)};
+JOIN SCHEMA.sl_node n2 ON (n2.no_id=st_received)
+ORDER BY 1 DESC};
 
     my $maxlagtime = -1;


It's possible my Perl isn't up to snuff and there's some magic happening, but
it looks like the check used to loop over the rows returned from the query
before commit 0e9973e31628b33feef8fcd829123c455743f958, and now it doesn't.

I'm guessing that this:

            $maxlagtime = $lag if $lag > $maxlagtime;

is a remnant of the old loop, but just adding an ORDER BY is probably sufficient.
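
For comparison, here is a minimal standalone sketch of what "report the most
lagged node" means if you loop over the rows instead of sorting in SQL.  This
is not the check_postgres code: the @rows data and the most_lagged() helper
are made up for illustration, and only the column names (lagtime, st_origin,
st_received) are borrowed from the query.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the rows the slony_status query returns;
# the real check reads them from its own result structure.
my @rows = (
    { lagtime => 12,  st_origin => 1, st_received => 2 },
    { lagtime => 340, st_origin => 1, st_received => 3 },
    { lagtime => 45,  st_origin => 1, st_received => 4 },
);

# Return the row with the largest lagtime, or undef for an empty list.
sub most_lagged {
    my @candidates = @_;
    my $worst;
    for my $row (@candidates) {
        $worst = $row
            if !defined $worst || $row->{lagtime} > $worst->{lagtime};
    }
    return $worst;
}

my $worst      = most_lagged(@rows);
my $maxlagtime = defined $worst ? $worst->{lagtime} : -1;
```

Either approach should pick the same row; the ORDER BY in the patch just
pushes the work into the query so the first row is already the worst one.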


-- 
Jeff Frost <jeff at pgexperts.com>
CTO, PostgreSQL Experts, Inc.
Phone: 1-888-PG-EXPRT x506
FAX: 415-762-5122
http://www.pgexperts.com/