[check_postgres] Detection of paused PgBouncer

Guillaume Lelarge guillaume at lelarge.info
Sun Apr 14 08:58:27 UTC 2013


On Thu, 2013-03-28 at 20:08 +0100, Cyril Bouthors wrote:
> Hi,
> 
> First of all, thank you for the excellent quality of check-postgres and for
> the time it saved us when configuring Nagios to check both PostgreSQL and
> PgBouncer.
> 
> We had a major downtime today on one of our live PostgreSQL clusters because a
> maintenance script failed and paused PgBouncer [1] without resuming it [2].
> 
> PgBouncer is checked with Nagios by the following check-postgres scripts:
> 
>  - pgbouncer_backends
>  - pgb_pool_maxwait
>  - pgb_pool_cl_waiting
>  - pgb_pool_sv_active
> 
> Unfortunately, none of them detected that PgBouncer was unable to handle queries
> because it was paused.
> 
> Is there any particular check-postgres script that we can use to make sure that
> this can be detected earlier?
> 
> If so, which script?
> 
> If not, would it be possible to check this? Maybe pgbouncer_backends should
> check that?
> 

If you pause a database, all connections to this database will be
disconnected. So you should have seen a drop in the value reported by
pgbouncer_backends. Did you?
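
As a rough illustration of the drop mentioned above, here is a minimal
Python sketch that compares a current backend count with a baseline and
maps the result to Nagios-style exit codes. The function name, the
thresholds and the idea of keeping a baseline are all assumptions made
for this example; none of them are check_postgres options or behaviour.

```python
# Nagios plugin exit codes
OK, WARNING, CRITICAL = 0, 1, 2


def check_backend_drop(baseline, current, warn_ratio=0.5, crit_ratio=0.1):
    """Flag a sudden drop in the number of backends.

    Hypothetical helper: `baseline` is a count recorded while the pooler
    was known to be healthy, `current` is the latest count. The ratios
    are arbitrary illustrative thresholds.
    """
    if baseline <= 0:
        # Nothing meaningful to compare against.
        return OK, "no baseline to compare against"

    ratio = current / baseline
    if ratio <= crit_ratio:
        return CRITICAL, f"backends dropped from {baseline} to {current}"
    if ratio <= warn_ratio:
        return WARNING, f"backends dropped from {baseline} to {current}"
    return OK, f"{current} backends (baseline {baseline})"


if __name__ == "__main__":
    code, message = check_backend_drop(baseline=40, current=0)
    print(code, message)
```

A drop to zero backends on a normally busy pool would then raise a
critical alert. A quiet period with few clients could trigger it as
well, so the thresholds would need tuning per installation.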

Moreover, AFAICT, there's nothing in the PgBouncer catalogs that could
tell us that a database has been paused. Kinda weird, though.


-- 
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


