[check_postgres] [commit] Add 'checkpoint' action.
Robert Treat
xzilla at users.sourceforge.net
Wed Jan 28 05:26:39 UTC 2009
On Sunday 25 January 2009 15:43:05 check_postgres at bucardo.org wrote:
> Committed by Greg Sabino Mullane <greg at endpoint.com>
>
> Add 'checkpoint' action.
>
> +sub check_checkpoint {
> +
> + ## Checks how long in seconds since the last checkpoint
> + ## Supports: Nagios, MRTG
> + ## Warning and critical are seconds
> + ## Requires $ENV{PGDATA} or --datadir
> +
> + my ($warning, $critical) = validate_range
> + ({
> + type => 'time',
> + default_warning => '120',
> + default_critical => '600',
> + forcemrtg => 1,
> + });
> +
> + ## Find the data directory, make sure it exists
> + my $dir = $opt{datadir} || $ENV{PGDATA};
> +
> + if (!defined $dir or ! length $dir) {
> + ndie "Must supply a --datadir argument or set the PGDATA environment variable\n";
> + }
> +
> + if (! -d $dir) {
> + ndie qq{Invalid data_directory: "$dir"\n};
> + }
> +
> + $db->{host} = '<none>';
> +
> + ## Run pg_controldata, grab the time
> + $COM = "pg_controldata $dir";
> + eval {
> + $res = qx{$COM 2>&1};
> + };
> + if ($@) {
> + ndie "Could not call pg_controldata: $@\n";
> + }
> +
> + if ($res !~ /Time of latest checkpoint:\s*(.+)/) {
> + ndie "Call to pg_controldata $dir failed";
> + }
> + my $last = $1;
> +
> + ## Convert to number of seconds
> + use Date::Parse;
> + my $dt = str2time($last);
> + if ($dt !~ /^\d+$/) {
> + ndie qq{Unable to parse pg_controldata output: "$last"\n};
> + }
> + my $diff = $db->{perf} = time - $dt;
> +
> + my $msg = sprintf "Last checkpoint was $diff %s ago",
> + $diff == 1 ? 'second' : 'seconds';
> +=head2 B<checkpoint>
> +
> +(C<symlink: check_postgres_checkpoint>) Determines how long since the last checkpoint has
> +been run. This must run on the same server as the database that is being checked. The
> +data directory must be set, either by the environment variable C<PGDATA>, or passing
> +the C<--datadir> argument. It returns the number of seconds since the last checkpoint
> +was run, as determined by parsing the call to C<pg_controldata>. Because of this, the
> +pg_controldata executable must be available in the current path.
> +
> +At least one warning or critical argument must be set.
> +
> +For MRTG or simple output, returns the number of seconds.
> +
Just curious, but since the check has to run on the db server anyway, why not have it
log into the db, look at checkpoint_timeout, and base the warning/critical thresholds
on that? Also, it might be worth checking whether the db is in backup mode.
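
A rough sketch of that idea, not the committed code. It assumes
checkpoint_timeout has already been fetched in seconds (for example with
SELECT setting FROM pg_settings WHERE name = 'checkpoint_timeout'). The helper
names thresholds_from_timeout and in_backup_mode, and the 1.5x and 3x factors,
are made up for illustration. The backup check assumes pg_start_backup() has
left a backup_label file in the data directory.

```perl
use strict;
use warnings;

## Hypothetical helper: derive warning/critical thresholds (in seconds)
## from checkpoint_timeout. The default factors are arbitrary choices.
sub thresholds_from_timeout {
    my ($timeout, $warn_factor, $crit_factor) = @_;
    $warn_factor = 1.5 if !defined $warn_factor;
    $crit_factor = 3   if !defined $crit_factor;

    ## Expect a positive whole number of seconds
    if (!defined $timeout or $timeout !~ /^\d+$/ or $timeout < 1) {
        die "checkpoint_timeout must be a positive integer\n";
    }

    return (int($timeout * $warn_factor), int($timeout * $crit_factor));
}

## Hypothetical helper: assumes backup mode leaves a backup_label file
## in the data directory
sub in_backup_mode {
    my ($dir) = @_;
    return (-e "$dir/backup_label") ? 1 : 0;
}

## Example with the stock checkpoint_timeout of 300 seconds
my ($warning, $critical) = thresholds_from_timeout(300);
print "warning=$warning critical=$critical\n";
```

With the defaults above this gives warning=450 and critical=900, so a single
late checkpoint would warn and three missed intervals would go critical.
Whether an explicit --warning or --critical should override the derived
values is left open.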
--
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com