[check_postgres] [commit] Add 'checkpoint' action.

Robert Treat xzilla at users.sourceforge.net
Wed Jan 28 05:26:39 UTC 2009


On Sunday 25 January 2009 15:43:05 check_postgres at bucardo.org wrote:
> Committed by Greg Sabino Mullane <greg at endpoint.com>
>
> Add 'checkpoint' action.
>
> +sub check_checkpoint {
> +
> +	## Checks how long in seconds since the last checkpoint
> +	## Supports: Nagios, MRTG
> +	## Warning and critical are seconds
> +	## Requires $ENV{PGDATA} or --datadir
> +
> +	my ($warning, $critical) = validate_range
> +		({
> +		  type              => 'time',
> +		  default_warning   => '120',
> +		  default_critical  => '600',
> +		  forcemrtg         => 1,
> +	  });
> +
> +	## Find the data directory, make sure it exists
> +	my $dir = $opt{datadir} || $ENV{PGDATA};
> +
> +	if (!defined $dir or ! length $dir) {
> +		ndie "Must supply a --datadir argument or set the PGDATA environment variable\n";
> +	}
> +
> +	if (! -d $dir) {
> +		ndie qq{Invalid data_directory: "$dir"\n};
> +	}
> +
> +	$db->{host} = '<none>';
> +
> +	## Run pg_controldata, grab the time
> +	$COM = "pg_controldata $dir";
> +	eval {
> +		$res = qx{$COM 2>&1};
> +	};
> +	if ($@) {
> +		ndie "Could not call pg_controldata: $@\n";
> +	}
> +
> +	if ($res !~ /Time of latest checkpoint:\s*(.+)/) {
> +		ndie "Call to pg_controldata $dir failed";
> +	}
> +	my $last = $1;
> +
> +	## Convert to number of seconds
> +	use Date::Parse;
> +	my $dt = str2time($last);
> +	if ($dt !~ /^\d+$/) {
> +		ndie qq{Unable to parse pg_controldata output: "$last"\n};
> +	}
> +	my $diff = $db->{perf} = time - $dt;
> +
> +	my $msg = sprintf "Last checkpoint was $diff %s ago",
> +		$diff == 1 ? 'second' : 'seconds';

> +=head2 B<checkpoint>
> +
> +(C<symlink: check_postgres_checkpoint>) Determines how long since the last checkpoint has
> +been run. This must run on the same server as the database that is being checked. The
> +data directory must be set, either by the environment variable C<PGDATA>, or passing
> +the C<--datadir> argument. It returns the number of seconds since the last checkpoint
> +was run, as determined by parsing the call to C<pg_controldata>. Because of this, the
> +pg_controldata executable must be available in the current path.
> +
> +At least one warning or critical argument must be set.
> +
> +For MRTG or simple output, returns the number of seconds.
> +

Just curious, but since the check has to run on the db server, why not have it
log into the db to look at checkpoint_timeout, and then base the warning/critical
thresholds on that? Also, it might be worth checking whether the db is in backup mode.
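
To make the checkpoint_timeout idea concrete, here is a rough sketch, not a patch against check_postgres. Both function names are hypothetical, and the 1.5x and 3x multipliers are assumptions picked only for illustration. It relies on pg_settings reporting checkpoint_timeout in seconds, and assumes psql is in the path and can connect without prompting.

```perl
use strict;
use warnings;

## Hypothetical helper: ask the server for checkpoint_timeout.
## Assumes psql is in the path and can connect without a password.
## pg_settings reports this setting as a plain number of seconds.
sub fetch_checkpoint_timeout {
    my $sql = q{SELECT setting FROM pg_settings WHERE name = 'checkpoint_timeout'};
    my $out = qx{psql -At -c "$sql" 2>&1};
    die "Could not call psql: $out\n" if $? != 0;
    chomp $out;
    return $out;
}

## Hypothetical helper: derive warning and critical thresholds (seconds)
## from checkpoint_timeout. The default multipliers are assumptions.
sub thresholds_from_timeout {
    my ($timeout, $warn_factor, $crit_factor) = @_;
    $warn_factor = 1.5 if !defined $warn_factor;
    $crit_factor = 3   if !defined $crit_factor;

    ## Reject anything that is not a positive whole number of seconds
    my $shown = defined $timeout ? $timeout : '(undef)';
    die qq{Invalid checkpoint_timeout: "$shown"\n}
        if $shown !~ /^\d+$/ or $shown < 1;

    return (int($timeout * $warn_factor), int($timeout * $crit_factor));
}

## With the default checkpoint_timeout of 300 seconds (5 minutes):
my ($warning, $critical) = thresholds_from_timeout(300);
print "warning=$warning critical=$critical\n";   ## warning=450 critical=900
```

In the real check, the hard-coded 300 would be replaced by the value from fetch_checkpoint_timeout(), and explicit --warning or --critical arguments would presumably still override the derived values.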

-- 
Robert Treat
Conjecture: http://www.xzilla.net
Consulting: http://www.omniti.com

