PostgreSQL by Zabbix agent

Macros used

Name | Value
{$PG.CACHE_HITRATIO.MIN.WARN} | 90
{$PG.CHECKPOINTS_REQ.MAX.WARN} | 5
{$PG.CONFLICTS.MAX.WARN} | 0
{$PG.CONN_IDLE_IN_TRANS.MAX.WARN} | 5
{$PG.CONN_TOTAL_PCT.MAX.WARN} | 90
{$PG.CONN_WAIT.MAX.WARN} | 0
{$PG.DB} | postgres
{$PG.DEADLOCKS.MAX.WARN} | 0
{$PG.FROZENXID_PCT_STOP.MIN.HIGH} | 75
{$PG.HOST} | 127.0.0.1
{$PG.LLD.FILTER.DBNAME} | (.*)
{$PG.LOCKS.MAX.WARN} | 100
{$PG.PASSWORD} | -
{$PG.PING_TIME.MAX.WARN} | 1s
{$PG.PORT} | 5432
{$PG.QUERY_ETIME.MAX.WARN} | 30
{$PG.REPL_LAG.MAX.WARN} | 10m
{$PG.SLOW_QUERIES.MAX.WARN} | 5
{$PG.TRANS_ACTIVE.MAX.WARN} | 30s
{$PG.TRANS_IDLE.MAX.WARN} | 30s
{$PG.TRANS_WAIT.MAX.WARN} | 30s
{$PG.USER} | zbx_monitor
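
The connection macros above appear as parameters in most item keys of this template. As a rough illustration of how such a key resolves, here is a minimal Python sketch. The `expand_macros` helper is hypothetical and only mimics the substitution; it is not part of Zabbix. `{$PG.PASSWORD}` is left out because the table gives no value for it, so in this sketch it stays unexpanded.

```python
import re

# Default values copied from the macro table; {$PG.PASSWORD} has no
# documented default, so it is intentionally omitted here.
DEFAULT_MACROS = {
    "{$PG.HOST}": "127.0.0.1",
    "{$PG.PORT}": "5432",
    "{$PG.USER}": "zbx_monitor",
    "{$PG.DB}": "postgres",
}

# Matches simple user macros such as {$PG.HOST} (no context part).
MACRO_RE = re.compile(r"\{\$[A-Z0-9_.]+\}")


def expand_macros(key, macros):
    """Replace every known {$MACRO} in key; unknown macros are left as-is."""
    return MACRO_RE.sub(lambda m: macros.get(m.group(0), m.group(0)), key)


print(expand_macros('pgsql.ping["{$PG.HOST}","{$PG.PORT}"]', DEFAULT_MACROS))
# prints: pgsql.ping["127.0.0.1","5432"]
```

Overriding a macro on a host then amounts to changing one dictionary entry, for example setting `{$PG.PORT}` to a non-default port.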

Items collected

Name | Description | Type | Interval | Key and additional info
Bgwriter: Buffers allocated per second | Number of buffers allocated | DEPENDENT | - | pgsql.bgwriter.buffers_alloc.rate
Bgwriter: Buffers written directly by a backend per second | Number of buffers written directly by a backend | DEPENDENT | - | pgsql.bgwriter.buffers_backend.rate
Bgwriter: Buffers backend fsync per second | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) | DEPENDENT | - | pgsql.bgwriter.buffers_backend_fsync.rate
Bgwriter: Buffers written during checkpoints per second | Number of buffers written during checkpoints | DEPENDENT | - | pgsql.bgwriter.buffers_checkpoint.rate
Bgwriter: Buffers written by the background writer per second | Number of buffers written by the background writer | DEPENDENT | - | pgsql.bgwriter.buffers_clean.rate
Bgwriter: Requested checkpoints per second | Number of requested checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_req.rate
Bgwriter: Scheduled checkpoints per second | Number of scheduled checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_timed.rate
Bgwriter: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk | DEPENDENT | - | pgsql.bgwriter.checkpoint_sync_time
Bgwriter: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds | DEPENDENT | - | pgsql.bgwriter.checkpoint_write_time
Bgwriter: Max written per second | Number of times the background writer stopped a cleaning scan because it had written too many buffers | DEPENDENT | - | pgsql.bgwriter.maxwritten_clean.rate
PostgreSQL: Get bgwriter | Statistics about the background writer process's activity | - | - | pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Cache hit ratio % | Cache hit ratio | - | - | pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Config hash | PostgreSQL configuration hash | - | 15m | pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Connections sum: Active | Total number of connections executing a query | DEPENDENT | - | pgsql.connections.sum.active
Connections sum: Idle | Total number of connections waiting for a new client command | DEPENDENT | - | pgsql.connections.sum.idle
Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query | DEPENDENT | - | pgsql.connections.sum.idle_in_transaction
Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html | DEPENDENT | - | pgsql.connections.sum.prepared
Connections sum: Total | Total number of connections | DEPENDENT | - | pgsql.connections.sum.total
Connections sum: Total % | Total number of connections in percentage | DEPENDENT | - | pgsql.connections.sum.total_pct
Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE | DEPENDENT | - | pgsql.connections.sum.waiting
PostgreSQL: Get connections sum | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW | - | - | pgsql.connections.sum["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW | - | - | pgsql.dbstat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES | - | - | pgsql.locks["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Ping time | - | - | - | pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Ping | - | - | - | pgsql.ping["{$PG.HOST}","{$PG.PORT}"]
PostgreSQL: Get queries | Collect all metrics by query execution time | - | - | pgsql.queries["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{$PG.QUERY_ETIME.MAX.WARN}"]
Replication: standby count | Number of standby servers | - | - | pgsql.replication.count["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Replication: lag in seconds | Replication lag with Master in seconds | - | - | pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Replication: recovery role | Replication role: 1 = recovery is still in progress (standby mode), 0 = master mode. | - | - | pgsql.replication.recovery_role["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Replication: status | Replication status: 0 = streaming is down, 1 = streaming is up, 2 = master mode | - | - | pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Transactions: Max active transaction time | Current max active transaction time | DEPENDENT | - | pgsql.transactions.active
Transactions: Max idle transaction time | Current max idle transaction time | DEPENDENT | - | pgsql.transactions.idle
Transactions: Max prepared transaction time | Current max prepared transaction time | DEPENDENT | - | pgsql.transactions.prepared
Transactions: Max waiting transaction time | Current max waiting transaction time | DEPENDENT | - | pgsql.transactions.waiting
PostgreSQL: Get transactions | Collect metrics by transaction execution time | - | - | pgsql.transactions["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Uptime | - | - | - | pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
Status: Version | PostgreSQL version | - | 15m | pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
WAL: Segments count | Number of WAL segments | DEPENDENT | - | pgsql.wal.count
PostgreSQL: Get WAL | Master item to collect WAL metrics | - | 5m | pgsql.wal.stat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
WAL: Bytes written | WAL write in bytes | DEPENDENT | - | pgsql.wal.write
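
Many of the DEPENDENT items above are named "per second" while their descriptions refer to cumulative counters. The table does not show how the rate is derived. One plausible reading, assumed here, is a simple delta over elapsed time between two successive samples, similar in spirit to a change-per-second preprocessing step. The function name and the choice to discard a sample after a counter reset are assumptions of this sketch.

```python
def change_per_second(prev_value, prev_ts, curr_value, curr_ts):
    """Return the per-second rate between two counter samples.

    Timestamps are in seconds. Returns None when the counter went
    backwards (for example after a statistics reset), because no
    meaningful rate can be computed from that pair of samples.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if curr_value < prev_value:
        return None
    return (curr_value - prev_value) / elapsed


# Hypothetical samples: the counter grew by 600 over 60 seconds.
print(change_per_second(1000, 0, 1600, 60))  # prints: 10.0
```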

Triggers

Name | Description | Expression | Priority | Dependencies
PostgreSQL: Required checkpoints occurs too frequently | Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file. https://www.postgresql.org/docs/current/wal-configuration.html | last(/PostgreSQL by Zabbix agent/pgsql.bgwriter.checkpoints_req.rate) > {$PG.CHECKPOINTS_REQ.MAX.WARN} | AVERAGE ⚠ | Bgwriter: Requested checkpoints per second
PostgreSQL: Failed to get items | Zabbix has not received data for items for the last 30 minutes | nodata(/PostgreSQL by Zabbix agent/pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],30m) = 1 | WARNING 📢 | PostgreSQL: Get bgwriter
PostgreSQL: Cache hit ratio too low | - | max(/PostgreSQL by Zabbix agent/pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) < {$PG.CACHE_HITRATIO.MIN.WARN} | WARNING 📢 | Status: Cache hit ratio %
PostgreSQL: Configuration has changed | - | last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 | INFO 🔔 | Status: Config hash
PostgreSQL: Total number of connections is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.connections.sum.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} | AVERAGE ⚠ | Connections sum: Total %
PostgreSQL: Response too long | - | min(/PostgreSQL by Zabbix agent/pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.PING_TIME.MAX.WARN} | AVERAGE ⚠ | Status: Ping time
PostgreSQL: Service is down | - | last(/PostgreSQL by Zabbix agent/pgsql.ping["{$PG.HOST}","{$PG.PORT}"]) = 0 | HIGH ⛔ | Status: Ping
PostgreSQL: Streaming lag with {#MASTER} is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.REPL_LAG.MAX.WARN} | AVERAGE ⚠ | Replication: lag in seconds
PostgreSQL: Replication is down | - | max(/PostgreSQL by Zabbix agent/pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m)=0 | AVERAGE ⚠ | Replication: status
PostgreSQL: Service has been restarted | PostgreSQL uptime is less than 10 minutes | last(/PostgreSQL by Zabbix agent/pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]) < 10m | INFO 🔔 | Status: Uptime
PostgreSQL: Version has changed | - | last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 | INFO 🔔 | Status: Version
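
Most trigger expressions above compare an aggregate over a 5-minute window with a macro threshold, and the choice of `min` or `max` decides how persistent the condition must be. The following Python sketch illustrates that semantics on plain lists. The function names are hypothetical, and the samples are assumed to be the values already collected inside the window.

```python
def min_exceeds(samples, threshold):
    """Mirrors min(/host/key,5m) > threshold.

    True only if every sample in the window is above the threshold,
    so a single dip below it keeps the trigger quiet.
    """
    return min(samples) > threshold


def max_below(samples, threshold):
    """Mirrors max(/host/key,5m) < threshold.

    True only if every sample in the window is below the threshold.
    """
    return max(samples) < threshold


def value_changed(latest, previous):
    """Mirrors last(...,#1) <> last(...,#2) and length(last(...)) > 0."""
    return latest != previous and len(latest) > 0


# Connection usage in percent against {$PG.CONN_TOTAL_PCT.MAX.WARN} = 90.
print(min_exceeds([91, 95, 89], 90))  # prints: False
print(min_exceeds([91, 95, 92], 90))  # prints: True

# Cache hit ratio against {$PG.CACHE_HITRATIO.MIN.WARN} = 90.
print(max_below([85, 88, 92], 90))    # prints: False
```

Seen this way, the connection trigger fires only when usage stays above 90% for the whole window, and the cache trigger fires only when the ratio never reaches 90% during it.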

Discovery rule No. 1

Name | Description | Type | Interval | Key and additional info
Database discovery | - | - | 1h | pgsql.discovery.db["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]
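
The macro table lists `{$PG.LLD.FILTER.DBNAME}` with the default `(.*)`, which suggests that discovered database names are filtered by a regular expression before prototypes are created. The filter itself is not shown on this page, so the sketch below is an assumption about its effect. It uses Python's `re` module rather than the regex engine Zabbix uses, and the database names are made up.

```python
import re


def filter_databases(names, pattern="(.*)"):
    """Keep only the database names that match the filter pattern."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


discovered = ["postgres", "app", "billing"]  # hypothetical discovery result

# The default pattern keeps every database.
print(filter_databases(discovered))
# prints: ['postgres', 'app', 'billing']

# A narrower, hypothetical override keeps only two of them.
print(filter_databases(discovered, r"^(app|billing)$"))
# prints: ['app', 'billing']
```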

Item prototypes

Name | Description | Type | Interval | Key and additional info
DB {#DBNAME}: Database size | Database size | - | 15m | pgsql.db.size["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{#DBNAME}"]
DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary | DEPENDENT | - | pgsql.dbstat.blks_hit.rate["{#DBNAME}"]
DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database | DEPENDENT | - | pgsql.dbstat.blks_read.rate["{#DBNAME}"]
DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database | DEPENDENT | - | pgsql.dbstat.conflicts.rate["{#DBNAME}"]
DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database | DEPENDENT | - | pgsql.dbstat.deadlocks.rate["{#DBNAME}"]
DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"]
DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_files.rate["{#DBNAME}"]
DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"]
DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"]
DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"]
DB {#DBNAME}: Tuples returned per second | Total number of rows returned by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_returned.rate["{#DBNAME}"]
DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_updated.rate["{#DBNAME}"]
DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed | DEPENDENT | - | pgsql.dbstat.xact_commit.rate["{#DBNAME}"]
DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back | DEPENDENT | - | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"]
DB {#DBNAME}: Frozen XID before autovacuum % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | DEPENDENT | - | pgsql.frozenxid.prc_before_av["{#DBNAME}"]
DB {#DBNAME}: Frozen XID before stop % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | DEPENDENT | - | pgsql.frozenxid.prc_before_stop["{#DBNAME}"]
DB {#DBNAME}: Get frozen XID | - | - | - | pgsql.frozenxid["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"]
DB {#DBNAME}: Locks total | Total number of locks in the database | DEPENDENT | - | pgsql.locks.total["{#DBNAME}"]
DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count | DEPENDENT | - | pgsql.queries.mro.slow_count["{#DBNAME}"]
DB {#DBNAME}: Queries max maintenance time | Max maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_max["{#DBNAME}"]
DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_sum["{#DBNAME}"]
DB {#DBNAME}: Queries slow query count | Slow query count | DEPENDENT | - | pgsql.queries.query.slow_count["{#DBNAME}"]
DB {#DBNAME}: Queries max query time | Max query time | DEPENDENT | - | pgsql.queries.query.time_max["{#DBNAME}"]
DB {#DBNAME}: Queries sum query time | Sum query time | DEPENDENT | - | pgsql.queries.query.time_sum["{#DBNAME}"]
DB {#DBNAME}: Queries slow transaction count | Slow transaction query count | DEPENDENT | - | pgsql.queries.tx.slow_count["{#DBNAME}"]
DB {#DBNAME}: Queries max transaction time | Max transaction query time | DEPENDENT | - | pgsql.queries.tx.time_max["{#DBNAME}"]
DB {#DBNAME}: Queries sum transaction time | Sum transaction query time | DEPENDENT | - | pgsql.queries.tx.time_sum["{#DBNAME}"]
DB {#DBNAME}: Index scans per second | Number of index scans in the database | DEPENDENT | - | pgsql.scans.idx.rate["{#DBNAME}"]
DB {#DBNAME}: Sequential scans per second | Number of sequential scans in the database | DEPENDENT | - | pgsql.scans.seq.rate["{#DBNAME}"]
DB {#DBNAME}: Get scans | Number of scans done for table/index in the database | - | - | pgsql.scans["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"]

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT | min(/PostgreSQL by Zabbix agent/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} | AVERAGE ⚠ | DB {#DBNAME}: Detected conflicts per second
DB {#DBNAME}: Deadlock occurred | - | min(/PostgreSQL by Zabbix agent/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} | HIGH ⛔ | DB {#DBNAME}: Detected deadlocks per second
DB {#DBNAME}: VACUUM FREEZE is required to prevent wraparound | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | last(/PostgreSQL by Zabbix agent/pgsql.frozenxid.prc_before_stop["{#DBNAME}"])<{$PG.FROZENXID_PCT_STOP.MIN.HIGH:"{#DBNAME}"} | AVERAGE ⚠ | DB {#DBNAME}: Frozen XID before stop %
DB {#DBNAME}: Number of locks is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.locks.total["{#DBNAME}"],5m)>{$PG.LOCKS.MAX.WARN:"{#DBNAME}"} | WARNING 📢 | DB {#DBNAME}: Locks total
DB {#DBNAME}: Too many slow queries | - | min(/PostgreSQL by Zabbix agent/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} | WARNING 📢 | DB {#DBNAME}: Queries slow query count
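
The trigger prototypes above use macros with context, such as `{$PG.LOCKS.MAX.WARN:"{#DBNAME}"}`, which allows a threshold to be overridden per database while the plain macro serves as the fallback. A minimal Python sketch of that lookup order follows. The `resolve_macro` helper and the `reporting` override are hypothetical; only the default of 100 comes from the macro table.

```python
def resolve_macro(name, context, macros):
    """Return the context-specific macro value, else the plain macro value."""
    with_context = f'{{${name}:"{context}"}}'
    plain = f"{{${name}}}"
    if with_context in macros:
        return macros[with_context]
    return macros[plain]


macros = {
    "{$PG.LOCKS.MAX.WARN}": "100",              # default from the macro table
    '{$PG.LOCKS.MAX.WARN:"reporting"}': "500",  # hypothetical per-database override
}

print(resolve_macro("PG.LOCKS.MAX.WARN", "reporting", macros))  # prints: 500
print(resolve_macro("PG.LOCKS.MAX.WARN", "postgres", macros))   # prints: 100
```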