PostgreSQL by Zabbix agent
Macros used
| Name | Value |
|---|---|
| {$PG.CACHE_HITRATIO.MIN.WARN} | 90 |
| {$PG.CHECKPOINTS_REQ.MAX.WARN} | 5 |
| {$PG.CONFLICTS.MAX.WARN} | 0 |
| {$PG.CONN_IDLE_IN_TRANS.MAX.WARN} | 5 |
| {$PG.CONN_TOTAL_PCT.MAX.WARN} | 90 |
| {$PG.CONN_WAIT.MAX.WARN} | 0 |
| {$PG.DB} | postgres |
| {$PG.DEADLOCKS.MAX.WARN} | 0 |
| {$PG.FROZENXID_PCT_STOP.MIN.HIGH} | 75 |
| {$PG.HOST} | 127.0.0.1 |
| {$PG.LLD.FILTER.DBNAME} | (.*) |
| {$PG.LOCKS.MAX.WARN} | 100 |
| {$PG.PASSWORD} | - |
| {$PG.PING_TIME.MAX.WARN} | 1s |
| {$PG.PORT} | 5432 |
| {$PG.QUERY_ETIME.MAX.WARN} | 30 |
| {$PG.REPL_LAG.MAX.WARN} | 10m |
| {$PG.SLOW_QUERIES.MAX.WARN} | 5 |
| {$PG.TRANS_ACTIVE.MAX.WARN} | 30s |
| {$PG.TRANS_IDLE.MAX.WARN} | 30s |
| {$PG.TRANS_WAIT.MAX.WARN} | 30s |
| {$PG.USER} | zbx_monitor |
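
Most item keys in this template take these macros as parameters. As a rough illustration of how a key reads once the defaults are substituted, here is a minimal sketch. The `expand_macros` helper and its regular expression are hypothetical and are not part of Zabbix; the values are the defaults from the table above.

```python
import re

# Default values copied from the macro table above.
MACROS = {
    "{$PG.HOST}": "127.0.0.1",
    "{$PG.PORT}": "5432",
    "{$PG.USER}": "zbx_monitor",
    "{$PG.DB}": "postgres",
}

# Hypothetical pattern for simple user macros without context.
MACRO_RE = re.compile(r"\{\$[A-Z0-9_.]+\}")


def expand_macros(key, macros):
    """Replace each {$MACRO} with its value; unknown macros are left untouched."""
    return MACRO_RE.sub(lambda m: macros.get(m.group(0), m.group(0)), key)


print(expand_macros('pgsql.ping["{$PG.HOST}","{$PG.PORT}"]', MACROS))
# prints: pgsql.ping["127.0.0.1","5432"]
```

Overriding a macro on a host changes every key that references it, which is why the connection settings are kept as macros rather than written into each key.
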
Items collected
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Bgwriter: Buffers allocated per second | Number of buffers allocated | DEPENDENT | - | pgsql.bgwriter.buffers_alloc.rate |
| Bgwriter: Buffers written directly by a backend per second | Number of buffers written directly by a backend | DEPENDENT | - | pgsql.bgwriter.buffers_backend.rate |
| Bgwriter: Buffers backend fsync per second | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) | DEPENDENT | - | pgsql.bgwriter.buffers_backend_fsync.rate |
| Bgwriter: Buffers written during checkpoints per second | Number of buffers written during checkpoints | DEPENDENT | - | pgsql.bgwriter.buffers_checkpoint.rate |
| Bgwriter: Buffers written by the background writer per second | Number of buffers written by the background writer | DEPENDENT | - | pgsql.bgwriter.buffers_clean.rate |
| Bgwriter: Requested checkpoints per second | Number of requested checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_req.rate |
| Bgwriter: Scheduled checkpoints per second | Number of scheduled checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_timed.rate |
| Bgwriter: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in milliseconds | DEPENDENT | - | pgsql.bgwriter.checkpoint_sync_time |
| Bgwriter: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds | DEPENDENT | - | pgsql.bgwriter.checkpoint_write_time |
| Bgwriter: Max written per second | Number of times the background writer stopped a cleaning scan because it had written too many buffers | DEPENDENT | - | pgsql.bgwriter.maxwritten_clean.rate |
| PostgreSQL: Get bgwriter | Statistics about the background writer process's activity | - | - | pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Cache hit ratio % | Cache hit ratio | - | - | pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Config hash | PostgreSQL configuration hash | - | 15m | pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Connections sum: Active | Total number of connections executing a query | DEPENDENT | - | pgsql.connections.sum.active |
| Connections sum: Idle | Total number of connections waiting for a new client command | DEPENDENT | - | pgsql.connections.sum.idle |
| Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query | DEPENDENT | - | pgsql.connections.sum.idle_in_transaction |
| Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html | DEPENDENT | - | pgsql.connections.sum.prepared |
| Connections sum: Total | Total number of connections | DEPENDENT | - | pgsql.connections.sum.total |
| Connections sum: Total % | Total number of connections as a percentage of the maximum allowed | DEPENDENT | - | pgsql.connections.sum.total_pct |
| Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE | DEPENDENT | - | pgsql.connections.sum.waiting |
| PostgreSQL: Get connections sum | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW | - | - | pgsql.connections.sum["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW | - | - | pgsql.dbstat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES | - | - | pgsql.locks["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Ping time | - | - | - | pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Ping | - | - | - | pgsql.ping["{$PG.HOST}","{$PG.PORT}"] |
| PostgreSQL: Get queries | Collect all metrics by query execution time | - | - | pgsql.queries["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{$PG.QUERY_ETIME.MAX.WARN}"] |
| Replication: standby count | Number of standby servers | - | - | pgsql.replication.count["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Replication: lag in seconds | Replication lag with Master in seconds | - | - | pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Replication: recovery role | Replication role: 1 — recovery is still in progress (standby mode), 0 — master mode. | - | - | pgsql.replication.recovery_role["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Replication: status | Replication status: 0 — streaming is down, 1 — streaming is up, 2 — master mode | - | - | pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Transactions: Max active transaction time | Current max active transaction time | DEPENDENT | - | pgsql.transactions.active |
| Transactions: Max idle transaction time | Current max idle transaction time | DEPENDENT | - | pgsql.transactions.idle |
| Transactions: Max prepared transaction time | Current max prepared transaction time | DEPENDENT | - | pgsql.transactions.prepared |
| Transactions: Max waiting transaction time | Current max waiting transaction time | DEPENDENT | - | pgsql.transactions.waiting |
| PostgreSQL: Get transactions | Collect metrics by transaction execution time | - | - | pgsql.transactions["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Uptime | - | - | - | pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| Status: Version | PostgreSQL version | - | 15m | pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| WAL: Segments count | Number of WAL segments | DEPENDENT | - | pgsql.wal.count |
| PostgreSQL: Get WAL | Master item to collect WAL metrics | - | 5m | pgsql.wal.stat["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
| WAL: Bytes written | WAL write in bytes | DEPENDENT | - | pgsql.wal.write |
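
The `.rate` keys in the table above report per-second values. The sketch below shows one way such a rate can be derived from two samples of a cumulative counter. It assumes a change-per-second style calculation, which this page does not state; the function name and the handling of counter resets are illustrative only.

```python
def per_second_rate(prev_value, prev_time, value, time):
    """Return the counter increase per second between two samples.

    Returns None when no rate can be computed: the clock did not advance,
    or the counter went backwards (for example after a statistics reset).
    """
    elapsed = time - prev_time
    if elapsed <= 0 or value < prev_value:
        return None
    return (value - prev_value) / elapsed


# Two samples of a cumulative counter taken 60 seconds apart.
print(per_second_rate(1000, 0, 1600, 60))  # 10.0
```
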
Triggers
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| PostgreSQL: Required checkpoints occur too frequently | Checkpoints are points in the sequence of transactions at which it is guaranteed that the heap and index data files have been updated with all information written before that checkpoint. At checkpoint time, all dirty data pages are flushed to disk and a special checkpoint record is written to the log file. https://www.postgresql.org/docs/current/wal-configuration.html | last(/PostgreSQL by Zabbix agent/pgsql.bgwriter.checkpoints_req.rate) > {$PG.CHECKPOINTS_REQ.MAX.WARN} | AVERAGE ⚠ | Bgwriter: Requested checkpoints per second |
| PostgreSQL: Failed to get items | Zabbix has not received data for items for the last 30 minutes | nodata(/PostgreSQL by Zabbix agent/pgsql.bgwriter["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],30m) = 1 | WARNING 📢 | PostgreSQL: Get bgwriter |
| PostgreSQL: Cache hit ratio too low | - | max(/PostgreSQL by Zabbix agent/pgsql.cache.hit["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) < {$PG.CACHE_HITRATIO.MIN.WARN} | WARNING 📢 | Status: Cache hit ratio % |
| PostgreSQL: Configuration has changed | - | last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.config.hash["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 | INFO 🔔 | Status: Config hash |
| PostgreSQL: Total number of connections is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.connections.sum.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} | AVERAGE ⚠ | Connections sum: Total % |
| PostgreSQL: Response too long | - | min(/PostgreSQL by Zabbix agent/pgsql.ping.time["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.PING_TIME.MAX.WARN} | AVERAGE ⚠ | Status: Ping time |
| PostgreSQL: Service is down | - | last(/PostgreSQL by Zabbix agent/pgsql.ping["{$PG.HOST}","{$PG.PORT}"]) = 0 | HIGH ⛔ | Status: Ping |
| PostgreSQL: Streaming lag with {#MASTER} is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.replication.lag.sec["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m) > {$PG.REPL_LAG.MAX.WARN} | AVERAGE ⚠ | Replication: lag in seconds |
| PostgreSQL: Replication is down | - | max(/PostgreSQL by Zabbix agent/pgsql.replication.status["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],5m)=0 | AVERAGE ⚠ | Replication: status |
| PostgreSQL: Service has been restarted | PostgreSQL uptime is less than 10 minutes | last(/PostgreSQL by Zabbix agent/pgsql.uptime["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]) < 10m | INFO 🔔 | Status: Uptime |
| PostgreSQL: Version has changed | - | last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#1)<>last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"],#2) and length(last(/PostgreSQL by Zabbix agent/pgsql.version["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"]))>0 | INFO 🔔 | Status: Version |
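
The expressions above follow a few recurring patterns: `min(...,5m) > X` holds only when every sample in the window is above the threshold, `max(...,5m) < X` only when every sample is below it, and `last(#1) <> last(#2)` detects a change between the two most recent values. The sketch below mimics these patterns on plain Python lists. It is a simplification under stated assumptions: the helper names are hypothetical, and an empty window simply yields `False` here instead of the unknown state a real trigger would enter.

```python
# Multipliers for the time suffixes used in the macro values.
TIME_SUFFIXES = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value):
    """Convert a value such as '10m' or '30' to seconds."""
    if value[-1] in TIME_SUFFIXES:
        return int(value[:-1]) * TIME_SUFFIXES[value[-1]]
    return int(value)


def min_exceeds(samples, threshold):
    """min(item, window) > threshold: even the lowest sample is above it."""
    return bool(samples) and min(samples) > threshold


def max_below(samples, threshold):
    """max(item, window) < threshold: even the highest sample is below it."""
    return bool(samples) and max(samples) < threshold


def value_changed(history):
    """last(#1) <> last(#2) and length(last()) > 0; newest value is last."""
    return len(history) >= 2 and history[-1] != history[-2] and len(history[-1]) > 0


lag_limit = to_seconds("10m")  # default of {$PG.REPL_LAG.MAX.WARN}
print(min_exceeds([650, 700, 900], lag_limit))  # True: lag stayed above 600 s
print(min_exceeds([650, 120, 900], lag_limit))  # False: one sample dipped below
```

Using `min` over a window rather than `last` means a single spike does not fire the trigger; the condition has to hold for the whole five minutes.
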
Discovery rule №1
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Database discovery | - | - | 1h | pgsql.discovery.db["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}"] |
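
The macro name `{$PG.LLD.FILTER.DBNAME}` suggests that discovered databases are filtered by a regular expression, with the default `(.*)` accepting every name. The following sketch illustrates that idea. The sample payload is hypothetical, and partial (unanchored) matching is an assumption about how the filter is applied.

```python
import json
import re


def filter_discovered(lld_json, pattern="(.*)"):
    """Return the {#DBNAME} values whose name matches the filter pattern."""
    rows = json.loads(lld_json)
    regex = re.compile(pattern)
    return [row["{#DBNAME}"] for row in rows if regex.search(row["{#DBNAME}"])]


# Hypothetical discovery output with two databases.
payload = '[{"{#DBNAME}": "postgres"}, {"{#DBNAME}": "app"}]'

print(filter_discovered(payload))           # ['postgres', 'app']
print(filter_discovered(payload, "^app$"))  # ['app']
```
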
Item prototypes
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| DB {#DBNAME}: Database size | Database size | - | 15m | pgsql.db.size["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DB}","{#DBNAME}"] |
| DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary | DEPENDENT | - | pgsql.dbstat.blks_hit.rate["{#DBNAME}"] |
| DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database | DEPENDENT | - | pgsql.dbstat.blks_read.rate["{#DBNAME}"] |
| DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database | DEPENDENT | - | pgsql.dbstat.conflicts.rate["{#DBNAME}"] |
| DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database | DEPENDENT | - | pgsql.dbstat.deadlocks.rate["{#DBNAME}"] |
| DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"] |
| DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_files.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples returned per second | Total number of rows returned by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_returned.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_updated.rate["{#DBNAME}"] |
| DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed | DEPENDENT | - | pgsql.dbstat.xact_commit.rate["{#DBNAME}"] |
| DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back | DEPENDENT | - | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"] |
| DB {#DBNAME}: Frozen XID before autovacuum % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | DEPENDENT | - | pgsql.frozenxid.prc_before_av["{#DBNAME}"] |
| DB {#DBNAME}: Frozen XID before stop % | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | DEPENDENT | - | pgsql.frozenxid.prc_before_stop["{#DBNAME}"] |
| DB {#DBNAME}: Get frozen XID | - | - | - | pgsql.frozenxid["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
| DB {#DBNAME}: Locks total | Total number of locks in the database | DEPENDENT | - | pgsql.locks.total["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count | DEPENDENT | - | pgsql.queries.mro.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max maintenance time | Max maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_sum["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow query count | Slow query count | DEPENDENT | - | pgsql.queries.query.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max query time | Max query time | DEPENDENT | - | pgsql.queries.query.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum query time | Sum query time | DEPENDENT | - | pgsql.queries.query.time_sum["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow transaction count | Slow transaction query count | DEPENDENT | - | pgsql.queries.tx.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max transaction time | Max transaction query time | DEPENDENT | - | pgsql.queries.tx.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum transaction time | Sum transaction query time | DEPENDENT | - | pgsql.queries.tx.time_sum["{#DBNAME}"] |
| DB {#DBNAME}: Index scans per second | Number of index scans in the database | DEPENDENT | - | pgsql.scans.idx.rate["{#DBNAME}"] |
| DB {#DBNAME}: Sequential scans per second | Number of sequential scans in the database | DEPENDENT | - | pgsql.scans.seq.rate["{#DBNAME}"] |
| DB {#DBNAME}: Get scans | Number of scans done for table/index in the database | - | - | pgsql.scans["{$PG.HOST}","{$PG.PORT}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
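
The query prototypes above come in groups of three: a slow count, a maximum time, and a summed time. As a sketch of how those three figures relate, the function below summarizes a list of query durations. It assumes that durations are in seconds and that a query counts as slow when it runs longer than `{$PG.QUERY_ETIME.MAX.WARN}` (30 by default); neither detail is confirmed on this page, and the function name is hypothetical.

```python
def summarize_queries(durations, slow_threshold=30.0):
    """Summarize running-query durations (assumed to be in seconds)."""
    return {
        # Queries running longer than the threshold are counted as slow.
        "slow_count": sum(1 for d in durations if d > slow_threshold),
        # Longest single query; 0.0 when nothing is running.
        "time_max": max(durations, default=0.0),
        # Combined duration of all running queries.
        "time_sum": sum(durations),
    }


print(summarize_queries([1.5, 45.0, 31.0]))
# {'slow_count': 2, 'time_max': 45.0, 'time_sum': 77.5}
```
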
Trigger prototypes
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT | min(/PostgreSQL by Zabbix agent/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} | AVERAGE ⚠ | DB {#DBNAME}: Detected conflicts per second |
| DB {#DBNAME}: Deadlock occurred | - | min(/PostgreSQL by Zabbix agent/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} | HIGH ⛔ | DB {#DBNAME}: Detected deadlocks per second |
| DB {#DBNAME}: VACUUM FREEZE is required to prevent wraparound | Preventing Transaction ID Wraparound Failures https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND | last(/PostgreSQL by Zabbix agent/pgsql.frozenxid.prc_before_stop["{#DBNAME}"])<{$PG.FROZENXID_PCT_STOP.MIN.HIGH:"{#DBNAME}"} | AVERAGE ⚠ | DB {#DBNAME}: Frozen XID before stop % |
| DB {#DBNAME}: Number of locks is too high | - | min(/PostgreSQL by Zabbix agent/pgsql.locks.total["{#DBNAME}"],5m)>{$PG.LOCKS.MAX.WARN:"{#DBNAME}"} | WARNING 📢 | DB {#DBNAME}: Locks total |
| DB {#DBNAME}: Too many slow queries | - | min(/PostgreSQL by Zabbix agent/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} | WARNING 📢 | DB {#DBNAME}: Queries slow query count |
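
The thresholds in these prototypes are written as user macros with context, for example `{$PG.LOCKS.MAX.WARN:"{#DBNAME}"}`. This allows a per-database override, with the plain macro used as a fallback when no override exists. The sketch below illustrates that lookup order only. The `resolve_macro` helper and the `reporting` override are hypothetical.

```python
def resolve_macro(name, context, macros):
    """Return the context-specific value if defined, else the default value."""
    with_context = f'{{${name}:"{context}"}}'
    if with_context in macros:
        return macros[with_context]
    return macros[f"{{${name}}}"]


macros = {
    "{$PG.LOCKS.MAX.WARN}": "100",               # template default
    '{$PG.LOCKS.MAX.WARN:"reporting"}': "500",   # hypothetical override
}

print(resolve_macro("PG.LOCKS.MAX.WARN", "reporting", macros))  # 500
print(resolve_macro("PG.LOCKS.MAX.WARN", "app", macros))        # 100
```
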