PostgreSQL by Zabbix agent 2
Macros used
| Name | Value |
|---|---|
| {$PG.CONFLICTS.MAX.WARN} | 0 |
| {$PG.CONN_TOTAL_PCT.MAX.WARN} | 90 |
| {$PG.DATABASE} | postgres |
| {$PG.DEADLOCKS.MAX.WARN} | 0 |
| {$PG.LLD.FILTER.APPLICATION} | (.+) |
| {$PG.LLD.FILTER.DBNAME} | (.+) |
| {$PG.PASSWORD} | postgres |
| {$PG.QUERY_ETIME.MAX.WARN} | 30 |
| {$PG.SLOW_QUERIES.MAX.WARN} | 5 |
| {$PG.URI} | tcp://localhost:5432 |
| {$PG.USER} | postgres |
Items collected
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Archive: Count of archive files | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ARCHIVER-VIEW | DEPENDENT | - | pgsql.archive.count_archived_files |
| Archive: Count of files in archive_status need to archive | - | DEPENDENT | - | pgsql.archive.count_files_to_archive |
| Archive: Count of attempts to archive files | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ARCHIVER-VIEW | DEPENDENT | - | pgsql.archive.failed_trying_to_archive |
| Archive: Count of files need to archive | Size of files to archive | DEPENDENT | - | pgsql.archive.size_files_to_archive |
| PostgreSQL: Get archive | Collect archive status metrics | - | - | pgsql.archive["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Autovacuum: Count of autovacuum workers | Number of autovacuum workers. | - | - | pgsql.autovacuum.count["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Bgwriter: Buffers allocated | Number of buffers allocated | DEPENDENT | - | pgsql.bgwriter.buffers_alloc.rate |
| Bgwriter: Buffers written directly by a backend | Number of buffers written directly by a backend | DEPENDENT | - | pgsql.bgwriter.buffers_backend.rate |
| Bgwriter: Times a backend execute its own fsync | Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write) | DEPENDENT | - | pgsql.bgwriter.buffers_backend_fsync.rate |
| Checkpoint: Buffers checkpoints written | Number of buffers written during checkpoints | DEPENDENT | - | pgsql.bgwriter.buffers_checkpoint.rate |
| Checkpoint: Buffers background written | Number of buffers written by the background writer | DEPENDENT | - | pgsql.bgwriter.buffers_clean.rate |
| Checkpoint: Requested | Number of requested checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_req.rate |
| Checkpoint: By timeout | Number of scheduled checkpoints that have been performed | DEPENDENT | - | pgsql.bgwriter.checkpoints_timed.rate |
| Checkpoint: Checkpoint sync time | Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk | DEPENDENT | - | pgsql.bgwriter.checkpoint_sync_time.rate |
| Checkpoint: Checkpoint write time | Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds | DEPENDENT | - | pgsql.bgwriter.checkpoint_write_time.rate |
| Bgwriter: Number of bgwriter stopped | Number of times the background writer stopped a cleaning scan because it had written too many buffers | DEPENDENT | - | pgsql.bgwriter.maxwritten_clean.rate |
| PostgreSQL: Get bgwriter | https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-BGWRITER-VIEW | - | - | pgsql.bgwriter["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Cache hit | - | CALCULATED | - | pgsql.cache.hit["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Connections sum: Active | Total number of connections executing a query | DEPENDENT | - | pgsql.connections.active |
| Connections sum: Disabled | Total number of disabled connections | DEPENDENT | - | pgsql.connections.disabled |
| Connections sum: Fastpath function call | Total number of connections executing a fast-path function | DEPENDENT | - | pgsql.connections.fastpath_function_call |
| Connections sum: Idle | Total number of connections waiting for a new client command | DEPENDENT | - | pgsql.connections.idle |
| Connections sum: Idle in transaction | Total number of connections in a transaction state, but not executing a query | DEPENDENT | - | pgsql.connections.idle_in_transaction |
| Connections sum: Idle in transaction (aborted) | Total number of connections in a transaction state, but not executing a query and one of the statements in the transaction caused an error. | DEPENDENT | - | pgsql.connections.idle_in_transaction_aborted |
| Connections sum: Prepared | Total number of prepared transactions https://www.postgresql.org/docs/current/sql-prepare-transaction.html | DEPENDENT | - | pgsql.connections.prepared |
| Connections sum: Total | Total number of connections | DEPENDENT | - | pgsql.connections.total |
| Connections sum: Total % | Total number of connections in percentage | DEPENDENT | - | pgsql.connections.total_pct |
| Connections sum: Waiting | Total number of waiting connections https://www.postgresql.org/docs/current/monitoring-stats.html#WAIT-EVENT-TABLE | DEPENDENT | - | pgsql.connections.waiting |
| PostgreSQL: Get connections | Collect all metrics from pg_stat_activity https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW | - | - | pgsql.connections["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Custom queries | Execute custom queries from file *.sql (check for option Plugins.Postgres.CustomQueriesPath at agent configuration) | - | - | pgsql.custom.query["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DATABASE}",""] |
| Dbstat: Hit blocks read | Number of times disk blocks were found already in the buffer cache | DEPENDENT | - | pgsql.dbstat.sum.blks_hit.rate |
| Dbstat: Disk blocks read | Number of disk blocks read | DEPENDENT | - | pgsql.dbstat.sum.blks_read.rate |
| Dbstat: Blocks read time | Time spent reading data file blocks by backends, in milliseconds | DEPENDENT | - | pgsql.dbstat.sum.blk_read_time |
| Dbstat: Blocks write time | Time spent writing data file blocks by backends, in milliseconds | DEPENDENT | - | pgsql.dbstat.sum.blk_write_time |
| Dbstat: Checksum failures | Number of data page checksum failures detected (or on a shared object), or NULL if data checksums are not enabled. This metric included in PostgreSQL 12 | DEPENDENT | - | pgsql.dbstat.sum.checksum_failures.rate |
| Dbstat: Conflicts | Number of queries canceled due to conflicts with recovery. (Conflicts occur only on standby servers; see pg_stat_database_conflicts for details.) | DEPENDENT | - | pgsql.dbstat.sum.conflicts.rate |
| Dbstat: Deadlocks | Number of deadlocks detected | DEPENDENT | - | pgsql.dbstat.sum.deadlocks.rate |
| Dbstat: Backends connected | Number of connected backends | DEPENDENT | - | pgsql.dbstat.sum.numbackends |
| Dbstat: Number temp bytes | Total amount of data written to temporary files by queries | DEPENDENT | - | pgsql.dbstat.sum.temp_bytes.rate |
| Dbstat: Number temp bytes | Number of temporary files created by queries | DEPENDENT | - | pgsql.dbstat.sum.temp_files.rate |
| Dbstat: Rows deleted | Number of rows deleted by queries | DEPENDENT | - | pgsql.dbstat.sum.tup_deleted.rate |
| Dbstat: Rows fetched | Number of rows fetched by queries | DEPENDENT | - | pgsql.dbstat.sum.tup_fetched.rate |
| Dbstat: Rows inserted | Number of rows inserted by queries | DEPENDENT | - | pgsql.dbstat.sum.tup_inserted.rate |
| Dbstat: Rows returned | Number of rows returned by queries | DEPENDENT | - | pgsql.dbstat.sum.tup_returned.rate |
| Dbstat: Rows updated | Number of rows updated by queries | DEPENDENT | - | pgsql.dbstat.sum.tup_updated.rate |
| Dbstat: Committed transactions | Number of transactions that have been committed | DEPENDENT | - | pgsql.dbstat.sum.xact_commit.rate |
| Dbstat: Roll backed transactions | Number of transactions that have been rolled back | DEPENDENT | - | pgsql.dbstat.sum.xact_rollback.rate |
| PostgreSQL: Get dbstat sum | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW | - | - | pgsql.dbstat.sum["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Get dbstat | Collect all metrics from pg_stat_database per database https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-DATABASE-VIEW | - | - | pgsql.dbstat["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Get locks | Collect all metrics from pg_locks per database https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-TABLES | - | - | pgsql.locks["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Age of oldest xid | Age of oldest xid. | - | - | pgsql.oldest.xid["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Ping | - | - | - | pgsql.ping["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Get queries | Collect all metrics by query execution time | - | - | pgsql.queries["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{$PG.DATABASE}","{$PG.QUERY_ETIME.MAX.WARN}"] |
| Replication: Standby count | Number of standby servers | - | - | pgsql.replication.count["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Replication: Lag in bytes | Replication lag with Master in byte. | - | - | pgsql.replication.lag.b["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Replication: Lag in seconds | Replication lag with Master in seconds. | - | - | pgsql.replication.lag.sec["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Get replication | Collect metrics from the pg_stat_replication, which contains information about the WAL sender process, showing statistics about replication to that sender's connected standby server. | - | - | pgsql.replication.process["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Replication: Recovery role | Replication role: 1 — recovery is still in progress (standby mode), 0 — master mode. | - | - | pgsql.replication.recovery_role["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| Replication: Status | Replication status: 0 — streaming is down, 1 — streaming is up, 2 — master mode | - | - | pgsql.replication.status["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| PostgreSQL: Uptime | - | - | - | pgsql.uptime["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| WAL: Segments count | Number of WAL segments | DEPENDENT | - | pgsql.wal.count |
| WAL: Bytes received | WAL receive in bytes | DEPENDENT | - | pgsql.wal.receive |
| PostgreSQL: Get WAL | Collect WAL metrics | - | 5m | pgsql.wal.stat["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
| WAL: Bytes written | WAL write in bytes | DEPENDENT | - | pgsql.wal.write |
Triggers
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| Connections sum: Total number of connections is too high | - | min(/PostgreSQL by Zabbix agent 2/pgsql.connections.total_pct,5m) > {$PG.CONN_TOTAL_PCT.MAX.WARN} | AVERAGE ⚠ | Connections sum: Total % |
| PostgreSQL: Oldest xid is too big | - | last(/PostgreSQL by Zabbix agent 2/pgsql.oldest.xid["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"]) > 18000000 | AVERAGE ⚠ | PostgreSQL: Age of oldest xid |
| PostgreSQL: Service is down | - | last(/PostgreSQL by Zabbix agent 2/pgsql.ping["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"])=0 | HIGH ⛔ | PostgreSQL: Ping |
| PostgreSQL: Service has been restarted | - | last(/PostgreSQL by Zabbix agent 2/pgsql.uptime["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"]) < 600 | AVERAGE ⚠ | PostgreSQL: Uptime |
Discovery rule №1
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Database discovery | - | - | 1h | pgsql.db.discovery["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Item prototypes
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| DB {#DBNAME}: Database age | Database age | - | 10m | pgsql.db.age["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
| DB {#DBNAME}: Get bloating tables | Number of bloating tables | - | - | pgsql.db.bloating_tables["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
| DB {#DBNAME}: Database size | Database size | - | 5m | pgsql.db.size["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}","{#DBNAME}"] |
| DB {#DBNAME}: Blocks hit per second | Total number of times disk blocks were found already in the buffer cache, so that a read was not necessary | DEPENDENT | - | pgsql.dbstat.blks_hit.rate["{#DBNAME}"] |
| DB {#DBNAME}: Disk blocks read per second | Total number of disk blocks read in this database | DEPENDENT | - | pgsql.dbstat.blks_read.rate["{#DBNAME}"] |
| DB {#DBNAME}: Disk blocks read time | Time spent reading data file blocks by backends, in milliseconds | DEPENDENT | - | pgsql.dbstat.blk_read_time.rate["{#DBNAME}"] |
| DB {#DBNAME}: Disk blocks write time | Time spent writing data file blocks by backends, in milliseconds | DEPENDENT | - | pgsql.dbstat.blk_write_time.rate["{#DBNAME}"] |
| DB {#DBNAME}: Checksum failures | Number of data page checksum failures detected in this database | DEPENDENT | - | pgsql.dbstat.checksum_failures.rate["{#DBNAME}"] |
| DB {#DBNAME}: Detected conflicts per second | Total number of queries canceled due to conflicts with recovery in this database | DEPENDENT | - | pgsql.dbstat.conflicts.rate["{#DBNAME}"] |
| DB {#DBNAME}: Detected deadlocks per second | Total number of detected deadlocks in this database | DEPENDENT | - | pgsql.dbstat.deadlocks.rate["{#DBNAME}"] |
| DB {#DBNAME}: Backends connected | Number of backends currently connected to this database | DEPENDENT | - | pgsql.dbstat.numbackends["{#DBNAME}"] |
| DB {#DBNAME}: Temp_bytes written per second | Total amount of data written to temporary files by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_bytes.rate["{#DBNAME}"] |
| DB {#DBNAME}: Temp_files created per second | Total number of temporary files created by queries in this database | DEPENDENT | - | pgsql.dbstat.temp_files.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples deleted per second | Total number of rows deleted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_deleted.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples fetched per second | Total number of rows fetched by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_fetched.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples inserted per second | Total number of rows inserted by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_inserted.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples returned per second | Number of rows returned by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_returned.rate["{#DBNAME}"] |
| DB {#DBNAME}: Tuples updated per second | Total number of rows updated by queries in this database | DEPENDENT | - | pgsql.dbstat.tup_updated.rate["{#DBNAME}"] |
| DB {#DBNAME}: Commits per second | Number of transactions in this database that have been committed | DEPENDENT | - | pgsql.dbstat.xact_commit.rate["{#DBNAME}"] |
| DB {#DBNAME}: Rollbacks per second | Total number of transactions in this database that have been rolled back | DEPENDENT | - | pgsql.dbstat.xact_rollback.rate["{#DBNAME}"] |
| DB {#DBNAME}: Num of accessexclusive locks | Number of accessexclusive locks for each database | DEPENDENT | - | pgsql.locks.accessexclusive["{#DBNAME}"] |
| DB {#DBNAME}: Num of accessshare locks | Number of accessshare locks for each database | DEPENDENT | - | pgsql.locks.accessshare["{#DBNAME}"] |
| DB {#DBNAME}: Num of exclusive locks | Number of exclusive locks for each database | DEPENDENT | - | pgsql.locks.exclusive["{#DBNAME}"] |
| DB {#DBNAME}: Num of rowexclusive locks | Number of rowexclusive locks for each database | DEPENDENT | - | pgsql.locks.rowexclusive["{#DBNAME}"] |
| DB {#DBNAME}: Num of rowshare locks | Number of rowshare locks for each database | DEPENDENT | - | pgsql.locks.rowshare["{#DBNAME}"] |
| DB {#DBNAME}: Num of sharerowexclusive locks | Number of total sharerowexclusive for each database | DEPENDENT | - | pgsql.locks.sharerowexclusive["{#DBNAME}"] |
| DB {#DBNAME}: Num of shareupdateexclusive locks | Number of shareupdateexclusive locks for each database | DEPENDENT | - | pgsql.locks.shareupdateexclusive["{#DBNAME}"] |
| DB {#DBNAME}: Num of share locks | Number of share locks for each database | DEPENDENT | - | pgsql.locks.share["{#DBNAME}"] |
| DB {#DBNAME}: Num of total locks | Number of total locks for each database | DEPENDENT | - | pgsql.locks.total["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow maintenance count | Slow maintenance query count | DEPENDENT | - | pgsql.queries.mro.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max maintenance time | Max maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum maintenance time | Sum maintenance query time | DEPENDENT | - | pgsql.queries.mro.time_sum["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow query count | Slow query count | DEPENDENT | - | pgsql.queries.query.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max query time | Max query time | DEPENDENT | - | pgsql.queries.query.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum query time | Sum query time | DEPENDENT | - | pgsql.queries.query.time_sum["{#DBNAME}"] |
| DB {#DBNAME}: Queries slow transaction count | Slow transaction query count | DEPENDENT | - | pgsql.queries.tx.slow_count["{#DBNAME}"] |
| DB {#DBNAME}: Queries max transaction time | Max transaction query time | DEPENDENT | - | pgsql.queries.tx.time_max["{#DBNAME}"] |
| DB {#DBNAME}: Queries sum transaction time | Sum transaction query time | DEPENDENT | - | pgsql.queries.tx.time_sum["{#DBNAME}"] |
Trigger prototypes
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| DB {#DBNAME}: Too many recovery conflicts | The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT | min(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.conflicts.rate["{#DBNAME}"],5m) > {$PG.CONFLICTS.MAX.WARN:"{#DBNAME}"} | AVERAGE ⚠ | DB {#DBNAME}: Detected conflicts per second |
| DB {#DBNAME}: Deadlock occurred | - | min(/PostgreSQL by Zabbix agent 2/pgsql.dbstat.deadlocks.rate["{#DBNAME}"],5m) > {$PG.DEADLOCKS.MAX.WARN:"{#DBNAME}"} | HIGH ⛔ | DB {#DBNAME}: Detected deadlocks per second |
| DB {#DBNAME}: Too many slow queries | - | min(/PostgreSQL by Zabbix agent 2/pgsql.queries.query.slow_count["{#DBNAME}"],5m)>{$PG.SLOW_QUERIES.MAX.WARN:"{#DBNAME}"} | WARNING 📢 | DB {#DBNAME}: Queries slow query count |
Discovery rule №2
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Replication Discovery | - | - | 15m | pgsql.replication.process.discovery["{$PG.URI}","{$PG.USER}","{$PG.PASSWORD}"] |
Item prototypes
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Application {#APPLICATION_NAME}: Replication flush lag | - | DEPENDENT | - | pgsql.replication.process.flush_lag["{#APPLICATION_NAME}"] |
| Application {#APPLICATION_NAME}: Replication replay lag | - | DEPENDENT | - | pgsql.replication.process.replay_lag["{#APPLICATION_NAME}"] |
| Application {#APPLICATION_NAME}: Replication write lag | - | DEPENDENT | - | pgsql.replication.process.write_lag["{#APPLICATION_NAME}"] |