CockroachDB by HTTP
Macros used
| Name | Value |
|---|---|
| {$COCKROACHDB.API.PORT} | 8080 |
| {$COCKROACHDB.API.SCHEME} | http |
| {$COCKROACHDB.CERT.CA.EXPIRY.WARN} | 90 |
| {$COCKROACHDB.CERT.NODE.EXPIRY.WARN} | 30 |
| {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} | 300 |
| {$COCKROACHDB.OPEN.FDS.MAX.WARN} | 80 |
| {$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} | 2 |
| {$COCKROACHDB.STORE.USED.MIN.CRIT} | 10 |
| {$COCKROACHDB.STORE.USED.MIN.WARN} | 20 |
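
The two API macros decide where the HTTP agent items send their requests. The following is a minimal sketch of how the endpoint URLs could be composed from them. The helper name `build_urls` is hypothetical, and the metrics path `/_status/vars` is an assumption (it is CockroachDB's usual Prometheus endpoint, but this README does not state it).

```python
def build_urls(host, scheme="http", port=8080):
    """Compose endpoint URLs from {$COCKROACHDB.API.SCHEME} and {$COCKROACHDB.API.PORT}.

    The defaults mirror the macro values in the table above.
    """
    base = f"{scheme}://{host}:{port}"
    return {
        "health": f"{base}/health",
        "readiness": f"{base}/health?ready=1",
        # Assumption: Prometheus-format metrics path, not given in this README.
        "metrics": f"{base}/_status/vars",
    }


urls = build_urls("10.0.0.5")
print(urls["readiness"])
```

For a cluster that serves the Admin UI over TLS, only the scheme macro would need to change, for example `build_urls("10.0.0.5", scheme="https")`.
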
Items collected
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| CockroachDB: CA certificate expiration date | CA certificate expires at that date. | DEPENDENT | - | cockroachdb.cert.expire_date.ca |
| CockroachDB: Node certificate expiration date | Node certificate expires at that date. | DEPENDENT | - | cockroachdb.cert.expire_date.node |
| CockroachDB: Clock offset | Mean clock offset of the node against the rest of the cluster. | DEPENDENT | - | cockroachdb.clock.offset |
| CockroachDB: CPU: System time | System CPU time. | DEPENDENT | - | cockroachdb.cpu.system_time |
| CockroachDB: CPU: User time | User CPU time. | DEPENDENT | - | cockroachdb.cpu.user_time |
| CockroachDB: CPU: Utilization | CPU utilization in %. | DEPENDENT | - | cockroachdb.cpu.util |
| CockroachDB: File descriptors: Limit | Open file descriptors soft limit of the process. | DEPENDENT | - | cockroachdb.descriptors.limit |
| CockroachDB: File descriptors: Open | The number of open file descriptors. | DEPENDENT | - | cockroachdb.descriptors.open |
| CockroachDB: Disk: IOPS in progress, rate | Number of disk IO operations currently in progress on this host. | DEPENDENT | - | cockroachdb.disk.iops.in_progress.rate |
| CockroachDB: Disk: Read IOPS, rate | Number of disk read operations per second across all disks since this process started. | DEPENDENT | - | cockroachdb.disk.iops.read.rate |
| CockroachDB: Disk: Write IOPS, rate | Number of disk write operations per second across all disks since this process started. | DEPENDENT | - | cockroachdb.disk.iops.write.rate |
| CockroachDB: Disk: Reads, rate | Bytes read from all disks per second since this process started. | DEPENDENT | - | cockroachdb.disk.read.rate |
| CockroachDB: Disk: Writes, rate | Bytes written to all disks per second since this process started. | DEPENDENT | - | cockroachdb.disk.write.rate |
| CockroachDB: GC: Pause time | The amount of processor time used by Go's garbage collector across all nodes. During garbage collection, application code execution is paused. | DEPENDENT | - | cockroachdb.gc.pause_time |
| CockroachDB: GC: Runs, rate | The number of times that Go's garbage collector was invoked per second across all nodes. | DEPENDENT | - | cockroachdb.gc.runs.rate |
| CockroachDB: Get health | Get the node's /health endpoint. | HTTP_AGENT | - | cockroachdb.get_health |
| CockroachDB: Get metrics | Get raw metrics from the Prometheus endpoint. | HTTP_AGENT | - | cockroachdb.get_metrics |
| CockroachDB: Get readiness | Get the node's /health?ready=1 endpoint. | HTTP_AGENT | - | cockroachdb.get_readiness |
| CockroachDB: Go: Goroutines count | Current number of Goroutines. This count should rise and fall based on load. | DEPENDENT | - | cockroachdb.go.goroutines.count |
| CockroachDB: Liveness heartbeats, rate | Number of successful node liveness heartbeats per second from this node. | DEPENDENT | - | cockroachdb.heartbeaths.success.rate |
| CockroachDB: KV transactions: Aborted, rate | Number of aborted KV transactions per second. | DEPENDENT | - | cockroachdb.kv.transactions.aborted.rate |
| CockroachDB: KV transactions: Committed, rate | Number of KV transactions (including 1PC) committed per second. | DEPENDENT | - | cockroachdb.kv.transactions.committed.rate |
| CockroachDB: Live nodes count | The number of live nodes in the cluster (will be 0 if this node is not itself live). | DEPENDENT | - | cockroachdb.live_count |
| CockroachDB: Memory: Allocated by Cgo | Current bytes of memory allocated by the C layer. | DEPENDENT | - | cockroachdb.memory.cgo.allocated |
| CockroachDB: Memory: Managed by Cgo | Total bytes of memory managed by the C layer. | DEPENDENT | - | cockroachdb.memory.cgo.managed |
| CockroachDB: Memory: Allocated by Go | Current bytes of memory allocated by the Go layer. | DEPENDENT | - | cockroachdb.memory.go.allocated |
| CockroachDB: Memory: Managed by Go | Total bytes of memory managed by the Go layer. | DEPENDENT | - | cockroachdb.memory.go.managed |
| CockroachDB: Memory: Allocated by SQL | Current SQL statement memory usage for root. | DEPENDENT | - | cockroachdb.memory.sql |
| CockroachDB: Memory: Total usage | Resident set size (RSS) of memory in use by the node. | DEPENDENT | - | cockroachdb.memory.total |
| CockroachDB: Network: Bytes received, rate | Bytes received per second on all network interfaces since this process started. | DEPENDENT | - | cockroachdb.network.bytes.received.rate |
| CockroachDB: Network: Bytes sent, rate | Bytes sent per second on all network interfaces since this process started. | DEPENDENT | - | cockroachdb.network.bytes.sent.rate |
| CockroachDB: Slow requests: DistSender RPCs | Number of RPCs stuck or retrying for a long time. | DEPENDENT | - | cockroachdb.slow_requests.rpc |
| CockroachDB: SQL: Bytes received, rate | Total amount of incoming SQL client network traffic in bytes per second. | DEPENDENT | - | cockroachdb.sql.bytes.received.rate |
| CockroachDB: SQL: Bytes sent, rate | Total amount of outgoing SQL client network traffic in bytes per second. | DEPENDENT | - | cockroachdb.sql.bytes.sent.rate |
| CockroachDB: SQL: Schema changes, rate | Total number of SQL DDL statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.schema_changes.rate |
| CockroachDB: SQL sessions: Open | Total number of open SQL sessions. | DEPENDENT | - | cockroachdb.sql.sessions |
| CockroachDB: SQL statements: Active | Total number of SQL statements currently active. | DEPENDENT | - | cockroachdb.sql.statements.active |
| CockroachDB: SQL statements: Contention, rate | Total number of SQL statements that experienced contention per second. | DEPENDENT | - | cockroachdb.sql.statements.contention.rate |
| CockroachDB: SQL statements: DELETE, rate | A moving average of the number of DELETE statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.statements.delete.rate |
| CockroachDB: SQL statements: Denials, rate | The number of statements denied per second by a feature flag. | DEPENDENT | - | cockroachdb.sql.statements.denials.rate |
| CockroachDB: SQL statements: Errors, rate | Total number of statements which returned a planning or runtime error per second. | DEPENDENT | - | cockroachdb.sql.statements.errors.rate |
| CockroachDB: SQL statements: Executed, rate | Number of SQL queries executed per second. | DEPENDENT | - | cockroachdb.sql.statements.executed.rate |
| CockroachDB: SQL statements: Active flows distributed, rate | The number of distributed SQL flows currently active per second. | DEPENDENT | - | cockroachdb.sql.statements.flows.active.rate |
| CockroachDB: SQL statements: INSERT, rate | A moving average of the number of INSERT statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.statements.insert.rate |
| CockroachDB: SQL statements: SELECT, rate | A moving average of the number of SELECT statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.statements.select.rate |
| CockroachDB: SQL statements: UPDATE, rate | A moving average of the number of UPDATE statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.statements.update.rate |
| CockroachDB: SQL transactions: Aborted, rate | Total number of SQL transaction abort errors per second. | DEPENDENT | - | cockroachdb.sql.transactions.aborted.rate |
| CockroachDB: SQL transactions: Committed, rate | Total number of SQL transaction COMMIT statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.transactions.committed.rate |
| CockroachDB: SQL transactions: Initiated, rate | Total number of SQL transaction BEGIN statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.transactions.initiated.rate |
| CockroachDB: SQL transactions: Open | Total number of currently open SQL transactions. | DEPENDENT | - | cockroachdb.sql.transactions.open |
| CockroachDB: SQL transactions: Rolled back, rate | Total number of SQL transaction ROLLBACK statements successfully executed per second. | DEPENDENT | - | cockroachdb.sql.transactions.rollbacks.rate |
| CockroachDB: Time series: Sample errors, rate | The number of errors encountered while attempting to write metrics to disk, per second. | DEPENDENT | - | cockroachdb.ts.samples.errors.rate |
| CockroachDB: Time series: Samples written, rate | The number of successfully written metric samples per second. | DEPENDENT | - | cockroachdb.ts.samples.written.rate |
| CockroachDB: Uptime | Process uptime. | DEPENDENT | - | cockroachdb.uptime |
| CockroachDB: Version | Build information. | DEPENDENT | - | cockroachdb.version |
| CockroachDB: Service ping | Check if HTTP/HTTPS service accepts TCP connections. | SIMPLE | - | net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"] |
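
Most items above are DEPENDENT: they take their value from the raw text collected by `cockroachdb.get_metrics` rather than polling the node again. The sketch below illustrates the idea for a `rate` item. It is a simplification under stated assumptions: the parser ignores optional trailing timestamps, the metric name and values in the sample are made up, and the per-second calculation only imitates what a "change per second" preprocessing step would do (the README does not list the template's actual preprocessing steps).

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into {"name{labels}": value}.

    Simplification: assumes no timestamp follows the value.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        key, _, value = line.rpartition(" ")
        samples[key] = float(value)
    return samples


def per_second_rate(prev_value, prev_ts, value, ts):
    """Counter delta divided by elapsed seconds between two polls."""
    if ts <= prev_ts or value < prev_value:
        return None  # no elapsed time, or the counter was reset by a restart
    return (value - prev_value) / (ts - prev_ts)


# Made-up samples from two polls taken 60 seconds apart.
SAMPLE_T0 = "# TYPE sql_query_count counter\nsql_query_count 1200\n"
SAMPLE_T1 = "# TYPE sql_query_count counter\nsql_query_count 1500\n"

first = parse_metrics(SAMPLE_T0)["sql_query_count"]
second = parse_metrics(SAMPLE_T1)["sql_query_count"]
print(per_second_rate(first, 0, second, 60))  # (1500 - 1200) / 60 -> 5.0
```

Returning `None` on a counter reset keeps a node restart from showing up as a large negative rate.
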
Triggers
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| CockroachDB: CA certificate expires soon | CA certificate expires soon. | (last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.ca) - now()) / 86400 < {$COCKROACHDB.CERT.CA.EXPIRY.WARN} | WARNING 📢 | CockroachDB: CA certificate expiration date |
| CockroachDB: Node certificate expires soon | Node certificate expires soon. | (last(/CockroachDB by HTTP/cockroachdb.cert.expire_date.node) - now()) / 86400 < {$COCKROACHDB.CERT.NODE.EXPIRY.WARN} | WARNING 📢 | CockroachDB: Node certificate expiration date |
| CockroachDB: Clock offset is too high | The clock offset measured by CockroachDB is nearing the limit (by default, a node shuts itself down when its offset reaches 400 ms from the mean). | min(/CockroachDB by HTTP/cockroachdb.clock.offset,5m) > {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} * 0.001 | WARNING 📢 | CockroachDB: Clock offset |
| CockroachDB: Node is unhealthy | The node's /health endpoint has returned HTTP 500 Internal Server Error, which indicates that the node is unhealthy. | last(/CockroachDB by HTTP/cockroachdb.get_health) = 500 | AVERAGE ⚠ | CockroachDB: Get health |
| CockroachDB: SQL statements errors rate is too high | - | min(/CockroachDB by HTTP/cockroachdb.sql.statements.errors.rate,5m) > {$COCKROACHDB.STATEMENTS.ERRORS.MAX.WARN} | WARNING 📢 | CockroachDB: SQL statements: Errors, rate |
| CockroachDB: Failed to fetch node data | Zabbix has not received data for items for the last 5 minutes. | nodata(/CockroachDB by HTTP/cockroachdb.uptime,5m) = 1 | WARNING 📢 | CockroachDB: Uptime |
| CockroachDB: Node has been restarted | Uptime is less than 10 minutes. | last(/CockroachDB by HTTP/cockroachdb.uptime) < 10m | INFO 🔔 | CockroachDB: Uptime |
| CockroachDB: Version has changed | - | last(/CockroachDB by HTTP/cockroachdb.version) <> last(/CockroachDB by HTTP/cockroachdb.version,#2) and length(last(/CockroachDB by HTTP/cockroachdb.version)) > 0 | INFO 🔔 | CockroachDB: Version |
| CockroachDB: Service is down | - | last(/CockroachDB by HTTP/net.tcp.service["{$COCKROACHDB.API.SCHEME}","{HOST.CONN}","{$COCKROACHDB.API.PORT}"]) = 0 | AVERAGE ⚠ | CockroachDB: Service ping |
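
The certificate and clock offset expressions carry implicit unit conversions: dividing by 86400 turns seconds into days, and multiplying the macro by 0.001 turns milliseconds into seconds. The sketch below restates both expressions in Python to make the units explicit. The function names are hypothetical, and it assumes the expiration items hold Unix timestamps and the clock offset item is stored in seconds, which is what the expressions suggest.

```python
import time

SECONDS_PER_DAY = 86400


def cert_expires_soon(expire_ts, warn_days, now=None):
    """(last(expire_date) - now()) / 86400 < {$...EXPIRY.WARN}, in days."""
    now = time.time() if now is None else now
    return (expire_ts - now) / SECONDS_PER_DAY < warn_days


def clock_offset_too_high(offsets_5m, warn_ms=300):
    """min(offset, 5m) > {$COCKROACHDB.CLOCK.OFFSET.MAX.WARN} * 0.001.

    Using min() means every sample in the window must exceed the threshold.
    """
    return min(offsets_5m) > warn_ms * 0.001


# A certificate with 45 days left, checked against both default thresholds.
now = 1_700_000_000
expires = now + 45 * SECONDS_PER_DAY
print(cert_expires_soon(expires, 90, now))  # CA threshold: 45 < 90
print(cert_expires_soon(expires, 30, now))  # node threshold: 45 < 30
print(clock_offset_too_high([0.31, 0.35, 0.32]))
```
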
Discovery rule №1
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| Storage metrics discovery | Discover per store metrics. | DEPENDENT | 0 | cockroachdb.store.discovery |
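
Because the discovery rule is DEPENDENT, it presumably derives the list of stores from the already collected metrics text. The sketch below shows one way this could work: collect the distinct values of a `store` label and emit them as low-level discovery JSON with the `{#STORE}` macro. The label name, the sample metric lines, and the function name `discover_stores` are assumptions for illustration, not the template's actual preprocessing.

```python
import json
import re

# Matches a store="..." label inside the {...} label set of a metric line.
STORE_LABEL = re.compile(r'\{[^}]*\bstore="([^"]+)"')


def discover_stores(metrics_text):
    """Return LLD JSON listing each distinct store ID as {#STORE}."""
    stores = set()
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comments
        match = STORE_LABEL.search(line)
        if match:
            stores.add(match.group(1))
    return json.dumps([{"{#STORE}": store} for store in sorted(stores)])


# Made-up sample: two stores, plus one metric without a store label.
SAMPLE = """capacity{store="1"} 1e+11
capacity{store="2"} 2e+11
ranges{store="1"} 42
sql_conns 3
"""

print(discover_stores(SAMPLE))
```

Each discovered `{#STORE}` value then yields one set of the item and trigger prototypes listed below.
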
Item prototypes
| Name | Description | Type | Interval | Key and additional info |
|---|---|---|---|---|
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Consistency, rate | Number of replicas which failed processing in the consistency checker queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.consistency.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: GC, rate | Number of replicas which failed processing in the GC queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.gc.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Replica GC, rate | Number of replicas which failed processing in the replica GC queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.gc_replica.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft log, rate | Number of replicas which failed processing in the Raft log queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.raftlog.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Raft snapshot, rate | Number of replicas which failed processing in the Raft repair queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.raftsnapshot.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Replicate, rate | Number of replicas which failed processing in the replicate queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.replicate.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Split, rate | Number of replicas which failed processing in the split queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.split.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Queue processing failures: Time series maintenance, rate | Number of replicas which failed processing in the time series maintenance queue per second. | DEPENDENT | - | cockroachdb.queue.processing_failures.tsmaintenance.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Ranges count | Number of ranges. | DEPENDENT | - | cockroachdb.ranges.[{#STORE},count] |
| CockroachDB: Storage [{#STORE}]: Ranges unavailable | Number of ranges with fewer live replicas than needed for quorum. | DEPENDENT | - | cockroachdb.ranges.[{#STORE},unavailable] |
| CockroachDB: Storage [{#STORE}]: Ranges underreplicated | Number of ranges with fewer live replicas than the replication target. | DEPENDENT | - | cockroachdb.ranges.[{#STORE},underreplicated] |
| CockroachDB: Storage [{#STORE}]: Rebalancing: Average queries, rate | Number of kv-level requests received per second by the store, averaged over a large time period as used in rebalancing decisions. | DEPENDENT | - | cockroachdb.rebalancing.queries.average.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Rebalancing: Average writes, rate | Number of keys written (i.e. applied by raft) per second to the store, averaged over a large time period as used in rebalancing decisions. | DEPENDENT | - | cockroachdb.rebalancing.writes.average.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: Replication: Replicas | Number of replicas. | DEPENDENT | - | cockroachdb.replication.replicas.[{#STORE},count] |
| CockroachDB: Storage [{#STORE}]: Replication: Replicas quiesced | Number of quiesced replicas. | DEPENDENT | - | cockroachdb.replication.replicas.[{#STORE},quiesced] |
| CockroachDB: Storage [{#STORE}]: Replication: Lease holders | Number of lease holders. | DEPENDENT | - | cockroachdb.replication.[{#STORE},lease_holders] |
| CockroachDB: Storage [{#STORE}]: RocksDB cache hits, rate | Count of block cache hits per second. | DEPENDENT | - | cockroachdb.rocksdb.cache.hits.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: RocksDB cache misses, rate | Count of block cache misses per second. | DEPENDENT | - | cockroachdb.rocksdb.cache.misses.[{#STORE},rate] |
| CockroachDB: Storage [{#STORE}]: RocksDB cache hit ratio | Block cache hit ratio in %. | CALCULATED | - | cockroachdb.rocksdb.cache.[{#STORE},hit_ratio] |
| CockroachDB: Storage [{#STORE}]: RocksDB read amplification | The average number of real read operations executed per logical read operation. | DEPENDENT | - | cockroachdb.rocksdb.[{#STORE},read_amp] |
| CockroachDB: Storage [{#STORE}]: RocksDB SSTables | The number of SSTables in use. | DEPENDENT | - | cockroachdb.rocksdb.[{#STORE},sstables] |
| CockroachDB: Storage [{#STORE}]: Slow requests: Latch acquisitions | Number of requests that have been stuck for a long time acquiring latches. | DEPENDENT | - | cockroachdb.slow_requests.[{#STORE},latch_acquisitions] |
| CockroachDB: Storage [{#STORE}]: Slow requests: Lease acquisitions | Number of requests that have been stuck for a long time acquiring a lease. | DEPENDENT | - | cockroachdb.slow_requests.[{#STORE},lease_acquisitions] |
| CockroachDB: Storage [{#STORE}]: Slow requests: Raft proposals | Number of requests that have been stuck for a long time in raft. | DEPENDENT | - | cockroachdb.slow_requests.[{#STORE},raft_proposals] |
| CockroachDB: Storage [{#STORE}]: Bytes: Live | Number of logical bytes stored in live key-value pairs on this node. Live data excludes historical and deleted data. | DEPENDENT | - | cockroachdb.storage.bytes.[{#STORE},live] |
| CockroachDB: Storage [{#STORE}]: Bytes: Logical | Number of logical bytes stored in key-value pairs on this node. This includes historical and deleted data. | DEPENDENT | - | cockroachdb.storage.bytes.[{#STORE},logical] |
| CockroachDB: Storage [{#STORE}]: Bytes: System | Number of physical bytes stored in system key-value pairs. | DEPENDENT | - | cockroachdb.storage.bytes.[{#STORE},system] |
| CockroachDB: Storage [{#STORE}]: Capacity available | Available storage capacity. | DEPENDENT | - | cockroachdb.storage.capacity.[{#STORE},available] |
| CockroachDB: Storage [{#STORE}]: Capacity available in % | Available storage capacity in %. | CALCULATED | - | cockroachdb.storage.capacity.[{#STORE},available_percent] |
| CockroachDB: Storage [{#STORE}]: Capacity total | Total storage capacity. This value may be explicitly set using --store. If a store size has not been set, this metric displays the actual disk capacity. | DEPENDENT | - | cockroachdb.storage.capacity.[{#STORE},total] |
| CockroachDB: Storage [{#STORE}]: Capacity used | Disk space in use by CockroachDB data on this node. This excludes the Cockroach binary, operating system, and other system files. | DEPENDENT | - | cockroachdb.storage.capacity.[{#STORE},used] |
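
The two CALCULATED prototypes are derived from other items of the same store. The README does not give their formulas, so the sketch below assumes the conventional definitions: hit ratio as hits over total cache lookups, and available percent as available over total capacity. Returning 0 when the denominator is zero is also an assumption.

```python
def cache_hit_ratio(hits_rate, misses_rate):
    """Block cache hit ratio in %, from the hits and misses rate items."""
    total = hits_rate + misses_rate
    if total == 0:
        return 0.0  # assumption: report 0% when there were no lookups
    return hits_rate / total * 100


def available_percent(available_bytes, total_bytes):
    """Available storage capacity in %, from the available and total items."""
    if total_bytes == 0:
        return 0.0  # assumption: avoid division by zero before data arrives
    return available_bytes / total_bytes * 100


GIB = 2**30
print(cache_hit_ratio(75, 25))                 # 75 hits of 100 lookups
print(available_percent(25 * GIB, 100 * GIB))  # 25 GiB free of 100 GiB
```
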
Trigger prototypes
| Name | Description | Expression | Priority | Dependencies |
|---|---|---|---|---|
| CockroachDB: Storage [{#STORE}]: Available storage capacity is critically low | Storage is running critically low on free space (less than {$COCKROACHDB.STORE.USED.MIN.CRIT}% available). | max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.CRIT} | AVERAGE ⚠ | CockroachDB: Storage [{#STORE}]: Capacity available in % |
| CockroachDB: Storage [{#STORE}]: Available storage capacity is low | Storage is running low on free space (less than {$COCKROACHDB.STORE.USED.MIN.WARN}% available). | max(/CockroachDB by HTTP/cockroachdb.storage.capacity.[{#STORE},available_percent],5m) < {$COCKROACHDB.STORE.USED.MIN.WARN} | WARNING 📢 | CockroachDB: Storage [{#STORE}]: Capacity available in % |
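
Both trigger prototypes compare the maximum of the last 5 minutes against a threshold, so a single healthy sample in the window keeps them from firing. The sketch below restates that logic with the default macro values. The function name `classify_store` is hypothetical, and reporting only the higher severity is a simplification, since both expressions are true when the value is below the critical threshold.

```python
def classify_store(available_pct_5m, crit=10, warn=20):
    """Evaluate max(available_percent, 5m) against the CRIT and WARN macros.

    Defaults mirror {$COCKROACHDB.STORE.USED.MIN.CRIT} and
    {$COCKROACHDB.STORE.USED.MIN.WARN}.
    """
    peak = max(available_pct_5m)
    if peak < crit:
        return "AVERAGE"  # critically low: every sample is below crit
    if peak < warn:
        return "WARNING"  # low: every sample is below warn
    return "OK"


print(classify_store([8, 9, 7]))     # all samples below 10%
print(classify_store([15, 18, 12]))  # all samples below 20%
print(classify_store([8, 25, 9]))    # one sample at 25% suppresses both
```
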