
Kubernetes API server by HTTP

Macros used

Name | Value
{$KUBE.API.CERT.EXPIRATION} | 7
{$KUBE.API.HTTP.CLIENT.ERROR} | 2
{$KUBE.API.HTTP.SERVER.ERROR} | 2
{$KUBE.API.SERVER.URL} | http://localhost:8086/metrics
{$KUBE.API.TOKEN} | (empty)
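
As a rough illustration of how the URL and token macros could be combined into a metrics request, here is a minimal Python sketch. It assumes the token is sent as a bearer credential in the Authorization header, which this page does not state; the function names and the token value are hypothetical.

```python
import urllib.request

# Values mirror the template macros; the token is a made-up placeholder.
KUBE_API_SERVER_URL = "http://localhost:8086/metrics"  # {$KUBE.API.SERVER.URL}
KUBE_API_TOKEN = "example-token"                       # {$KUBE.API.TOKEN} (hypothetical value)


def build_metrics_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request; attach the token as a bearer credential (assumption)."""
    request = urllib.request.Request(url, method="GET")
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def fetch_metrics(url: str, token: str, timeout: float = 10.0) -> str:
    """Return the raw metrics text. Requires a reachable endpoint."""
    request = build_metrics_request(url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_metrics_request(KUBE_API_SERVER_URL, KUBE_API_TOKEN)
```

Only `build_metrics_request` runs without a live API server; `fetch_metrics` is included to show where the request would be sent.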


Items collected

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Request terminations, rate | Number of requests which apiserver terminated in self-defense per second. | DEPENDENT | - | kubernetes.api.apiserver_request_terminations
Kubernetes API: API server requests: 0 | Counter of apiserver requests broken out for each HTTP response code. | DEPENDENT | - | kubernetes.api.apiserver_request_total_0.rate
Kubernetes API: API server requests: 2xx, rate | Counter of apiserver requests broken out for each HTTP response code. | DEPENDENT | - | kubernetes.api.apiserver_request_total_200.rate
Kubernetes API: API server requests: 3xx, rate | Counter of apiserver requests broken out for each HTTP response code. | DEPENDENT | - | kubernetes.api.apiserver_request_total_300.rate
Kubernetes API: API server requests: 4xx, rate | Counter of apiserver requests broken out for each HTTP response code. | DEPENDENT | - | kubernetes.api.apiserver_request_total_400.rate
Kubernetes API: API server requests: 5xx, rate | Counter of apiserver requests broken out for each HTTP response code. | DEPENDENT | - | kubernetes.api.apiserver_request_total_500.rate
Kubernetes API: TLS handshake errors, rate | Number of requests dropped with 'TLS handshake error from' error per second. | DEPENDENT | - | kubernetes.api.apiserver_tls_handshake_errors_total.rate
Kubernetes API: Audit events, total | Accumulated number of audit events generated and sent to the audit backend. | DEPENDENT | - | kubernetes.api.audit_event_total
Kubernetes API: CPU | Total user and system CPU usage ratio. | DEPENDENT | - | kubernetes.api.cpu.util
Kubernetes API: Get API instance metrics | Get raw metrics from API instance /metrics endpoint. | HTTP_AGENT | - | kubernetes.api.get_metrics
Kubernetes API: Goroutines | Number of goroutines that currently exist. | DEPENDENT | - | kubernetes.api.go_goroutines
Kubernetes API: Go threads | Number of OS threads created. | DEPENDENT | - | kubernetes.api.go_threads
Kubernetes API: gRPCs messages received, rate | Total number of gRPC stream messages received per second. | DEPENDENT | - | kubernetes.api.grpc_client_msg_received.rate
Kubernetes API: gRPCs messages sent, rate | Total number of gRPC stream messages sent per second. | DEPENDENT | - | kubernetes.api.grpc_client_msg_sent.rate
Kubernetes API: gRPCs client started, rate | Total number of RPCs started per second. | DEPENDENT | - | kubernetes.api.grpc_client_started.rate
Kubernetes API: Fds max | Maximum allowed open file descriptors. | DEPENDENT | - | kubernetes.api.max_fds
Kubernetes API: Fds open | Number of open file descriptors. | DEPENDENT | - | kubernetes.api.open_fds
Kubernetes API: Resident memory, bytes | Resident memory size in bytes. | DEPENDENT | - | kubernetes.api.process_resident_memory_bytes
Kubernetes API: Virtual memory, bytes | Virtual memory size in bytes. | DEPENDENT | - | kubernetes.api.process_virtual_memory_bytes
Kubernetes API: HTTP requests: 2xx, rate | Number of HTTP requests with 2xx status code per second. | DEPENDENT | - | kubernetes.api.rest_client_requests_total_200.rate
Kubernetes API: HTTP requests: 3xx, rate | Number of HTTP requests with 3xx status code per second. | DEPENDENT | - | kubernetes.api.rest_client_requests_total_300.rate
Kubernetes API: HTTP requests: 4xx, rate | Number of HTTP requests with 4xx status code per second. | DEPENDENT | - | kubernetes.api.rest_client_requests_total_400.rate
Kubernetes API: HTTP requests: 5xx, rate | Number of HTTP requests with 5xx status code per second. | DEPENDENT | - | kubernetes.api.rest_client_requests_total_500.rate
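
The "rate" items above turn cumulative counters into per-second values grouped by HTTP status class. The following Python sketch shows one way such a value could be derived from two scrapes of the metrics text. The sample lines are invented, the parser is deliberately simplified (it assumes no `}` inside label values), and it is not the template's actual preprocessing.

```python
import re
from collections import defaultdict

# Two made-up scrapes taken 60 seconds apart.
SAMPLE_T0 = """\
# TYPE apiserver_request_total counter
apiserver_request_total{code="200",verb="GET"} 100
apiserver_request_total{code="201",verb="POST"} 20
apiserver_request_total{code="404",verb="GET"} 5
apiserver_request_total{code="500",verb="GET"} 2
"""
SAMPLE_T1 = """\
# TYPE apiserver_request_total counter
apiserver_request_total{code="200",verb="GET"} 220
apiserver_request_total{code="201",verb="POST"} 50
apiserver_request_total{code="404",verb="GET"} 5
apiserver_request_total{code="500",verb="GET"} 62
"""

LINE = re.compile(r'^apiserver_request_total\{([^}]*)\}\s+(\S+)$')
CODE = re.compile(r'code="(\d+)"')


def totals_by_class(metrics_text):
    """Sum counter values per status class, e.g. 200 and 201 both count as 2xx."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # comments, other metrics
        code = CODE.search(match.group(1))
        if code:
            totals[code.group(1)[0] + "xx"] += float(match.group(2))
    return dict(totals)


def per_second_rate(previous, current, elapsed_seconds):
    """Counter delta divided by the time between the two scrapes."""
    return {
        status_class: (value - previous.get(status_class, 0.0)) / elapsed_seconds
        for status_class, value in current.items()
    }


rates = per_second_rate(totals_by_class(SAMPLE_T0), totals_by_class(SAMPLE_T1), 60)
```

A production version would also need to handle counter resets, where the newer value is lower than the older one.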

Triggers

Name | Description | Expression | Priority | Dependencies
Kubernetes API: Too many server errors | Kubernetes API server is experiencing a high error rate (with 5xx HTTP code). | min(/Kubernetes API server by HTTP/kubernetes.api.apiserver_request_total_500.rate,5m)>{$KUBE.API.HTTP.SERVER.ERROR} | WARNING | Kubernetes API: API server requests: 5xx, rate
Kubernetes API: Too many client errors | Kubernetes API client is experiencing a high error rate (with 5xx HTTP code). | min(/Kubernetes API server by HTTP/kubernetes.api.rest_client_requests_total_500.rate,5m)>{$KUBE.API.HTTP.CLIENT.ERROR} | WARNING | Kubernetes API: HTTP requests: 5xx, rate
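
Both triggers compare the minimum of the 5xx rate over the last 5 minutes with a threshold macro, so they fire only when every sample in the window is above the threshold. A minimal Python sketch of that condition follows; the sample values are invented and the function name is hypothetical.

```python
KUBE_API_HTTP_SERVER_ERROR = 2  # {$KUBE.API.HTTP.SERVER.ERROR} default from the macros table


def min_exceeds(values_in_window, threshold):
    """Mimic min(item, 5m) > threshold: true only if the lowest sample is above it."""
    if not values_in_window:
        return False  # no data in the window, nothing to compare
    return min(values_in_window) > threshold


sustained = min_exceeds([2.5, 3.0, 4.1], KUBE_API_HTTP_SERVER_ERROR)  # every sample above 2
spike = min_exceeds([0.4, 9.0, 0.2], KUBE_API_HTTP_SERVER_ERROR)      # one short burst only
```

Using the minimum rather than the latest or average value means a single short burst of errors does not raise the problem.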

Discovery rule №1

Name | Description | Type | Interval | Key and additional info
Watchers metrics discovery | Discovery of watchers by kind. | DEPENDENT | 0 | kubernetes.api.apiserver_registered_watchers.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Watchers: {#KIND} | Number of currently registered watchers for a given resource. | DEPENDENT | - | kubernetes.api.apiserver_registered_watchers["{#KIND}"]
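
A discovery rule hands Zabbix a JSON array in which each object maps a low-level discovery macro such as {#KIND} to a value; one item is then created per object from the prototype. The Python sketch below shows how such an array could be built from metric labels. The sample metric lines are invented, and how the template performs this step internally is not shown on this page.

```python
import json
import re

# Made-up sample lines; real label sets may differ.
SAMPLE_METRICS = """\
apiserver_registered_watchers{group="",kind="Pod",version="v1"} 12
apiserver_registered_watchers{group="apps",kind="Deployment",version="v1"} 3
apiserver_registered_watchers{group="",kind="Service",version="v1"} 4
"""

KIND_LABEL = re.compile(r'^apiserver_registered_watchers\{[^}]*\bkind="([^"]*)"')


def discover_kinds(metrics_text):
    """Return one discovery row per distinct value of the kind label."""
    kinds = set()
    for line in metrics_text.splitlines():
        match = KIND_LABEL.match(line)
        if match:
            kinds.add(match.group(1))
    return [{"{#KIND}": kind} for kind in sorted(kinds)]


rows = discover_kinds(SAMPLE_METRICS)
lld_json = json.dumps(rows)
```

With these rows, the prototype key `kubernetes.api.apiserver_registered_watchers["{#KIND}"]` would expand to one item per kind, for example `kubernetes.api.apiserver_registered_watchers["Pod"]`.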

Discovery rule №2

Name | Description | Type | Interval | Key and additional info
Authentication requests discovery | Discovery of authentication attempts by name. | DEPENDENT | 0 | kubernetes.api.authenticated_user_requests.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Authenticated requests: {#NAME}, rate | Counter of authenticated requests broken out by username per second. | DEPENDENT | - | kubernetes.api.authenticated_user_requests.rate["{#NAME}"]

Discovery rule №3

Name | Description | Type | Interval | Key and additional info
Authentication attempts discovery | Discovery of authentication attempts by result. | DEPENDENT | 0 | kubernetes.api.authentication_attempts.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Authentication attempts: {#RESULT}, rate | Authentication attempts by result per second. | DEPENDENT | - | kubernetes.api.authentication_attempts.rate["{#RESULT}"]

Discovery rule №4

Name | Description | Type | Interval | Key and additional info
Client certificate expiration histogram | Discovery of raw data of client certificate expiration. | DEPENDENT | 0 | kubernetes.api.certificate_expiration.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Client certificate expiration, p1 | 1st percentile of the remaining lifetime on the certificate used to authenticate a request. | CALCULATED | - | kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]
Kubernetes API: Certificate expiration seconds bucket, {#LE} | Distribution of the remaining lifetime on the certificate used to authenticate a request. | DEPENDENT | - | kubernetes.api.client_certificate_expiration_seconds_bucket[{#LE}]

Trigger prototypes

Name | Description | Expression | Priority | Dependencies
Kubernetes API: Kubernetes client certificate expires soon | A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours. | last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) > 0 and last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) < 24*60*60 | WARNING | Kubernetes API: Client certificate expiration, p1
Kubernetes API: Kubernetes client certificate is expiring | A client certificate used to authenticate to the apiserver is expiring in {$KUBE.API.CERT.EXPIRATION} days. | last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) > 0 and last(/Kubernetes API server by HTTP/kubernetes.api.client_certificate_expiration_p1[{#SINGLETON}]) < {$KUBE.API.CERT.EXPIRATION}*24*60*60 | WARNING | Kubernetes API: Client certificate expiration, p1
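
Both trigger prototypes compare the p1 remaining lifetime, in seconds, with a threshold expressed as a number of days times 24*60*60 seconds. The Python sketch below works through that arithmetic with the default {$KUBE.API.CERT.EXPIRATION} of 7; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds


def certificate_alerts(p1_remaining_seconds, expiration_days=7):
    """Return (expires_soon, is_expiring) for a p1 remaining-lifetime value.

    Both conditions require a value above 0, as the trigger expressions do.
    """
    expires_soon = 0 < p1_remaining_seconds < SECONDS_PER_DAY
    is_expiring = 0 < p1_remaining_seconds < expiration_days * SECONDS_PER_DAY
    return expires_soon, is_expiring


one_hour_left = certificate_alerts(3600)
three_days_left = certificate_alerts(3 * SECONDS_PER_DAY)
thirty_days_left = certificate_alerts(30 * SECONDS_PER_DAY)
```

With the default of 7 days the second threshold is 604800 seconds, so a certificate with three days left satisfies only the "is expiring" condition.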

Discovery rule №5

Name | Description | Type | Interval | Key and additional info
Etcd objects metrics discovery | Discovery of etcd objects by resource. | DEPENDENT | 0 | kubernetes.api.etcd_object_counts.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: etcd objects: {#RESOURCE} | Number of stored objects at the time of last check split by kind. | DEPENDENT | - | kubernetes.api.etcd_object_counts["{#RESOURCE}"]

Discovery rule №6

Name | Description | Type | Interval | Key and additional info
gRPC completed requests discovery | Discovery of gRPC completed requests by gRPC code. | DEPENDENT | 0 | kubernetes.api.grpc_client_handled.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: gRPCs completed: {#GRPC_CODE}, rate | Total number of RPCs completed by the client regardless of success or failure per second. | DEPENDENT | - | kubernetes.api.grpc_client_handled_total.rate["{#GRPC_CODE}"]

Discovery rule №7

Name | Description | Type | Interval | Key and additional info
Requests inflight discovery | Discovery of requests inflight by kind. | DEPENDENT | 0 | kubernetes.api.inflight_requests.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Requests current: {#KIND} | Maximal number of currently used inflight request limit of this apiserver per request kind in last second. | DEPENDENT | - | kubernetes.api.current_inflight_requests["{#KIND}"]

Discovery rule №8

Name | Description | Type | Interval | Key and additional info
Long-running requests | Discovery of long-running requests by verb, resource and scope. | DEPENDENT | 0 | kubernetes.api.longrunning_gauge.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: Long-running ["{#VERB}"] requests ["{#RESOURCE}"]: {#SCOPE} | Gauge of all active long-running apiserver requests broken out by verb, resource and scope. Not all requests are tracked this way. | DEPENDENT | - | kubernetes.api.longrunning_gauge["{#RESOURCE}","{#SCOPE}","{#VERB}"]

Discovery rule №9

Name | Description | Type | Interval | Key and additional info
Request duration histogram | Discovery of raw data and percentile items of request duration. | DEPENDENT | 0 | kubernetes.api.requests_bucket.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: ["{#VERB}"] Requests bucket: {#LE} | Response latency distribution in seconds for each verb. | DEPENDENT | - | kubernetes.api.request_duration_seconds_bucket[{#LE},"{#VERB}"]
Kubernetes API: ["{#VERB}"] Requests, p50 | 50th percentile of response latency distribution in seconds for each verb. | CALCULATED | - | kubernetes.api.request_duration_seconds_p50["{#VERB}"]
Kubernetes API: ["{#VERB}"] Requests, p90 | 90th percentile of response latency distribution in seconds for each verb. | CALCULATED | - | kubernetes.api.request_duration_seconds_p90["{#VERB}"]
Kubernetes API: ["{#VERB}"] Requests, p95 | 95th percentile of response latency distribution in seconds for each verb. | CALCULATED | - | kubernetes.api.request_duration_seconds_p95["{#VERB}"]
Kubernetes API: ["{#VERB}"] Requests, p99 | 99th percentile of response latency distribution in seconds for each verb. | CALCULATED | - | kubernetes.api.request_duration_seconds_p99["{#VERB}"]
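
The percentile items are calculated from the cumulative bucket items, one bucket per {#LE} upper bound. The Python sketch below estimates a percentile from such buckets using linear interpolation inside the matching bucket, in the style of Prometheus `histogram_quantile`. The bucket values are invented, and the exact formula used by the calculated items is not shown on this page, so treat this as an approximation of the idea.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile (0..1) from (upper_bound, cumulative_count) pairs.

    The list must include the +Inf bucket, whose count is the total.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # fall back to the highest finite bound
            if count == prev_count:
                return bound  # empty bucket, nothing to interpolate
            share = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * share
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up cumulative counts for one verb: le=0.1, 0.5, 1.0 and +Inf.
GET_BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]

p50 = histogram_quantile(0.50, GET_BUCKETS)
p95 = histogram_quantile(0.95, GET_BUCKETS)
p99 = histogram_quantile(0.99, GET_BUCKETS)
```

The same approach applies to the client certificate expiration p1 item, with 0.01 as the quantile and the certificate expiration buckets as input.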

Discovery rule №10

Name | Description | Type | Interval | Key and additional info
Workqueue metrics discovery | Discovery of workqueue metrics by name. | DEPENDENT | 0 | kubernetes.api.workqueue.discovery

Item prototypes

Name | Description | Type | Interval | Key and additional info
Kubernetes API: ["{#NAME}"] Workqueue adds total, rate | Total number of adds handled by workqueue per second. | DEPENDENT | - | kubernetes.api.workqueue_adds_total.rate["{#NAME}"]
Kubernetes API: ["{#NAME}"] Workqueue depth | Current depth of workqueue. | DEPENDENT | - | kubernetes.api.workqueue_depth["{#NAME}"]