Datadog-Kubernetes State Integration

Overview

Get metrics from the kubernetes_state service in real time to:

  • Visualize and monitor kubernetes_state states.
  • Be notified about kubernetes_state failovers and events.

Setup

Installation

Install the dd-check-kubernetes_state package manually or with your favorite configuration manager.

Configuration

Edit the kubernetes_state.yaml file to point to your server and port, and set the masters to monitor.

Validation

When you run datadog-agent info you should see something like the following:

Checks
======

    kubernetes_state
    -----------
      - instance #0 [OK]
      - Collected 39 metrics, 0 events & 7 service checks
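
To automate this validation, the output can be scanned for the kubernetes_state block. The following is a minimal sketch, assuming the text printed by datadog-agent info follows the layout of the sample above. The function name check_instance_statuses and the sample string are hypothetical and not part of the Agent.

```python
import re

def check_instance_statuses(info_output, check_name="kubernetes_state"):
    """Return the instance statuses listed under check_name in the info output."""
    statuses = []
    in_block = False
    for raw in info_output.splitlines():
        line = raw.strip()
        if line == check_name:
            # Found the header line of the check we are interested in.
            in_block = True
            continue
        if in_block:
            match = re.match(r"- instance #\d+ \[(\w+)\]", line)
            if match:
                statuses.append(match.group(1))
            elif line and not line.startswith("-"):
                # A non-blank line that is neither an underline nor a list
                # item is assumed to start the next check's block.
                break
    return statuses

# Sample text copied from the expected output shown above.
sample = """\
Checks
======

    kubernetes_state
    -----------
      - instance #0 [OK]
      - Collected 39 metrics, 0 events & 7 service checks
"""

print(check_instance_statuses(sample))  # ['OK']
```

An empty list means no kubernetes_state block was found, which usually indicates the check is not configured.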

Compatibility

The kubernetes_state check is compatible with all major platforms.

Data Collected

Metrics

kubernetes_state.container.ready (gauge)
    Whether the container's readiness check succeeded.
kubernetes_state.container.running (gauge)
    Whether the container is currently in running state.
kubernetes_state.container.terminated (gauge)
    Whether the container is currently in terminated state.
kubernetes_state.container.status_report.count.terminated (count)
    Count of the containers currently reporting a terminated state, with the reason as a tag.
kubernetes_state.container.waiting (gauge)
    Whether the container is currently in waiting state.
kubernetes_state.container.status_report.count.waiting (count)
    Count of the containers currently reporting a waiting state, with the reason as a tag.
kubernetes_state.container.gpu.request (gauge)
kubernetes_state.container.gpu.limit (gauge)
kubernetes_state.container.restarts (gauge)
    The number of restarts per container.
kubernetes_state.container.cpu_requested (gauge)
    The number of CPU cores requested by a container. Shown as cpu.
kubernetes_state.container.memory_requested (gauge)
    The number of memory bytes requested by a container. Shown as byte.
kubernetes_state.container.cpu_limit (gauge)
    The limit on CPU cores to be used by a container. Shown as cpu.
kubernetes_state.container.memory_limit (gauge)
    The limit on memory to be used by a container. Shown as byte.
kubernetes_state.daemonset.scheduled (gauge)
    The number of nodes that are running at least one daemon pod and are supposed to.
kubernetes_state.daemonset.misscheduled (gauge)
    The number of nodes that are running a daemon pod but are not supposed to.
kubernetes_state.daemonset.desired (gauge)
    The number of nodes that should be running the daemon pod.
kubernetes_state.daemonset.ready (gauge)
    The number of nodes that should be running the daemon pod and have one or more daemon pods running and ready.
kubernetes_state.deployment.replicas (gauge)
    The number of replicas per deployment.
kubernetes_state.deployment.replicas_available (gauge)
    The number of available replicas per deployment.
kubernetes_state.deployment.replicas_unavailable (gauge)
    The number of unavailable replicas per deployment.
kubernetes_state.deployment.replicas_updated (gauge)
    The number of updated replicas per deployment.
kubernetes_state.deployment.replicas_desired (gauge)
    The number of desired replicas per deployment.
kubernetes_state.deployment.paused (gauge)
    Whether a deployment is paused.
kubernetes_state.deployment.rollingupdate.max_unavailable (gauge)
    Maximum number of unavailable replicas during a rolling update.
kubernetes_state.job.status.failed (counter)
    Observed number of failed pods in a job.
kubernetes_state.job.status.succeeded (counter)
    Observed number of succeeded pods in a job.
kubernetes_state.limitrange.cpu.min (gauge)
    Minimum CPU request for this type.
kubernetes_state.limitrange.cpu.max (gauge)
    Maximum CPU limit for this type.
kubernetes_state.limitrange.cpu.default (gauge)
    Default CPU limit if not specified.
kubernetes_state.limitrange.cpu.default_request (gauge)
    Default CPU request if not specified.
kubernetes_state.limitrange.cpu.max_limit_request_ratio (gauge)
    Maximum CPU limit / request ratio.
kubernetes_state.limitrange.memory.min (gauge)
    Minimum memory request for this type.
kubernetes_state.limitrange.memory.max (gauge)
    Maximum memory limit for this type.
kubernetes_state.limitrange.memory.default (gauge)
    Default memory limit if not specified.
kubernetes_state.limitrange.memory.default_request (gauge)
    Default memory request if not specified.
kubernetes_state.limitrange.memory.max_limit_request_ratio (gauge)
    Maximum memory limit / request ratio.
kubernetes_state.node.cpu_capacity (gauge)
    The total CPU resources of the node. Shown as cpu.
kubernetes_state.node.memory_capacity (gauge)
    The total memory resources of the node. Shown as byte.
kubernetes_state.node.pods_capacity (gauge)
    The total pod resources of the node.
kubernetes_state.node.gpu.cards_allocatable (gauge)
kubernetes_state.node.gpu.cards_capacity (gauge)
kubernetes_state.persistentvolumeclaim.status (gauge)
kubernetes_state.node.cpu_allocatable (gauge)
    The CPU resources of a node that are available for scheduling. Shown as cpu.
kubernetes_state.node.memory_allocatable (gauge)
    The memory resources of a node that are available for scheduling. Shown as byte.
kubernetes_state.node.pods_allocatable (gauge)
    The pod resources of a node that are available for scheduling.
kubernetes_state.node.status (gauge)
    Submitted with a value of 1 for each node and tagged either 'status:schedulable' or 'status:unschedulable'. Sum this metric by status to get the number of nodes in each status.
kubernetes_state.hpa.min_replicas (gauge)
    Lower limit for the number of pods that can be set by the autoscaler.
kubernetes_state.hpa.max_replicas (gauge)
    Upper limit for the number of pods that can be set by the autoscaler.
kubernetes_state.hpa.target_cpu (gauge)
    Target CPU percentage of pods managed by this autoscaler.
kubernetes_state.hpa.desired_replicas (gauge)
    Desired number of replicas of pods managed by this autoscaler.
kubernetes_state.pod.ready (gauge)
    Whether the pod is ready to serve requests.
kubernetes_state.pod.scheduled (gauge)
    Reports the status of the scheduling process for the pod with its tags.
kubernetes_state.replicaset.replicas (gauge)
    The number of replicas per ReplicaSet.
kubernetes_state.replicaset.fully_labeled_replicas (gauge)
    The number of fully labeled replicas per ReplicaSet.
kubernetes_state.replicaset.replicas_ready (gauge)
    The number of ready replicas per ReplicaSet.
kubernetes_state.replicaset.replicas_desired (gauge)
    Number of desired pods for a ReplicaSet.
kubernetes_state.replicationcontroller.replicas (gauge)
    The number of replicas per ReplicationController.
kubernetes_state.replicationcontroller.fully_labeled_replicas (gauge)
    The number of fully labeled replicas per ReplicationController.
kubernetes_state.replicationcontroller.replicas_ready (gauge)
    The number of ready replicas per ReplicationController.
kubernetes_state.replicationcontroller.replicas_desired (gauge)
    Number of desired replicas for a ReplicationController.
kubernetes_state.replicationcontroller.replicas_available (gauge)
    The number of available replicas per ReplicationController.
kubernetes_state.resourcequota.pods.used (gauge)
    Observed number of pods used for a resource quota.
kubernetes_state.resourcequota.services.used (gauge)
    Observed number of services used for a resource quota.
kubernetes_state.resourcequota.persistentvolumeclaims.used (gauge)
    Observed number of persistent volume claims used for a resource quota.
kubernetes_state.resourcequota.services.nodeports.used (gauge)
    Observed number of node ports used for a resource quota.
kubernetes_state.resourcequota.services.loadbalancers.used (gauge)
    Observed number of loadbalancers used for a resource quota.
kubernetes_state.resourcequota.requests.cpu.used (gauge)
    Observed sum of CPU cores requested for a resource quota. Shown as cpu.
kubernetes_state.resourcequota.requests.memory.used (gauge)
    Observed sum of memory bytes requested for a resource quota. Shown as byte.
kubernetes_state.resourcequota.requests.storage.used (gauge)
    Observed sum of storage bytes requested for a resource quota. Shown as byte.
kubernetes_state.resourcequota.limits.cpu.used (gauge)
    Observed sum of limits for CPU cores for a resource quota. Shown as cpu.
kubernetes_state.resourcequota.limits.memory.used (gauge)
    Observed sum of limits for memory bytes for a resource quota. Shown as byte.
kubernetes_state.resourcequota.pods.limit (gauge)
    Hard limit of the number of pods for a resource quota.
kubernetes_state.resourcequota.services.limit (gauge)
    Hard limit of the number of services for a resource quota.
kubernetes_state.resourcequota.persistentvolumeclaims.limit (gauge)
    Hard limit of the number of persistent volume claims for a resource quota.
kubernetes_state.resourcequota.services.nodeports.limit (gauge)
    Hard limit of the number of node ports for a resource quota.
kubernetes_state.resourcequota.services.loadbalancers.limit (gauge)
    Hard limit of the number of loadbalancers for a resource quota.
kubernetes_state.resourcequota.requests.cpu.limit (gauge)
    Hard limit on the total of CPU cores requested for a resource quota. Shown as cpu.
kubernetes_state.resourcequota.requests.memory.limit (gauge)
    Hard limit on the total of memory bytes requested for a resource quota. Shown as byte.
kubernetes_state.resourcequota.requests.storage.limit (gauge)
    Hard limit on the total of storage bytes requested for a resource quota. Shown as byte.
kubernetes_state.resourcequota.limits.cpu.limit (gauge)
    Hard limit on the sum of CPU core limits for a resource quota. Shown as cpu.
kubernetes_state.resourcequota.limits.memory.limit (gauge)
    Hard limit on the sum of memory bytes limits for a resource quota. Shown as byte.
kubernetes_state.statefulset.replicas (gauge)
    The number of replicas per statefulset.
kubernetes_state.statefulset.replicas_desired (gauge)
    The number of desired replicas per statefulset.
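
The kubernetes_state.node.status entry is meant to be summed by its status tag. The following is a minimal sketch of that aggregation, assuming the points have already been fetched into (value, tags) pairs. The data shape, the node: tags, and the function name count_nodes_by_status are hypothetical and are not a Datadog API.

```python
from collections import Counter

def count_nodes_by_status(points):
    """Sum kubernetes_state.node.status values, grouped by the status tag."""
    totals = Counter()
    for value, tags in points:
        for tag in tags:
            if tag.startswith("status:"):
                # Each node reports a value of 1, so the sum is a node count.
                totals[tag.split(":", 1)[1]] += value
    return dict(totals)

# Hypothetical points: one per node, each submitted with a value of 1.
points = [
    (1, ["node:node-a", "status:schedulable"]),
    (1, ["node:node-b", "status:schedulable"]),
    (1, ["node:node-c", "status:unschedulable"]),
]

print(count_nodes_by_status(points))
# {'schedulable': 2, 'unschedulable': 1}
```

Because every node contributes exactly 1, a plain sum per tag gives the node count without any averaging.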

Events

The kubernetes_state check does not include any events at this time.

Service Checks

The kubernetes_state check does not include any service checks at this time.