Get metrics from the Kubernetes service in real time.
The Kubernetes State Metrics Core check leverages kube-state-metrics version 2+ and includes major performance and tagging improvements compared to the legacy `kubernetes_state` check.
Unlike the legacy check, the Kubernetes State Metrics Core check does not require you to deploy `kube-state-metrics` in your cluster.
Kubernetes State Metrics Core provides a better alternative to the legacy `kubernetes_state` check as it offers more granular metrics and tags. See the Major Changes and Data Collected sections for more details.
The Kubernetes State Metrics Core check is included in the Datadog Cluster Agent image, so you don’t need to install anything else on your Kubernetes servers.
In your Helm `values.yaml`, add the following:

    ...
    datadog:
    ...
      kubeStateMetricsCore:
        enabled: true
    ...

To enable the `kubernetes_state_core` check, the setting `spec.features.kubeStateMetricsCore.enabled` must be set to `true` in the DatadogAgent resource:

    apiVersion: datadoghq.com/v1alpha1
    kind: DatadogAgent
    metadata:
      name: datadog
    spec:
      credentials:
        apiKey: <DATADOG_API_KEY>
        appKey: <DATADOG_APP_KEY>
      features:
        kubeStateMetricsCore:
          enabled: true
      # ...

Note: Datadog Operator v0.7.0 or greater is required.
In the original `kubernetes_state` check, several tags have been flagged as deprecated and replaced by new tags. To determine your migration path, check which tags are submitted with your metrics.
In the `kubernetes_state_core` check, only the non-deprecated tags are submitted. Before migrating from `kubernetes_state` to `kubernetes_state_core`, verify that only official tags are used in monitors and dashboards.
Here is the mapping between deprecated tags and the official tags that have replaced them:
deprecated tag | official tag |
---|---|
cluster_name | kube_cluster_name |
container | kube_container_name |
cronjob | kube_cronjob |
daemonset | kube_daemon_set |
deployment | kube_deployment |
image | image_name |
job | kube_job |
job_name | kube_job |
namespace | kube_namespace |
phase | pod_phase |
pod | pod_name |
replicaset | kube_replica_set |
replicationcontroller | kube_replication_controller |
statefulset | kube_stateful_set |
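
Before migrating, it can help to scan monitor and dashboard queries for deprecated tag keys. The following is a minimal Python sketch of that audit, not a Datadog tool: the function name `find_deprecated_tags` and the sample query strings are hypothetical, and the dictionary is copied from the table above. It only reports keys used as `key:value` filters and leaves the rewrite to you, because `phase` still appears as a tag on some non-pod metrics in the new check.

```python
import re

# Deprecated -> official tag keys, copied from the mapping table above.
DEPRECATED_TO_OFFICIAL = {
    "cluster_name": "kube_cluster_name",
    "container": "kube_container_name",
    "cronjob": "kube_cronjob",
    "daemonset": "kube_daemon_set",
    "deployment": "kube_deployment",
    "image": "image_name",
    "job": "kube_job",
    "job_name": "kube_job",
    "namespace": "kube_namespace",
    "phase": "pod_phase",
    "pod": "pod_name",
    "replicaset": "kube_replica_set",
    "replicationcontroller": "kube_replication_controller",
    "statefulset": "kube_stateful_set",
}

# Match a deprecated key only when it is a whole tag key followed by ':'.
# The lookbehind rejects keys that are the tail of a longer name
# (for example kube_namespace) or a segment of a dotted metric name.
_KEY_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(sorted(DEPRECATED_TO_OFFICIAL, key=len, reverse=True))
    + r")(?=\s*:)"
)


def find_deprecated_tags(query: str) -> dict:
    """Return {deprecated_key: suggested_official_key} for filters in a query."""
    return {key: DEPRECATED_TO_OFFICIAL[key] for key in _KEY_PATTERN.findall(query)}


if __name__ == "__main__":
    # Hypothetical query string, used only to exercise the function.
    sample = "avg:kubernetes_state.container.ready{namespace:default,pod:web-1}"
    for old, new in find_deprecated_tags(sample).items():
        print(f"{old} -> {new}")
```

Tag keys used in group-by clauses are not followed by a colon, so this sketch does not detect them; review those separately.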
The Kubernetes State Metrics Core check is not backward compatible. Read the following changes carefully before migrating from the legacy `kubernetes_state` check.
- `kubernetes_state.node.by_condition` replaces `kubernetes_state.nodes.by_condition`.
- `kubernetes_state.persistentvolume.by_phase` replaces `kubernetes_state.persistentvolumes.by_phase`.
- `kubernetes_state.pod.status_phase` is tagged with pod-level tags such as `pod_name`.
- `kubernetes_state.node.count` is not tagged with `host` anymore. It aggregates the nodes count by `kernel_version`, `os_image`, `container_runtime_version`, and `kubelet_version`.
- `kubernetes_state.container.waiting` and `kubernetes_state.container.status_report.count.waiting` no longer emit a 0 value if no pods are waiting; they only report non-zero values.
- `kube_job`: In `kubernetes_state`, the `kube_job` tag value is the `CronJob` name if the `Job` had `CronJob` as an owner, otherwise it is the `Job` name. In `kubernetes_state_core`, the `kube_job` tag value is always the `Job` name, and a new `kube_cronjob` tag key is added with the `CronJob` name as the tag value. When migrating to `kubernetes_state_core`, it’s recommended to use the new tag or `kube_job:foo*`, where `foo` is the `CronJob` name, for query filters.

Enabling `kubeStateMetricsCore` in your Helm `values.yaml` configures the Agent to ignore the auto configuration file for the legacy `kubernetes_state` check. The goal is to avoid running both checks simultaneously.
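
If both checks run side by side during a migration, the `kube_job` difference described above is the one most likely to surprise query filters. The sketch below models that difference in Python. It is a simplified illustration of the documented behavior, not the Agent's implementation; the function name `job_tags` and the Job name `foo-1611319500` are hypothetical.

```python
from typing import List, Optional


def job_tags(job_name: str, cronjob_owner: Optional[str], legacy: bool) -> List[str]:
    """Return the job-related tags for one Job under either check."""
    if legacy:
        # kubernetes_state: kube_job carries the CronJob name when the Job
        # is owned by a CronJob, otherwise the Job name.
        return [f"kube_job:{cronjob_owner or job_name}"]
    # kubernetes_state_core: kube_job is always the Job name...
    tags = [f"kube_job:{job_name}"]
    if cronjob_owner:
        # ...and the CronJob name is reported in its own kube_cronjob tag.
        tags.append(f"kube_cronjob:{cronjob_owner}")
    return tags


if __name__ == "__main__":
    print(job_tags("foo-1611319500", "foo", legacy=True))
    print(job_tags("foo-1611319500", "foo", legacy=False))
```

A filter written as `kube_job:foo` matches the legacy tag in this model but not the new one, which is why the text recommends `kube_cronjob:foo` or the wildcard `kube_job:foo*`.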
If you still want to enable both checks simultaneously for the migration phase, disable the `ignoreLegacyKSMCheck` field in your `values.yaml`.
Note: `ignoreLegacyKSMCheck` makes the Agent only ignore the auto configuration for the legacy `kubernetes_state` check. Custom `kubernetes_state` configurations need to be removed manually.
Because the Kubernetes State Metrics Core check no longer requires `kube-state-metrics` in your cluster, you can disable deploying `kube-state-metrics` as part of the Datadog Helm Chart. To do this, add the following in your Helm `values.yaml`:

    ...
    datadog:
    ...
      kubeStateMetricsEnabled: false
    ...

Important Note: The Kubernetes State Metrics Core check is an alternative to the legacy `kubernetes_state` check. Datadog recommends not enabling both checks simultaneously to guarantee consistent metrics.
metric | tags |
---|---|
kubernetes_state.daemonset.count | kube_namespace |
kubernetes_state.daemonset.scheduled | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.desired | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.misscheduled | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.ready | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.updated | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.daemons_unavailable | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.daemonset.daemons_available | kube_daemon_set, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.count | kube_namespace |
kubernetes_state.deployment.paused | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.replicas_desired | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.rollingupdate.max_unavailable | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.rollingupdate.max_surge | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.replicas | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.replicas_available | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.replicas_unavailable | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.replicas_updated | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.deployment.condition | kube_deployment, kube_namespace (env, service, version from standard labels) |
kubernetes_state.endpoint.count | kube_namespace |
kubernetes_state.endpoint.address_available | endpoint, kube_namespace |
kubernetes_state.endpoint.address_not_ready | endpoint, kube_namespace |
kubernetes_state.namespace.count | phase |
kubernetes_state.node.count | kernel_version, os_image, container_runtime_version, kubelet_version |
kubernetes_state.node.cpu_allocatable | node, resource, unit |
kubernetes_state.node.memory_allocatable | node, resource, unit |
kubernetes_state.node.pods_allocatable | node, resource, unit |
kubernetes_state.node.ephemeral_storage_allocatable | node, resource, unit |
kubernetes_state.node.cpu_capacity | node, resource, unit |
kubernetes_state.node.memory_capacity | node, resource, unit |
kubernetes_state.node.pods_capacity | node, resource, unit |
kubernetes_state.node.ephemeral_storage_capacity | node, resource, unit |
kubernetes_state.node.by_condition | condition, node, status |
kubernetes_state.node.status | node, status |
kubernetes_state.node.age | node |
kubernetes_state.container.terminated | kube_namespace, pod_name, kube_container_name (env, service, version from standard labels) |
kubernetes_state.container.cpu_limit | kube_namespace, pod_name, kube_container_name, node, resource, unit (env, service, version from standard labels) |
kubernetes_state.container.memory_limit | kube_namespace, pod_name, kube_container_name, node, resource, unit (env, service, version from standard labels) |
kubernetes_state.container.cpu_requested | kube_namespace, pod_name, kube_container_name, node, resource, unit (env, service, version from standard labels) |
kubernetes_state.container.memory_requested | kube_namespace, pod_name, kube_container_name, node, resource, unit (env, service, version from standard labels) |
kubernetes_state.container.ready | kube_namespace, pod_name, kube_container_name (env, service, version from standard labels) |
kubernetes_state.container.restarts | kube_namespace, pod_name, kube_container_name (env, service, version from standard labels) |
kubernetes_state.container.running | kube_namespace, pod_name, kube_container_name (env, service, version from standard labels) |
kubernetes_state.container.waiting | kube_namespace, pod_name, kube_container_name (env, service, version from standard labels) |
kubernetes_state.container.status_report.count.waiting | kube_namespace, pod_name, kube_container_name, reason (env, service, version from standard labels) |
kubernetes_state.container.status_report.count.terminated | kube_namespace, pod_name, kube_container_name, reason (env, service, version from standard labels) |
kubernetes_state.pod.ready | kube_namespace, pod_name, condition (env, service, version from standard labels) |
kubernetes_state.pod.scheduled | kube_namespace, pod_name, condition (env, service, version from standard labels) |
kubernetes_state.pod.volumes.persistentvolumeclaims_readonly | kube_namespace, pod_name, volume, persistentvolumeclaim (env, service, version from standard labels) |
kubernetes_state.pod.unschedulable | kube_namespace, pod_name (env, service, version from standard labels) |
kubernetes_state.pod.status_phase | kube_namespace, pod_name, pod_phase (env, service, version from standard labels) |
kubernetes_state.pod.age | kube_namespace, pod_name, pod_phase (env, service, version from standard labels) |
kubernetes_state.pod.uptime | kube_namespace, pod_name, pod_phase (env, service, version from standard labels) |
kubernetes_state.pod.count | kube_namespace, kube_<owner kind> |
kubernetes_state.persistentvolumeclaim.status | kube_namespace, persistentvolumeclaim, phase, storageclass |
kubernetes_state.persistentvolumeclaim.access_mode | kube_namespace, persistentvolumeclaim, access_mode, storageclass |
kubernetes_state.persistentvolumeclaim.request_storage | kube_namespace, persistentvolumeclaim, storageclass |
kubernetes_state.persistentvolume.capacity | persistentvolume, storageclass |
kubernetes_state.persistentvolume.by_phase | persistentvolume, storageclass, phase |
kubernetes_state.pdb.pods_healthy | kube_namespace, poddisruptionbudget |
kubernetes_state.pdb.pods_desired | kube_namespace, poddisruptionbudget |
kubernetes_state.pdb.disruptions_allowed | kube_namespace, poddisruptionbudget |
kubernetes_state.pdb.pods_total | kube_namespace, poddisruptionbudget |
kubernetes_state.secret.type | kube_namespace, secret, type |
kubernetes_state.replicaset.count | kube_namespace, kube_deployment |
kubernetes_state.replicaset.replicas_desired | kube_namespace, kube_replica_set (env, service, version from standard labels) |
kubernetes_state.replicaset.fully_labeled_replicas | kube_namespace, kube_replica_set (env, service, version from standard labels) |
kubernetes_state.replicaset.replicas_ready | kube_namespace, kube_replica_set (env, service, version from standard labels) |
kubernetes_state.replicaset.replicas | kube_namespace, kube_replica_set (env, service, version from standard labels) |
kubernetes_state.replicationcontroller.replicas_desired | kube_namespace, kube_replication_controller |
kubernetes_state.replicationcontroller.replicas_available | kube_namespace, kube_replication_controller |
kubernetes_state.replicationcontroller.fully_labeled_replicas | kube_namespace, kube_replication_controller |
kubernetes_state.replicationcontroller.replicas_ready | kube_namespace, kube_replication_controller |
kubernetes_state.replicationcontroller.replicas | kube_namespace, kube_replication_controller |
kubernetes_state.statefulset.count | kube_namespace |
kubernetes_state.statefulset.replicas_desired | kube_namespace, kube_stateful_set (env, service, version from standard labels) |
kubernetes_state.statefulset.replicas | kube_namespace, kube_stateful_set (env, service, version from standard labels) |
kubernetes_state.statefulset.replicas_current | kube_namespace, kube_stateful_set (env, service, version from standard labels) |
kubernetes_state.statefulset.replicas_ready | kube_namespace, kube_stateful_set (env, service, version from standard labels) |
kubernetes_state.statefulset.replicas_updated | kube_namespace, kube_stateful_set (env, service, version from standard labels) |
kubernetes_state.hpa.count | kube_namespace |
kubernetes_state.hpa.min_replicas | kube_namespace, horizontalpodautoscaler |
kubernetes_state.hpa.max_replicas | kube_namespace, horizontalpodautoscaler |
kubernetes_state.hpa.condition | kube_namespace, horizontalpodautoscaler, condition, status |
kubernetes_state.hpa.desired_replicas | kube_namespace, horizontalpodautoscaler |
kubernetes_state.hpa.current_replicas | kube_namespace, horizontalpodautoscaler |
kubernetes_state.hpa.spec_target_metric | kube_namespace, horizontalpodautoscaler, metric_name, metric_target_type |
kubernetes_state.vpa.count | kube_namespace |
kubernetes_state.vpa.lower_bound | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.vpa.target | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.vpa.uncapped_target | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.vpa.upperbound | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.vpa.update_mode | kube_namespace, verticalpodautoscaler, target_api_version, target_kind, target_name, update_mode |
kubernetes_state.vpa.spec_container_minallowed | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.vpa.spec_container_maxallowed | kube_namespace, verticalpodautoscaler, kube_container_name, resource, target_api_version, target_kind, target_name, unit |
kubernetes_state.cronjob.count | kube_namespace |
kubernetes_state.cronjob.spec_suspend | kube_namespace, kube_cronjob (env, service, version from standard labels) |
kubernetes_state.cronjob.duration_since_last_schedule | kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.job.count | kube_namespace, kube_cronjob |
kubernetes_state.job.failed | kube_job or kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.job.succeeded | kube_job or kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.job.completion.succeeded | kube_job or kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.job.completion.failed | kube_job or kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.resourcequota.<resource>.limit | kube_namespace, resourcequota |
kubernetes_state.resourcequota.<resource>.used | kube_namespace, resourcequota |
kubernetes_state.limitrange.cpu.min | kube_namespace, limitrange, type |
kubernetes_state.limitrange.cpu.max | kube_namespace, limitrange, type |
kubernetes_state.limitrange.cpu.default | kube_namespace, limitrange, type |
kubernetes_state.limitrange.cpu.default_request | kube_namespace, limitrange, type |
kubernetes_state.limitrange.cpu.max_limit_request_ratio | kube_namespace, limitrange, type |
kubernetes_state.limitrange.memory.min | kube_namespace, limitrange, type |
kubernetes_state.limitrange.memory.max | kube_namespace, limitrange, type |
kubernetes_state.limitrange.memory.default | kube_namespace, limitrange, type |
kubernetes_state.limitrange.memory.default_request | kube_namespace, limitrange, type |
kubernetes_state.limitrange.memory.max_limit_request_ratio | kube_namespace, limitrange, type |
kubernetes_state.service.count | kube_namespace, type |
kubernetes_state.service.type | kube_namespace, kube_service, type |

Note: You can configure Datadog Standard labels on your Kubernetes objects to get the `env`, `service`, and `version` tags.
The Kubernetes State Metrics Core check does not include any events.
service check | tags |
---|---|
kubernetes_state.cronjob.complete | kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.cronjob.on_schedule_check | kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.job.complete | kube_job or kube_cronjob, kube_namespace (env, service, version from standard labels) |
kubernetes_state.node.ready | node, condition, status |
kubernetes_state.node.out_of_disk | node, condition, status |
kubernetes_state.node.disk_pressure | node, condition, status |
kubernetes_state.node.network_unavailable | node, condition, status |
kubernetes_state.node.memory_pressure | node, condition, status |

Run the Cluster Agent’s `status` subcommand inside your Cluster Agent container and look for `kubernetes_state_core` under the Checks section.
Need help? Contact Datadog support.