---
title: Kubelet
description: Collects container stats from kubelet.
---

# Kubelet
**Integration version:** 10.4.0
## Overview{% #overview %}

This integration collects container metrics from the kubelet. Use it to:

- Visualize and monitor kubelet stats.
- Be notified about kubelet failovers and events.

**Minimum Agent version:** 6.0.2

## Setup{% #setup %}

### Installation{% #installation %}

The Kubelet check is included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package, so you don't need to install anything else on your servers.

### Configuration{% #configuration %}

Edit the `kubelet.d/conf.yaml` file in the `conf.d/` folder at the root of your [Agent's configuration directory](https://docs.datadoghq.com/agent/guide/agent-configuration-files.md#agent-configuration-directory). See the [sample kubelet.d/conf.yaml](https://github.com/DataDog/integrations-core/blob/master/kubelet/datadog_checks/kubelet/data/conf.yaml.default) for all available configuration options.

### Validation{% #validation %}

Run the [Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands.md#agent-status-and-information) and look for `kubelet` under the Checks section.

### Compatibility{% #compatibility %}

The kubelet check can run in two modes:

- The default Prometheus mode is compatible with Kubernetes version 1.7.6 and later.
- The cAdvisor mode (enabled by setting the `cadvisor_port` option) should be compatible with Kubernetes versions 1.3 and later. Consistent tagging and filtering requires at least version 6.2 of the Agent.
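
As a minimal sketch of switching to cAdvisor mode, the instance below sets `cadvisor_port` in `kubelet.d/conf.yaml`. The `init_config`/`instances` layout is assumed from the usual Agent check format, and the port value `4194` is an assumption taken from the OpenShift section of this page; use whichever port your kubelet exposes cAdvisor on.

```yaml
init_config:

instances:
    # Setting cadvisor_port enables cAdvisor mode.
    # 4194 is an assumed value; match it to your kubelet's cAdvisor port.
  - cadvisor_port: 4194
```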

## OpenShift <3.7 support{% #openshift-37-support %}

cAdvisor port 4194 is disabled by default on OpenShift. To enable it, add the following lines to your [node-config file](https://docs.openshift.org/3.7/install_config/master_node_configuration.html#node-configuration-files):

```text
kubeletArguments:
  cadvisor-port: ["4194"]
```

If you cannot open the port, disable both sources of container metric collection by setting:

- `cadvisor_port` to `0`
- `metrics_endpoint` to `""`
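
In `kubelet.d/conf.yaml`, these two settings might look like the following sketch. The surrounding `init_config`/`instances` layout is assumed from the usual Agent check format; check the sample configuration file for the exact placement.

```yaml
init_config:

instances:
  - cadvisor_port: 0        # disable collection from the cAdvisor port
    metrics_endpoint: ""    # disable collection from the metrics endpoint
```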

The check can still collect:

- kubelet health service checks
- pod running/stopped metrics
- pod limits and requests
- node capacity metrics

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric (type) | Description |
| --- | --- |
| **kubernetes.containers.last\_state.terminated**(gauge)             | The number of containers that were previously terminated                                                                                                              |
| **kubernetes.pods.running**(gauge)                                  | The number of running pods                                                                                                                                            |
| **kubernetes.pods.expired**(gauge)                                  | The number of expired pods the check ignored                                                                                                                          |
| **kubernetes.containers.running**(gauge)                            | The number of running containers                                                                                                                                      |
| **kubernetes.containers.restarts**(gauge)                           | The number of times the container has been restarted                                                                                                                  |
| **kubernetes.containers.state.terminated**(gauge)                   | The number of currently terminated containers                                                                                                                         |
| **kubernetes.containers.state.waiting**(gauge)                      | The number of currently waiting containers                                                                                                                            |
| **kubernetes.cpu.load.10s.avg**(gauge)                              | Container cpu load average over the last 10 seconds                                                                                                                   |
| **kubernetes.cpu.system.total**(gauge)                              | The number of cores used for system time*Shown as core*                                                                                                               |
| **kubernetes.cpu.user.total**(gauge)                                | The number of cores used for user time*Shown as core*                                                                                                                 |
| **kubernetes.cpu.cfs.periods**(gauge)                               | Number of elapsed enforcement period intervals                                                                                                                        |
| **kubernetes.cpu.cfs.throttled.periods**(gauge)                     | Number of throttled period intervals                                                                                                                                  |
| **kubernetes.cpu.cfs.throttled.seconds**(gauge)                     | Total time duration the container has been throttled                                                                                                                  |
| **kubernetes.cpu.capacity**(gauge)                                  | The number of cores in this machine (available until kubernetes v1.18)*Shown as core*                                                                                 |
| **kubernetes.cpu.usage.total**(gauge)                               | The number of cores used*Shown as nanocore*                                                                                                                           |
| **kubernetes.cpu.limits**(gauge)                                    | The limit of cpu cores set*Shown as core*                                                                                                                             |
| **kubernetes.cpu.requests**(gauge)                                  | The requested cpu cores*Shown as core*                                                                                                                                |
| **kubernetes.filesystem.usage**(gauge)                              | The amount of disk used*Shown as byte*                                                                                                                                |
| **kubernetes.filesystem.usage\_pct**(gauge)                         | The percentage of disk used*Shown as fraction*                                                                                                                        |
| **kubernetes.io.read\_bytes**(gauge)                                | The amount of bytes read from the disk*Shown as byte*                                                                                                                 |
| **kubernetes.io.write\_bytes**(gauge)                               | The amount of bytes written to the disk*Shown as byte*                                                                                                                |
| **kubernetes.memory.capacity**(gauge)                               | The amount of memory (in bytes) in this machine (available until kubernetes v1.18)*Shown as byte*                                                                     |
| **kubernetes.memory.limits**(gauge)                                 | The limit of memory set*Shown as byte*                                                                                                                                |
| **kubernetes.memory.sw\_limit**(gauge)                              | The limit of swap space set*Shown as byte*                                                                                                                            |
| **kubernetes.memory.requests**(gauge)                               | The requested memory*Shown as byte*                                                                                                                                   |
| **kubernetes.memory.usage**(gauge)                                  | Current memory usage in bytes including all memory regardless of when it was accessed*Shown as byte*                                                                  |
| **kubernetes.memory.working\_set**(gauge)                           | Current working set in bytes - this is what the OOM killer is watching for*Shown as byte*                                                                             |
| **kubernetes.memory.cache**(gauge)                                  | The amount of memory that is being used to cache data from disk (e.g. memory contents that can be associated precisely with a block on a block device)*Shown as byte* |
| **kubernetes.memory.rss**(gauge)                                    | Size of RSS in bytes*Shown as byte*                                                                                                                                   |
| **kubernetes.memory.swap**(gauge)                                   | The amount of swap currently used by processes in this cgroup*Shown as byte*                                                                                          |
| **kubernetes.memory.usage\_pct**(gauge)                             | The percentage of memory used per pod (memory limit must be set)*Shown as fraction*                                                                                   |
| **kubernetes.memory.sw\_in\_use**(gauge)                            | The percentage of swap space used*Shown as fraction*                                                                                                                  |
| **kubernetes.network.rx\_bytes**(gauge)                             | The amount of bytes per second received*Shown as byte*                                                                                                                |
| **kubernetes.network.rx\_dropped**(gauge)                           | The amount of rx packets dropped per second*Shown as packet*                                                                                                          |
| **kubernetes.network.rx\_errors**(gauge)                            | The amount of rx errors per second*Shown as error*                                                                                                                    |
| **kubernetes.network.tx\_bytes**(gauge)                             | The amount of bytes per second transmitted*Shown as byte*                                                                                                             |
| **kubernetes.network.tx\_dropped**(gauge)                           | The amount of tx packets dropped per second*Shown as packet*                                                                                                          |
| **kubernetes.network.tx\_errors**(gauge)                            | The amount of tx errors per second*Shown as error*                                                                                                                    |
| **kubernetes.diskio.io\_service\_bytes.stats.total**(gauge)         | The amount of disk space the container uses*Shown as byte*                                                                                                            |
| **kubernetes.apiserver.certificate.expiration.count**(gauge)        | The count of remaining lifetime on the certificate used to authenticate a request*Shown as second*                                                                    |
| **kubernetes.apiserver.certificate.expiration.sum**(gauge)          | The sum of remaining lifetime on the certificate used to authenticate a request*Shown as second*                                                                      |
| **kubernetes.rest.client.requests**(gauge)                          | The number of HTTP requests*Shown as operation*                                                                                                                       |
| **kubernetes.rest.client.latency.count**(gauge)                     | The count of request latency in seconds broken down by verb and URL                                                                                                   |
| **kubernetes.rest.client.latency.sum**(gauge)                       | The sum of request latency in seconds broken down by verb and URL*Shown as second*                                                                                    |
| **kubernetes.kubelet.pleg.discard\_events**(count)                  | The number of discard events in PLEG                                                                                                                                  |
| **kubernetes.kubelet.pleg.last\_seen**(gauge)                       | Timestamp in seconds when PLEG was last seen active*Shown as second*                                                                                                  |
| **kubernetes.kubelet.pleg.relist\_duration.count**(gauge)           | The count of relisting pods in PLEG                                                                                                                                   |
| **kubernetes.kubelet.pleg.relist\_duration.sum**(gauge)             | The sum of duration in seconds for relisting pods in PLEG*Shown as second*                                                                                            |
| **kubernetes.kubelet.pleg.relist\_interval.count**(gauge)           | The count of relisting pods in PLEG*Shown as second*                                                                                                                  |
| **kubernetes.kubelet.pleg.relist\_interval.sum**(gauge)             | The sum of interval in seconds between relisting in PLEG                                                                                                              |
| **kubernetes.kubelet.runtime.operations**(count)                    | The number of runtime operations*Shown as operation*                                                                                                                  |
| **kubernetes.kubelet.runtime.errors**(gauge)                        | Cumulative number of runtime operations errors*Shown as operation*                                                                                                    |
| **kubernetes.kubelet.runtime.operations.duration.sum**(gauge)       | The sum of duration of operations*Shown as operation*                                                                                                                 |
| **kubernetes.kubelet.runtime.operations.duration.count**(gauge)     | The count of operations                                                                                                                                               |
| **kubernetes.kubelet.network\_plugin.latency.sum**(gauge)           | The sum of latency in microseconds of network plugin operations*Shown as microsecond*                                                                                 |
| **kubernetes.kubelet.network\_plugin.latency.count**(gauge)         | The count of network plugin operations by latency                                                                                                                     |
| **kubernetes.kubelet.network\_plugin.latency.quantile**(gauge)      | The quantiles of network plugin operations by latency                                                                                                                 |
| **kubernetes.kubelet.volume.stats.available\_bytes**(gauge)         | The number of available bytes in the volume*Shown as byte*                                                                                                            |
| **kubernetes.kubelet.volume.stats.capacity\_bytes**(gauge)          | The capacity in bytes of the volume*Shown as byte*                                                                                                                    |
| **kubernetes.kubelet.volume.stats.used\_bytes**(gauge)              | The number of used bytes in the volume*Shown as byte*                                                                                                                 |
| **kubernetes.kubelet.volume.stats.inodes**(gauge)                   | The maximum number of inodes in the volume*Shown as inode*                                                                                                            |
| **kubernetes.kubelet.volume.stats.inodes\_free**(gauge)             | The number of free inodes in the volume*Shown as inode*                                                                                                               |
| **kubernetes.kubelet.volume.stats.inodes\_used**(gauge)             | The number of used inodes in the volume*Shown as inode*                                                                                                               |
| **kubernetes.ephemeral\_storage.limits**(gauge)                     | Ephemeral storage limit of the container (requires kubernetes v1.8+)*Shown as byte*                                                                                   |
| **kubernetes.ephemeral\_storage.requests**(gauge)                   | Ephemeral storage request of the container (requires kubernetes v1.8+)*Shown as byte*                                                                                 |
| **kubernetes.ephemeral\_storage.usage**(gauge)                      | Ephemeral storage usage of the pod*Shown as byte*                                                                                                                     |
| **kubernetes.kubelet.evictions**(count)                             | The number of pods that have been evicted from the kubelet (ALPHA in kubernetes v1.16)                                                                                |
| **kubernetes.kubelet.cpu.usage**(gauge)                             | The number of cores used by kubelet*Shown as nanocore*                                                                                                                |
| **kubernetes.kubelet.memory.usage**(gauge)                          | Current kubelet memory usage in bytes*Shown as byte*                                                                                                                  |
| **kubernetes.kubelet.memory.rss**(gauge)                            | Size of kubelet RSS in bytes*Shown as byte*                                                                                                                           |
| **kubernetes.runtime.cpu.usage**(gauge)                             | The number of cores used by the runtime*Shown as nanocore*                                                                                                            |
| **kubernetes.runtime.memory.usage**(gauge)                          | Current runtime memory usage in bytes*Shown as byte*                                                                                                                  |
| **kubernetes.runtime.memory.rss**(gauge)                            | Size of runtime RSS in bytes*Shown as byte*                                                                                                                           |
| **kubernetes.kubelet.container.log\_filesystem.used\_bytes**(gauge) | Bytes used by the container's logs on the filesystem (requires kubernetes 1.14+)*Shown as byte*                                                                       |
| **kubernetes.kubelet.pod.start.duration**(gauge)                    | Duration in microseconds for a single pod to go from pending to running*Shown as microsecond*                                                                         |
| **kubernetes.kubelet.pod.worker.duration**(gauge)                   | Duration in microseconds to sync a single pod. Broken down by operation type: create, update, or sync*Shown as microsecond*                                           |
| **kubernetes.kubelet.pod.worker.start.duration**(gauge)             | Duration in microseconds from seeing a pod to starting a worker*Shown as microsecond*                                                                                 |
| **kubernetes.kubelet.docker.operations**(count)                     | The number of docker operations*Shown as operation*                                                                                                                   |
| **kubernetes.kubelet.docker.errors**(count)                         | The number of docker operations errors*Shown as operation*                                                                                                            |
| **kubernetes.kubelet.docker.operations.duration.sum**(gauge)        | The sum of duration of docker operations*Shown as operation*                                                                                                          |
| **kubernetes.kubelet.docker.operations.duration.count**(gauge)      | The count of docker operations                                                                                                                                        |
| **kubernetes.go\_threads**(gauge)                                   | Number of OS threads created                                                                                                                                          |
| **kubernetes.go\_goroutines**(gauge)                                | Number of goroutines that currently exist                                                                                                                             |
| **kubernetes.liveness\_probe.success.total**(gauge)                 | Cumulative number of successful liveness probe for a container (ALPHA in kubernetes v1.15)                                                                            |
| **kubernetes.liveness\_probe.failure.total**(gauge)                 | Cumulative number of failed liveness probe for a container (ALPHA in kubernetes v1.15)                                                                                |
| **kubernetes.readiness\_probe.success.total**(gauge)                | Cumulative number of successful readiness probe for a container (ALPHA in kubernetes v1.15)                                                                           |
| **kubernetes.readiness\_probe.failure.total**(gauge)                | Cumulative number of failed readiness probe for a container (ALPHA in kubernetes v1.15)                                                                               |
| **kubernetes.startup\_probe.success.total**(gauge)                  | Cumulative number of successful startup probe for a container (ALPHA in kubernetes v1.15)                                                                             |
| **kubernetes.startup\_probe.failure.total**(gauge)                  | Cumulative number of failed startup probe for a container (ALPHA in kubernetes v1.15)                                                                                 |
| **kubernetes.node.filesystem.usage**(gauge)                         | The amount of disk used at node level*Shown as byte*                                                                                                                  |
| **kubernetes.node.filesystem.usage\_pct**(gauge)                    | The percentage of disk space used at node level*Shown as fraction*                                                                                                    |
| **kubernetes.node.image.filesystem.usage**(gauge)                   | The amount of disk used on image filesystem (node level)*Shown as byte*                                                                                               |
| **kubernetes.node.image.filesystem.usage\_pct**(gauge)              | The percentage of disk used (node level)*Shown as fraction*                                                                                                           |
| **kubernetes.pod.terminating.duration**(gauge)                      | Amount of time the pod hangs in termination phase*Shown as second*                                                                                                    |
| **kubernetes.pod.resize.pending**(gauge)                            | Number of pods with resource resize request in pending state                                                                                                          |

### Service Checks{% #service-checks %}

**kubernetes.kubelet.check.ping**

Returns `CRITICAL` if the kubelet doesn't respond to ping. Returns `OK` otherwise.

*Statuses: ok, critical*

**kubernetes.kubelet.check.docker**

Returns `CRITICAL` if the Docker service isn't running on the kubelet. Returns `OK` otherwise.

*Statuses: ok, critical*

**kubernetes.kubelet.check.syncloop**

Returns `CRITICAL` if the syncloop health check is down. Returns `OK` otherwise.

*Statuses: ok, critical*

**kubernetes.kubelet.check**

Returns `CRITICAL` if the overall kubelet health check is down. Returns `OK` otherwise.

*Statuses: ok, critical*

### Excluded containers{% #excluded-containers %}

To restrict the data collected to a subset of the containers deployed, set the [`DD_CONTAINER_EXCLUDE` environment variable](https://docs.datadoghq.com/agent/guide/autodiscovery-management.md?tab=containerizedagent). Metrics from the containers matched by that variable are not collected.

For network metrics reported at the pod level, containers cannot be excluded based on `name` or `image name` since other containers can be part of the same pod. So, if `DD_CONTAINER_EXCLUDE` applies to a namespace, the pod-level metrics are not reported if the pod is in that namespace. However, if `DD_CONTAINER_EXCLUDE` refers to a container name or image name, the pod-level metrics are reported even if the exclusion rules apply to some containers in the pod.
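
To illustrate the difference, here is a sketch of the variable on a containerized Agent, written as a Kubernetes container `env` entry. The namespace `staging` and the image `redis` are hypothetical; the `kube_namespace:` and `image:` prefixes follow the filter syntax described in the linked autodiscovery guide.

```yaml
env:
  - name: DD_CONTAINER_EXCLUDE
    # kube_namespace: also drops pod-level network metrics for every pod
    # in the (hypothetical) "staging" namespace.
    # image: excludes only the matching containers; pod-level metrics of
    # the pods that contain them are still reported.
    value: "kube_namespace:staging image:redis"
```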

## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).
