
Kubernetes Controller Manager

Agent Check

Supported OS: Linux, Mac OS, Windows

Overview

This check monitors the Kubernetes Controller Manager, part of the Kubernetes control plane.

Setup

Installation

The kube_controller_manager check is included in the Datadog Agent package, so you do not need to install anything else on your server.

Configuration

This integration requires access to the controller manager’s metrics endpoint. This endpoint is usually not exposed in Container-as-a-Service clusters.
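
Before configuring the check, it can help to confirm that the endpoint is reachable from where the Agent runs. The sketch below is one way to do that and is not part of the integration: it fetches a URL and lists the metric names found in the Prometheus text response. The URL is an assumption (the controller manager has commonly served metrics on port 10252 over HTTP, while newer Kubernetes releases use port 10257 over HTTPS with authentication), and the helper names are hypothetical.

```python
import re
import urllib.request

# Assumed endpoint; adjust host, port, and scheme for your cluster.
METRICS_URL = "http://localhost:10252/metrics"

# A sample line starts with the metric name, optionally followed by {labels}.
_METRIC_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in Prometheus text output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Fetch the raw metrics text; raises an OSError subclass if unreachable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `metric_names(fetch_metrics())` on the host where the Agent runs either returns the exposed metric names or raises an error that indicates why the endpoint could not be reached.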

  1. To start collecting kube_controller_manager performance data, edit the kube_controller_manager.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample kube_controller_manager.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
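
For orientation, a minimal instance configuration might look like the text this sketch renders. The `prometheus_url` key and the `http://localhost:10252/metrics` default are assumptions; treat the shipped sample kube_controller_manager.d/conf.yaml as authoritative for key names and defaults. `render_conf` is a hypothetical helper, used here only to show the shape of the file.

```python
# Template for a minimal kube_controller_manager.d/conf.yaml.
# The key name prometheus_url is an assumption; verify it against the sample file.
CONF_TEMPLATE = """\
init_config:

instances:
  - prometheus_url: {url}
"""


def render_conf(url="http://localhost:10252/metrics"):
    """Return the text of a minimal conf.yaml pointing the check at `url`."""
    return CONF_TEMPLATE.format(url=url)
```

`print(render_conf())` writes the rendered text to standard output, where it can be compared against the sample file before saving.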

Validation

Run the Agent’s status subcommand and look for kube_controller_manager under the Checks section.
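
The same validation can be scripted. This sketch assumes the `datadog-agent` binary is on the PATH and that the status output names each running check at the start of a line; neither detail comes from this page, and the command may need elevated privileges or a different invocation in containerized installs. `check_listed` and `agent_status` are hypothetical helpers.

```python
import subprocess


def check_listed(status_output, check_name="kube_controller_manager"):
    """Return True if any line of the status output starts with the check name."""
    return any(
        line.strip().startswith(check_name)
        for line in status_output.splitlines()
    )


def agent_status():
    """Run the Agent's status subcommand and return its standard output."""
    result = subprocess.run(
        ["datadog-agent", "status"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

`check_listed(agent_status())` then reports whether the check appears in the output.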

Data Collected

Metrics

kube_controller_manager.queue.latency.count
(gauge)
Processing latency count, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.latency.sum
(gauge)
Processing latency sum, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.latency.quantile
(gauge)
Processing latency quantiles, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.work_duration.count
(gauge)
Work duration count, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.work_duration.sum
(gauge)
Work duration sum, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.work_duration.quantile
(gauge)
Work duration quantiles, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.depth
(gauge)
Current depth, by queue
kube_controller_manager.queue.adds
(count)
Elements added, by queue
kube_controller_manager.queue.retries
(count)
Retries handled, by queue
kube_controller_manager.rate_limiter.use
(gauge)
Usage of the rate limiter, by limiter
kube_controller_manager.goroutines
(gauge)
Number of goroutines that currently exist
kube_controller_manager.threads
(gauge)
Number of OS threads created
kube_controller_manager.open_fds
(gauge)
Number of open file descriptors
kube_controller_manager.max_fds
(gauge)
Maximum allowed open file descriptors
kube_controller_manager.nodes.evictions
(count)
Count of node eviction events, per zone
kube_controller_manager.nodes.count
(gauge)
Number of registered nodes, per zone
kube_controller_manager.nodes.unhealthy
(gauge)
Number of unhealthy nodes, per zone
kube_controller_manager.leader_election.transitions
(count)
Number of leadership transitions observed
kube_controller_manager.leader_election.lease_duration
(gauge)
Duration of the leadership lease
kube_controller_manager.queue.process_duration.count
(gauge)
How long processing an item from workqueue takes, by queue
kube_controller_manager.queue.process_duration.sum
(gauge)
Total workqueue processing time, by queue
Shown as second
kube_controller_manager.queue.work_longest_duration
(gauge)
How long, in seconds, the longest-running processor has been running, by queue
Shown as second
kube_controller_manager.queue.work_unfinished_duration
(gauge)
Seconds of work in progress that has not yet been observed by process_duration, by queue
Shown as second
kube_controller_manager.queue.queue_duration.count
(gauge)
How long an item stays in the queue before being requested, by queue
kube_controller_manager.queue.queue_duration.sum
(gauge)
Total time items stay in the queue before being requested, by queue
Shown as second
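
The .sum and .count pairs can be combined into an average. As a worked sketch, assuming both values are cumulative totals since the controller manager started (as is typical for Prometheus histograms and summaries), the mean duration over an interval is the change in sum divided by the change in count. The function name and the sample numbers are made up for illustration.

```python
def mean_duration(sum_start, count_start, sum_end, count_end):
    """Mean duration per item between two samples of a cumulative sum/count pair.

    Returns None when no items were processed in the interval, or when a
    counter went backwards (for example after a process restart).
    """
    delta_count = count_end - count_start
    delta_sum = sum_end - sum_start
    if delta_count <= 0 or delta_sum < 0:
        return None
    return delta_sum / delta_count


# Hypothetical samples of queue.process_duration.sum (seconds) and .count:
# 1.5 seconds spent on 250 items gives 0.006 seconds per item.
print(mean_duration(12.5, 1000, 14.0, 1250))
```

Dividing the raw .sum by the raw .count instead gives the average since the process started, which reacts slowly to recent changes.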

Service Checks

kube_controller_manager.prometheus.health:

Returns CRITICAL if the Agent cannot reach the metrics endpoints.
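
To make the semantics concrete, here is a rough sketch of how such a health status could be derived. It is an illustration, not the Agent's implementation: the `fetch` callable, the "OK" status for a successful request, and the function names are all assumptions.

```python
import urllib.request


def http_fetch(url, timeout=5.0):
    """Fetch a URL; raises an OSError subclass if it cannot be reached."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def endpoint_health(url, fetch=http_fetch):
    """Return "OK" if the metrics endpoint answers, "CRITICAL" otherwise."""
    try:
        fetch(url)
    except OSError:
        # Connection refused, timeout, DNS failure, or HTTP error status.
        return "CRITICAL"
    return "OK"
```

Passing a different `fetch` callable makes it possible to exercise both outcomes without a live endpoint.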

Events

The kube_controller_manager check does not include any events.

Troubleshooting

Need help? Contact Datadog Support.

