This check monitors the Kubernetes Controller Manager, part of the Kubernetes control plane.
Note: This check does not collect data for Amazon EKS clusters, as those services are not exposed.
The Kubernetes Controller Manager check is included in the Datadog Agent package, so you do not need to install anything else on your server.
Edit the kube_controller_manager.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your kube_controller_manager performance data. See the sample kube_controller_manager.d/conf.yaml for all available configuration options.
This integration requires access to the controller manager’s metric endpoint. To access the metric endpoint, the Agent needs get RBAC permissions to the /metrics endpoint (the default Datadog Helm chart already adds the right RBAC roles and bindings for this).
Run the Agent’s status subcommand and look for kube_controller_manager under the Checks section.
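To make the role of the /metrics endpoint more concrete, the following is a minimal sketch of reading samples out of a Prometheus-style text payload, which is the format the controller manager exposes. The function name parse_metric_samples and the SAMPLE payload are hypothetical and exist only for illustration. The sketch also assumes that go_goroutines and workqueue_depth are the upstream names behind kube_controller_manager.goroutines and kube_controller_manager.queue.depth, and it ignores optional timestamps. It is not how the Agent itself parses metrics.

```python
def parse_metric_samples(exposition_text, metric_name):
    """Return (labels_text, value) pairs for one metric in Prometheus text format.

    Simplified on purpose: assumes each sample line ends with its value
    (no trailing timestamp) and does not unescape label values.
    """
    samples = []
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        name_and_labels, _, value = line.rpartition(" ")
        # Labels, if any, follow the metric name inside braces.
        name, _, labels = name_and_labels.partition("{")
        if name == metric_name:
            samples.append((labels.rstrip("}"), float(value)))
    return samples


# Hypothetical payload, shaped like a small excerpt of a /metrics response.
SAMPLE = """\
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 1024
# TYPE workqueue_depth gauge
workqueue_depth{name="deployment"} 3
workqueue_depth{name="job"} 0
"""

goroutines = parse_metric_samples(SAMPLE, "go_goroutines")
queue_depths = parse_metric_samples(SAMPLE, "workqueue_depth")
```

If the Agent reports CRITICAL for the endpoint, fetching the same payload manually with the Agent's credentials is one way to tell a permissions problem from a parsing or configuration problem.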
kube_controller_manager.goroutines (gauge) | Number of goroutines that currently exist |
kube_controller_manager.job_controller.terminated_pods_tracking_finalizer (count) | Used to monitor whether the job controller is removing Pod finalizers from terminated Pods after accounting them in Job status |
kube_controller_manager.leader_election.lease_duration (gauge) | Duration of the leadership lease |
kube_controller_manager.leader_election.transitions (count) | Number of leadership transitions observed |
kube_controller_manager.max_fds (gauge) | Maximum allowed open file descriptors |
kube_controller_manager.nodes.count (gauge) | Number of registered nodes, per zone |
kube_controller_manager.nodes.evictions (count) | Count of node eviction events, per zone |
kube_controller_manager.nodes.unhealthy (gauge) | Number of unhealthy nodes, per zone |
kube_controller_manager.open_fds (gauge) | Number of open file descriptors |
kube_controller_manager.queue.adds (count) | Elements added, by queue |
kube_controller_manager.queue.depth (gauge) | Current depth, by queue |
kube_controller_manager.queue.latency.count (gauge) | Processing latency count, by queue (deprecated in Kubernetes v1.14) |
kube_controller_manager.queue.latency.quantile (gauge) | Processing latency quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.latency.sum (gauge) | Processing latency sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.process_duration.count (gauge) | How long processing an item from the workqueue takes, by queue |
kube_controller_manager.queue.process_duration.sum (gauge) | Total workqueue processing time, by queue. Shown as second |
kube_controller_manager.queue.queue_duration.count (gauge) | How long an item stays in a queue before being requested, by queue |
kube_controller_manager.queue.queue_duration.sum (gauge) | Total time items stay in a queue before being requested, by queue. Shown as second |
kube_controller_manager.queue.retries (count) | Retries handled, by queue |
kube_controller_manager.queue.work_duration.count (gauge) | Work duration, by queue (deprecated in Kubernetes v1.14) |
kube_controller_manager.queue.work_duration.quantile (gauge) | Work duration quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.work_duration.sum (gauge) | Work duration sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond |
kube_controller_manager.queue.work_longest_duration (gauge) | How many seconds the longest-running processor has been running, by queue. Shown as second |
kube_controller_manager.queue.work_unfinished_duration (gauge) | How many seconds of in-progress work has been done that has not yet been observed by process_duration, by queue. Shown as second |
kube_controller_manager.rate_limiter.use (gauge) | Usage of the rate limiter, by limiter |
kube_controller_manager.slis.kubernetes_healthcheck (gauge) | Result of a single controller manager healthcheck (alpha; requires k8s v1.26+) |
kube_controller_manager.slis.kubernetes_healthcheck_total (count) | Cumulative results of all controller manager healthchecks (alpha; requires k8s v1.26+) |
kube_controller_manager.threads (gauge) | Number of OS threads created |
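The .sum and .count pairs above can be combined into an average time per item. The sketch below assumes that both values are cumulative totals, as in the underlying Prometheus histogram, so the average over an interval comes from the difference between two scrapes. The function name and the sample numbers are hypothetical, and this cumulative behavior should be confirmed against your own data before relying on it.

```python
def mean_duration_between_scrapes(prev, curr):
    """Average seconds per item between two scrapes.

    prev and curr are (sum_seconds, count) pairs, for example the values of
    queue.process_duration.sum and queue.process_duration.count for one queue.
    Assumes both values only grow over time (cumulative totals).
    """
    delta_sum = curr[0] - prev[0]
    delta_count = curr[1] - prev[1]
    # No new items, or a reset after a controller manager restart.
    if delta_count <= 0:
        return None
    return delta_sum / delta_count


# Hypothetical readings for a single queue, taken one scrape apart:
# 60 more items processed in 3.0 more seconds of total processing time.
average = mean_duration_between_scrapes((12.0, 100), (15.0, 160))
idle = mean_duration_between_scrapes((15.0, 160), (15.0, 160))
```

The same calculation applies to the queue_duration pair, which gives the average time an item waits in the queue rather than the time spent processing it.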
The Kubernetes Controller Manager check does not include any events.
kube_controller_manager.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical
kube_controller_manager.leader_election.status
Returns CRITICAL if no replica is currently set as leader.
Statuses: ok, critical
kube_controller_manager.up
Returns CRITICAL if Kube Controller Manager is not healthy.
Statuses: ok, critical
Need help? Contact Datadog Support.