Kubernetes Controller Manager

Supported OS: Linux, macOS, Windows

Overview

This check monitors the Kubernetes Controller Manager, which is part of the Kubernetes control plane.

Note: This check does not collect data for Amazon EKS clusters, because those services are not exposed.

Setup

Installation

The Kubernetes Controller Manager check is included in the Datadog Agent package, so you do not need to install anything else on your server.

Configuration

This integration requires access to the controller manager's metrics endpoint, which is usually not exposed in Container-as-a-Service clusters.

  1. Edit the kube_controller_manager.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your kube_controller_manager performance data. See the sample kube_controller_manager.d/conf.yaml file for all available configuration options.

  2. Restart the Agent.
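
As a rough illustration of step 1, a minimal kube_controller_manager.d/conf.yaml might look like the sketch below. The URL (host, port, and scheme) is an assumption for illustration only, since the right value depends on how your controller manager exposes its metrics endpoint. The sample configuration file remains the authoritative source for option names and defaults.

```yaml
# Illustrative sketch only; see the sample conf.yaml for the full option list.
init_config:

instances:
    # Assumed endpoint: replace with the metrics URL of your controller manager.
  - prometheus_url: http://localhost:10252/metrics
```

If the endpoint is only served over HTTPS or requires authentication, the instance needs additional options, which the sample file documents.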

Validation

Run the Agent's status subcommand and look for kube_controller_manager under the Checks section.

Data Collected

Metrics

kube_controller_manager.goroutines
(gauge)
Number of goroutines that currently exist
kube_controller_manager.job_controller.terminated_pods_tracking_finalizer
(count)
Used to monitor whether the job controller is removing Pod finalizers from terminated Pods after accounting them in Job status
kube_controller_manager.leader_election.lease_duration
(gauge)
Duration of the leadership lease
kube_controller_manager.leader_election.transitions
(count)
Number of leadership transitions observed
kube_controller_manager.max_fds
(gauge)
Maximum allowed open file descriptors
kube_controller_manager.nodes.count
(gauge)
Number of registered nodes, per zone
kube_controller_manager.nodes.evictions
(count)
Count of node eviction events, per zone
kube_controller_manager.nodes.unhealthy
(gauge)
Number of unhealthy nodes, per zone
kube_controller_manager.open_fds
(gauge)
Number of open file descriptors
kube_controller_manager.queue.adds
(count)
Elements added, by queue
kube_controller_manager.queue.depth
(gauge)
Current depth, by queue
kube_controller_manager.queue.latency.count
(gauge)
Processing latency count, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.latency.quantile
(gauge)
Processing latency quantiles, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.latency.sum
(gauge)
Processing latency sum, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.process_duration.count
(gauge)
How long processing an item from workqueue takes, by queue
kube_controller_manager.queue.process_duration.sum
(gauge)
Total workqueue processing time, by queue
Shown as second
kube_controller_manager.queue.queue_duration.count
(gauge)
How long an item stays in a queue before being requested, by queue
kube_controller_manager.queue.queue_duration.sum
(gauge)
Total time items stay in a queue before being requested, by queue
Shown as second
kube_controller_manager.queue.retries
(count)
Retries handled, by queue
kube_controller_manager.queue.work_duration.count
(gauge)
Work duration, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.work_duration.quantile
(gauge)
Work duration quantiles, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.work_duration.sum
(gauge)
Work duration sum, by queue (deprecated in Kubernetes v1.14)
Shown as microsecond
kube_controller_manager.queue.work_longest_duration
(gauge)
How many seconds has the longest running processor been running, by queue
Shown as second
kube_controller_manager.queue.work_unfinished_duration
(gauge)
Seconds of in-progress work that has not yet been observed by process_duration, by queue
Shown as second
kube_controller_manager.rate_limiter.use
(gauge)
Usage of the rate limiter, by limiter
kube_controller_manager.slis.kubernetes_healthcheck
(gauge)
Result of a single controller manager healthcheck (alpha; requires k8s v1.26+)
kube_controller_manager.slis.kubernetes_healthcheck_total
(count)
Cumulative results of all controller manager healthchecks (alpha; requires k8s v1.26+)
kube_controller_manager.threads
(gauge)
Number of OS threads created

Service Checks

kube_controller_manager.prometheus.health:
Returns CRITICAL if the Agent cannot connect to the metrics endpoints.

Events

The Kubernetes Controller Manager check does not include any events.

Troubleshooting

Need help? Contact Datadog support.