Kubernetes Controller Manager


Supported OS: Linux, Windows, macOS

Integration version 7.1.0

Kube Controller Manager dashboard

Overview

This check monitors the Kubernetes Controller Manager, which is part of the Kubernetes control plane.

Note: This check does not collect data for Amazon EKS clusters, because those services are not exposed.

Setup

Installation

The Kubernetes Controller Manager check is included in the Datadog Agent package, so you do not need to install anything else on your server.

Configuration

To start collecting kube_controller_manager performance data, edit the kube_controller_manager.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample kube_controller_manager.d/conf.yaml for all available configuration options.
Then restart the Agent.

This integration needs access to the controller manager's metrics endpoint. For the metrics endpoint to be reachable, both of the following are required:

Access to the IP/port of the controller-manager process
The get RBAC permission on the /metrics endpoint (the default Datadog Helm chart already adds the appropriate RBAC roles and bindings)
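
The two requirements above can be probed by hand before enabling the check. The following is a minimal sketch, not the check's own implementation. It assumes the controller manager serves metrics at https://localhost:10257/metrics and that a service account token is mounted at the usual in-cluster path; both values, and the function names, are illustrative and should be adjusted to your cluster.

```python
import ssl
import urllib.request
from typing import Optional

# Assumed values for illustration only; adjust them to your cluster.
METRICS_URL = "https://localhost:10257/metrics"
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_metrics_request(url: str, token: str) -> urllib.request.Request:
    """Build a GET request that presents a bearer token to the endpoint."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def can_reach_metrics(
    url: str,
    token: str,
    context: Optional[ssl.SSLContext] = None,
    timeout: float = 5.0,
) -> bool:
    """Return True if the endpoint answers with HTTP 200, False on any error."""
    request = build_metrics_request(url, token)
    try:
        with urllib.request.urlopen(
            request, timeout=timeout, context=context
        ) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, TLS errors, and HTTP errors
        # such as 401/403 (URLError and HTTPError are OSError subclasses).
        return False
```

Calling can_reach_metrics(METRICS_URL, token) with the contents of TOKEN_PATH returns False when the process is unreachable or the token lacks permission. It also returns False when the server certificate is not trusted, so pass an ssl.SSLContext that trusts your cluster CA if the endpoint uses one.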

Validation

Run the Agent's status subcommand and look for kube_controller_manager under the Checks section.
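
If you want to script this validation, the sketch below searches status output for the check name. It assumes the Agent binary is invoked as datadog-agent status (the command may differ in containerized deployments) and that the check name appears at the start of a line somewhere after a heading containing the word Checks. The exact output layout is an assumption, and the helper names are hypothetical.

```python
import subprocess
from typing import Sequence

CHECK_NAME = "kube_controller_manager"


def check_is_listed(status_text: str, check_name: str = CHECK_NAME) -> bool:
    """Return True if check_name starts a line after a 'Checks' heading."""
    in_checks_section = False
    for line in status_text.splitlines():
        if "Checks" in line:
            # Assumed heading format; everything after it is searched.
            in_checks_section = True
            continue
        if in_checks_section and line.strip().startswith(check_name):
            return True
    return False


def agent_status_text(
    command: Sequence[str] = ("datadog-agent", "status"),
) -> str:
    """Run the Agent status subcommand and return its standard output."""
    result = subprocess.run(
        list(command), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, check_is_listed(agent_status_text()) can serve as a quick post-deployment assertion on a host where the Agent is installed.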

Data Collected

Metrics


kube_controller_manager.goroutines (gauge)	Number of goroutines that currently exist
kube_controller_manager.job_controller.terminated_pods_tracking_finalizer (count)	Used to monitor whether the job controller is removing Pod finalizers from terminated Pods after accounting them in Job status
kube_controller_manager.leader_election.lease_duration (gauge)	Duration of the leadership lease
kube_controller_manager.leader_election.transitions (count)	Number of leadership transitions observed
kube_controller_manager.max_fds (gauge)	Maximum allowed open file descriptors
kube_controller_manager.nodes.count (gauge)	Number of registered nodes, per zone
kube_controller_manager.nodes.evictions (count)	Count of node eviction events, per zone
kube_controller_manager.nodes.unhealthy (gauge)	Number of unhealthy nodes, per zone
kube_controller_manager.open_fds (gauge)	Number of open file descriptors
kube_controller_manager.queue.adds (count)	Elements added, by queue
kube_controller_manager.queue.depth (gauge)	Current depth, by queue
kube_controller_manager.queue.latency.count (gauge)	Processing latency count, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.latency.quantile (gauge)	Processing latency quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond
kube_controller_manager.queue.latency.sum (gauge)	Processing latency sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond
kube_controller_manager.queue.process_duration.count (gauge)	Count of observations of how long processing an item from the workqueue takes, by queue
kube_controller_manager.queue.process_duration.sum (gauge)	Total workqueue processing time, by queue. Shown as second
kube_controller_manager.queue.queue_duration.count (gauge)	Count of observations of how long an item stays in a queue before being requested, by queue
kube_controller_manager.queue.queue_duration.sum (gauge)	Total time items stay in a queue before being requested, by queue. Shown as second
kube_controller_manager.queue.retries (count)	Retries handled, by queue
kube_controller_manager.queue.work_duration.count (gauge)	Work duration count, by queue (deprecated in Kubernetes v1.14)
kube_controller_manager.queue.work_duration.quantile (gauge)	Work duration quantiles, by queue (deprecated in Kubernetes v1.14). Shown as microsecond
kube_controller_manager.queue.work_duration.sum (gauge)	Work duration sum, by queue (deprecated in Kubernetes v1.14). Shown as microsecond
kube_controller_manager.queue.work_longest_duration (gauge)	How many seconds the longest-running processor has been running, by queue. Shown as second
kube_controller_manager.queue.work_unfinished_duration (gauge)	How many seconds of work has been done that is still in progress and has not yet been observed by process_duration, by queue. Shown as second
kube_controller_manager.rate_limiter.use (gauge)	Usage of the rate limiter, by limiter
kube_controller_manager.slis.kubernetes_healthcheck (gauge)	Result of a single controller manager healthcheck (alpha; requires k8s v1.26+)
kube_controller_manager.slis.kubernetes_healthcheck_total (count)	Cumulative results of all controller manager healthchecks (alpha; requires k8s v1.26+)
kube_controller_manager.threads (gauge)	Number of OS threads created
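
The paired .sum and .count metrics above (for example queue.process_duration.sum and queue.process_duration.count) can be combined into an average duration per item. The sketch below shows the arithmetic only. It assumes both values are cumulative totals for the same queue, so the average over a window is the difference in sums divided by the difference in counts; the function names are hypothetical.

```python
from typing import Optional, Tuple


def mean_seconds(total_seconds: float, observations: float) -> Optional[float]:
    """Average duration per observation, or None when nothing was observed."""
    if observations <= 0:
        return None
    return total_seconds / observations


def mean_seconds_between(
    previous: Tuple[float, float], current: Tuple[float, float]
) -> Optional[float]:
    """Average duration between two (sum, count) samples of one queue."""
    delta_sum = current[0] - previous[0]
    delta_count = current[1] - previous[1]
    return mean_seconds(delta_sum, delta_count)
```

For instance, if process_duration moves from (sum=10.0, count=5) to (sum=16.0, count=8) between two samples, the three items processed in that window took 2.0 seconds each on average.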

Events

The Kubernetes Controller Manager check does not include any events.

Service Checks

kube_controller_manager.prometheus.health

Returns CRITICAL if the check cannot access the metrics endpoint.

Statuses: ok, critical

kube_controller_manager.leader_election.status

Returns CRITICAL if no replica is currently set as leader.

Statuses: ok, critical

kube_controller_manager.up

Returns CRITICAL if Kube Controller Manager is not healthy.

Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.