Kubernetes Scheduler

Supported OS: Linux, Windows

Integration v4.5.0

Kube Scheduler dashboard

Overview

This check monitors the Kubernetes Scheduler, which is part of the Kubernetes Control Plane.

Note: This check does not collect data for Amazon EKS clusters, because those services are not exposed.

Setup

Installation

The Kubernetes Scheduler check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

See the Autodiscovery Integration Templates guide for how to apply the parameters below.

Metric collection

  1. To start collecting kube_scheduler performance data, edit the kube_scheduler.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample kube_scheduler.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
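
As a minimal sketch of the file edited in step 1, a single-instance configuration might look like the following. The `prometheus_url` value shown here is an assumption: the host and port depend on how the scheduler exposes its metrics endpoint in your cluster. Treat the sample kube_scheduler.d/conf.yaml as the authoritative list of options.

```yaml
# kube_scheduler.d/conf.yaml (illustrative sketch, not the shipped sample)
init_config:

instances:
    # Assumed endpoint: replace the host and port with the address where
    # your scheduler serves its metrics.
  - prometheus_url: http://localhost:10251/metrics
```

After saving the file, restart the Agent so the new instance is picked up.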

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

Parameter       Value
<LOG_CONFIG>    {"source": "kube_scheduler", "service": "<SERVICE_NAME>"}
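
One way the log configuration above could be applied is through an Autodiscovery pod annotation. The fragment below is a sketch under assumptions: `<CONTAINER_NAME>` is a hypothetical placeholder for the name of the scheduler container in your pod spec, and `<SERVICE_NAME>` is the service name you choose. Check the Kubernetes Log Collection guide for the setup that matches your deployment.

```yaml
# Pod metadata fragment (illustrative). <CONTAINER_NAME> and <SERVICE_NAME>
# are placeholders to replace with your own values.
metadata:
  annotations:
    ad.datadoghq.com/<CONTAINER_NAME>.logs: '[{"source": "kube_scheduler", "service": "<SERVICE_NAME>"}]'
```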

Validation

Run the Agent's status subcommand and look for kube_scheduler under the Checks section.

Data Collected

Metrics

kube_scheduler.binding_duration.count
(gauge)
Number of binding latency observations
kube_scheduler.binding_duration.sum
(gauge)
Total binding latency in seconds
kube_scheduler.cache.lookups
(count)
Number of equivalence cache lookups, by whether or not a cache entry was found
kube_scheduler.client.http.requests
(count)
Number of HTTP requests, partitioned by status code, method, and host
kube_scheduler.client.http.requests_duration.count
(gauge)
Number of client requests. Broken down by verb and URL
kube_scheduler.client.http.requests_duration.sum
(gauge)
Total latency. Broken down by verb and URL
kube_scheduler.gc_duration_seconds.count
(gauge)
Number of GC invocations
kube_scheduler.gc_duration_seconds.quantile
(gauge)
GC invocation durations quantiles
kube_scheduler.gc_duration_seconds.sum
(gauge)
GC invocation durations sum
kube_scheduler.goroutines
(gauge)
Number of goroutines that currently exist
kube_scheduler.max_fds
(gauge)
Maximum allowed open file descriptors
kube_scheduler.open_fds
(gauge)
Number of open file descriptors
kube_scheduler.pod_preemption.victims.count
(gauge)
Number of selected pods during the latest preemption round
kube_scheduler.pod_preemption.victims.sum
(gauge)
Total selected pods during the latest preemption round
kube_scheduler.pod_preemption.attempts
(count)
Number of preemption attempts in the cluster so far
kube_scheduler.schedule_attempts
(gauge)
Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem.
kube_scheduler.scheduling.algorithm_duration.count
(gauge)
Number of scheduling algorithm latency observations
kube_scheduler.scheduling.algorithm_duration.sum
(gauge)
Total scheduling algorithm latency
kube_scheduler.scheduling.algorithm.predicate_duration.count
(gauge)
Number of scheduling algorithm predicate evaluations
kube_scheduler.scheduling.algorithm.predicate_duration.sum
(gauge)
Total scheduling algorithm predicate evaluation duration
kube_scheduler.scheduling.algorithm.preemption_duration.count
(gauge)
Number of scheduling algorithm preemption evaluations
kube_scheduler.scheduling.algorithm.preemption_duration.sum
(gauge)
Total scheduling algorithm preemption evaluation duration
kube_scheduler.scheduling.algorithm.priority_duration.count
(gauge)
Number of scheduling algorithm priority evaluations
kube_scheduler.scheduling.algorithm.priority_duration.sum
(gauge)
Total scheduling algorithm priority evaluation duration
kube_scheduler.scheduling.e2e_scheduling_duration.count
(gauge)
Number of E2e scheduling latency observations (scheduling algorithm + binding)
kube_scheduler.scheduling.e2e_scheduling_duration.sum
(gauge)
Total E2e scheduling latency (scheduling algorithm + binding)
kube_scheduler.scheduling.scheduling_duration.count
(gauge)
Number of scheduling latency observations, split by sub-parts of the scheduling operation
kube_scheduler.scheduling.scheduling_duration.quantile
(gauge)
Scheduling latency quantiles split by sub-parts of the scheduling operation
kube_scheduler.scheduling.scheduling_duration.sum
(gauge)
Total scheduling latency split by sub-parts of the scheduling operation
kube_scheduler.threads
(gauge)
Number of OS threads created
kube_scheduler.volume_scheduling_duration.count
(gauge)
Number of volume scheduling stage latency observations
kube_scheduler.volume_scheduling_duration.sum
(gauge)
Total Volume scheduling stage latency
kube_scheduler.scheduling.pod.scheduling_attempts.sum
(gauge)
Total number of attempts to successfully schedule a pod (requires k8s v1.23+)
kube_scheduler.scheduling.pod.scheduling_attempts.count
(gauge)
Number of attempts to successfully schedule a pod (requires k8s v1.23+)
kube_scheduler.scheduling.pod.scheduling_duration.sum
(gauge)
Total e2e latency for a pod being scheduled which may include multiple scheduling attempts (requires k8s v1.23+)
kube_scheduler.scheduling.pod.scheduling_duration.count
(gauge)
Number of e2e latency observations for a pod being scheduled, which may include multiple scheduling attempts (requires k8s v1.23+)
kube_scheduler.scheduling.attempt_duration.sum
(gauge)
Total scheduling attempt latency in seconds (scheduling algorithm + binding) (requires k8s v1.23+)
kube_scheduler.scheduling.attempt_duration.count
(gauge)
Number of scheduling attempt latency observations (scheduling algorithm + binding) (requires k8s v1.23+)
kube_scheduler.pending_pods
(gauge)
Number of pending pods, by the queue type (requires k8s v1.15+)
kube_scheduler.queue.incoming_pods
(count)
Number of pods added to scheduling queues by event and queue type (requires k8s v1.17+)

Events

Kube Scheduler does not include any events.

Service Checks

kube_scheduler.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical

kube_scheduler.leader_election.status
Returns CRITICAL if no replica is currently set as leader.
Statuses: ok, critical

kube_scheduler.up
Returns CRITICAL if Kube Scheduler is not healthy.
Statuses: ok, critical

Troubleshooting

Need help? Contact the Datadog support team.