
Kubernetes Scheduler

Agent Check

Supported OS: Linux, macOS, Windows

Overview

This check monitors Kubernetes Scheduler, part of the Kubernetes control plane.

Setup

Installation

The Kubernetes Scheduler check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Metric collection

  1. Edit the kube_scheduler.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your kube_scheduler performance data. See the sample kube_scheduler.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Log collection

Available for Agent versions >6.0

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your DaemonSet configuration:

    (...)
      env:
        (...)
        - name: DD_LOGS_ENABLED
          value: "true"
        - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
          value: "true"
    (...)
    
  2. Make sure that the Docker socket is mounted to the Datadog Agent as done in this manifest.

  3. Restart the Agent.

Validation

Run the Agent’s status subcommand and look for kube_scheduler under the Checks section.

Data Collected

Metrics

kube_scheduler.binding_duration.count
(gauge)
Number of binding latency observations
kube_scheduler.binding_duration.sum
(gauge)
Total binding latency in seconds
kube_scheduler.cache.lookups
(count)
Number of equivalence cache lookups, by whether or not a cache entry was found
kube_scheduler.client.http.requests
(count)
Number of HTTP requests, partitioned by status code, method, and host
kube_scheduler.client.http.requests_duration.count
(gauge)
Number of client requests, broken down by verb and URL
kube_scheduler.client.http.requests_duration.sum
(gauge)
Total client request latency, broken down by verb and URL
kube_scheduler.gc_duration_seconds.count
(gauge)
Number of GC invocations
kube_scheduler.gc_duration_seconds.quantile
(gauge)
Quantiles of GC invocation durations
kube_scheduler.gc_duration_seconds.sum
(gauge)
Sum of GC invocation durations
kube_scheduler.goroutines
(gauge)
Number of goroutines that currently exist
kube_scheduler.max_fds
(gauge)
Maximum allowed open file descriptors
kube_scheduler.open_fds
(gauge)
Number of open file descriptors
kube_scheduler.pod_preemption.victims
(gauge)
Number of selected pods during the latest preemption round
kube_scheduler.pod_preemption.attempts
(count)
Number of preemption attempts in the cluster so far
kube_scheduler.schedule_attempts
(gauge)
Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem.
kube_scheduler.scheduling.algorithm_duration.count
(gauge)
Number of scheduling algorithm latency observations
kube_scheduler.scheduling.algorithm_duration.sum
(gauge)
Total scheduling algorithm latency
kube_scheduler.scheduling.algorithm.predicate_duration.count
(gauge)
Number of scheduling algorithm predicate evaluations
kube_scheduler.scheduling.algorithm.predicate_duration.sum
(gauge)
Total scheduling algorithm predicate evaluation duration
kube_scheduler.scheduling.algorithm.preemption_duration.count
(gauge)
Number of scheduling algorithm preemption evaluations
kube_scheduler.scheduling.algorithm.preemption_duration.sum
(gauge)
Total scheduling algorithm preemption evaluation duration
kube_scheduler.scheduling.algorithm.priority_duration.count
(gauge)
Number of scheduling algorithm priority evaluations
kube_scheduler.scheduling.algorithm.priority_duration.sum
(gauge)
Total scheduling algorithm priority evaluation duration
kube_scheduler.scheduling.e2e_scheduling_duration.count
(gauge)
Number of end-to-end scheduling latency observations (scheduling algorithm + binding)
kube_scheduler.scheduling.e2e_scheduling_duration.sum
(gauge)
Total end-to-end scheduling latency (scheduling algorithm + binding)
kube_scheduler.scheduling.scheduling_duration.count
(gauge)
Number of scheduling latency observations, split by sub-parts of the scheduling operation
kube_scheduler.scheduling.scheduling_duration.quantile
(gauge)
Scheduling latency quantiles split by sub-parts of the scheduling operation
kube_scheduler.scheduling.scheduling_duration.sum
(gauge)
Total scheduling latency split by sub-parts of the scheduling operation
kube_scheduler.threads
(gauge)
Number of OS threads created
kube_scheduler.volume_scheduling_duration.count
(gauge)
Number of volume scheduling stage latency observations
kube_scheduler.volume_scheduling_duration.sum
(gauge)
Total volume scheduling stage latency
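
The .sum and .count pairs listed above can be combined into an average latency. The following is a minimal sketch, assuming both series are cumulative totals (as Prometheus histogram and summary series normally are), so the average over a window comes from the difference between two samples. The function name and the sample values are hypothetical.

```python
from typing import Optional


def mean_latency(sum_start: float, count_start: float,
                 sum_end: float, count_end: float) -> Optional[float]:
    """Average latency in seconds between two samples of a .sum/.count pair.

    Assumes cumulative values. Returns None when no new observations were
    recorded in the window (or when the counters were reset).
    """
    observations = count_end - count_start
    if observations <= 0:
        return None
    return (sum_end - sum_start) / observations


# Hypothetical samples of kube_scheduler.binding_duration.sum and
# kube_scheduler.binding_duration.count taken at two points in time.
average = mean_latency(sum_start=10.0, count_start=200,
                       sum_end=12.5, count_end=250)
print(average)
```

The same calculation applies to any of the other .sum and .count pairs, for example kube_scheduler.scheduling.algorithm_duration.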

Service Checks

kube_scheduler.prometheus.health:
Returns CRITICAL if the Agent cannot reach the metrics endpoints; otherwise returns OK.
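
To illustrate the health logic described above, the following sketch probes a metrics URL and maps the result to OK or CRITICAL. It is not the Agent's implementation. The function name is hypothetical and the URL in the usage comment is an assumption, so substitute the endpoint configured in your kube_scheduler.d/conf.yaml.

```python
import urllib.error
import urllib.request

OK = "OK"
CRITICAL = "CRITICAL"


def endpoint_health(url: str, timeout: float = 5.0) -> str:
    """Return OK if the metrics URL answers with HTTP 200, else CRITICAL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return OK if response.status == 200 else CRITICAL
    except (urllib.error.URLError, OSError, ValueError):
        # Covers connection errors, timeouts, HTTP error statuses,
        # and malformed URLs.
        return CRITICAL


# Example call (assumed address; adjust to your scheduler's metrics endpoint):
# print(endpoint_health("http://localhost:10251/metrics"))
```

Running this from the same host or pod as the Agent can help tell a network or endpoint problem apart from a check configuration problem.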

Events

Kube Scheduler does not include any events.

Troubleshooting

Need help? Contact Datadog support.

