Kubernetes Metrics Server

Agent Check

Supported OS: Linux, macOS, Windows

Overview

This check monitors Kubernetes Metrics Server (kube_metrics_server) v0.3.0+, a component used by the Kubernetes control plane.

Setup

Installation

The kube_metrics_server check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. To start collecting your kube_metrics_server performance data, edit the kube_metrics_server.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample kube_metrics_server.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
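
As a rough illustration of step 1, the sketch below builds a minimal configuration and prints it as JSON, which is also valid YAML and can be pasted into conf.yaml. The option name `prometheus_url`, the helper `minimal_config`, and the example URL are assumptions made for this sketch; confirm the exact option names against the sample kube_metrics_server.d/conf.yaml before using it.

```python
import json


def minimal_config(metrics_url: str) -> dict:
    """Return a minimal check configuration as a plain dict (illustrative only)."""
    return {
        "init_config": {},
        # `prometheus_url` is an assumed option name; verify it in the sample file.
        "instances": [{"prometheus_url": metrics_url}],
    }


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the address of your metrics endpoint.
    example_url = "https://metrics-server.example.internal:443/metrics"
    print(json.dumps(minimal_config(example_url), indent=2))
```

Generating the file from a small helper like this is optional; editing conf.yaml by hand works equally well.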

SSL

If your endpoint is secured, additional configuration is required:

  1. Identify the certificate used for securing the metric endpoint.

  2. Mount the related certificate file in the Agent pod.

  3. Apply your SSL configuration. Refer to the default configuration file for more information.
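
Before applying the SSL configuration, it can help to confirm that the mounted certificate actually validates the endpoint. The following is a minimal sketch using only the Python standard library; the function names, the URL, and the certificate path are hypothetical, and it does not cover any authentication your endpoint may also require.

```python
import ssl
import urllib.request
from typing import Optional


def build_tls_context(ca_cert_path: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context, optionally trusting a specific CA file."""
    # With cafile=None, the system's default trust store is used.
    return ssl.create_default_context(cafile=ca_cert_path)


def fetch_metrics(url: str, ca_cert_path: Optional[str] = None,
                  timeout: float = 5.0) -> str:
    """Fetch the metrics payload over TLS; raises if verification fails."""
    context = build_tls_context(ca_cert_path)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metrics("https://metrics-server.example.internal/metrics", "/path/to/ca.crt")` (both values are placeholders) raises an error if the certificate does not match the endpoint, which points to a problem in step 1 or step 2.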

Validation

Run the Agent’s status subcommand and look for kube_metrics_server under the Checks section.

Data Collected

Metrics

kube_metrics_server.authenticated_user.requests (counter): Number of authenticated requests, broken out by username
kube_metrics_server.go.gc_duration_seconds.quantile (gauge): Quantiles of GC invocation durations
kube_metrics_server.go.gc_duration_seconds.sum (gauge): Sum of GC invocation durations
kube_metrics_server.go.gc_duration_seconds.count (gauge): Number of GC invocations
kube_metrics_server.go.goroutines (gauge): Number of goroutines that currently exist
kube_metrics_server.kubelet_summary_request_duration.count (gauge): Number of Kubelet summary requests
kube_metrics_server.kubelet_summary_request_duration.sum (gauge): Sum of Kubelet summary request latencies
kube_metrics_server.kubelet_summary_scrapes_total (counter): Total number of Summary API scrapes attempted by Metrics Server
kube_metrics_server.manager_tick_duration.count (gauge): Number of collect-and-store cycles (manager ticks) observed
kube_metrics_server.manager_tick_duration.sum (gauge): Total time spent collecting and storing metrics
kube_metrics_server.scraper_duration.count (gauge): Number of source scrapes observed
kube_metrics_server.scraper_duration.sum (gauge): Total time spent scraping sources
kube_metrics_server.scraper_last_time (gauge): Time of the last scrape performed by metrics-server, expressed as time since the Unix epoch
kube_metrics_server.process.cpu_seconds_total (counter): Total user and system CPU time spent
kube_metrics_server.process.max_fds (gauge): Maximum number of open file descriptors
kube_metrics_server.process.open_fds (gauge): Number of open file descriptors
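
Several metrics above come in `.sum` and `.count` pairs. Assuming both values are cumulative, as is usual for Prometheus-style summaries and histograms, the average duration over an interval is the change in the sum divided by the change in the count. The helper below is a sketch of that arithmetic; the function name and the sample values are illustrative and not part of the check.

```python
from typing import Optional


def average_duration(prev_sum: float, prev_count: float,
                     cur_sum: float, cur_count: float) -> Optional[float]:
    """Average duration per observation between two samples of a sum/count pair.

    Returns None when no new observations occurred in the interval.
    """
    delta_count = cur_count - prev_count
    if delta_count <= 0:
        # No new observations (or a counter reset); an average is undefined.
        return None
    return (cur_sum - prev_sum) / delta_count


if __name__ == "__main__":
    # Two hypothetical samples of kubelet_summary_request_duration.sum/.count.
    print(average_duration(10.0, 4, 13.0, 10))  # 3.0 / 6 observations = 0.5
```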

Service Checks

kube_metrics_server.prometheus.health:

Returns CRITICAL if the Agent cannot reach the metrics endpoints.
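
The logic behind this service check can be approximated with a simple reachability probe. The sketch below is not the Agent's implementation; `health_status` and its `opener` parameter are hypothetical names, and it treats any connection or HTTP error as unreachable.

```python
import urllib.request


def health_status(url: str, opener=urllib.request.urlopen,
                  timeout: float = 5.0) -> str:
    """Return "OK" if the endpoint answers, "CRITICAL" otherwise."""
    try:
        with opener(url, timeout=timeout) as response:
            response.read(1)  # Read one byte to confirm the endpoint responds.
    except OSError:
        # urllib's URLError and HTTPError, and socket errors, derive from OSError.
        return "CRITICAL"
    return "OK"
```

The `opener` argument exists only so the probe can be exercised without a live endpoint, for example by passing a function that raises `OSError`.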

Events

kube_metrics_server does not include any events.

Troubleshooting

Need help? Contact Datadog support.

