Metrics collected by the Agent when deployed on your Kubernetes cluster:
Note: Metrics collected by the Datadog Kubernetes integration may vary depending on the version of Kubernetes in use.
kubernetes.cpu.capacity (gauge) | The number of cores in this machine Shown as core |
kubernetes.cpu.usage.total (gauge) | The number of cores used Shown as nanocore |
kubernetes.cpu.limits (gauge) | The limit of cpu cores set Shown as core |
kubernetes.cpu.requests (gauge) | The requested cpu cores Shown as core |
kubernetes.filesystem.usage (gauge) | The amount of disk used Shown as byte |
kubernetes.filesystem.usage_pct (gauge) | The percentage of disk used Shown as fraction |
kubernetes.memory.capacity (gauge) | The amount of memory (in bytes) in this machine Shown as byte |
kubernetes.memory.limits (gauge) | The limit of memory set Shown as byte |
kubernetes.memory.requests (gauge) | The requested memory Shown as byte |
kubernetes.memory.usage (gauge) | The amount of memory used Shown as byte |
kubernetes.network.rx_bytes (gauge) | The amount of bytes per second received Shown as byte |
kubernetes.network.tx_bytes (gauge) | The amount of bytes per second transmitted Shown as byte |
kubernetes.network_errors (gauge) | The amount of network errors per second Shown as error |
kubernetes.diskio.io_service_bytes.stats.total (gauge) | The amount of disk space the container uses. Shown as byte |
kubernetes.containers.last_state.terminated (gauge) | The number of containers that were previously terminated |
kubernetes.pods.running (gauge) | The number of running pods |
kubernetes.pods.expired (gauge) | The number of expired pods the check ignored |
kubernetes.containers.running (gauge) | The number of running containers |
kubernetes.containers.restarts (gauge) | The number of times the container has been restarted |
kubernetes.containers.state.terminated (gauge) | The number of currently terminated containers |
kubernetes.containers.state.waiting (gauge) | The number of currently waiting containers |
kubernetes.cpu.load.10s.avg (gauge) | Container cpu load average over the last 10 seconds |
kubernetes.cpu.system.total (rate) | The number of cores used for system time Shown as core |
kubernetes.cpu.user.total (rate) | The number of cores used for user time Shown as core |
kubernetes.cpu.cfs.periods (rate) | Number of elapsed enforcement period intervals |
kubernetes.cpu.cfs.throttled.periods (rate) | Number of throttled period intervals |
kubernetes.cpu.cfs.throttled.seconds (rate) | Total time duration the container has been throttled |
kubernetes.cpu.capacity (gauge) | The number of cores in this machine (available until kubernetes v1.18) Shown as core |
kubernetes.cpu.usage.total (gauge) | The number of cores used Shown as nanocore |
kubernetes.cpu.limits (gauge) | The limit of cpu cores set Shown as core |
kubernetes.cpu.requests (gauge) | The requested cpu cores Shown as core |
kubernetes.filesystem.usage (gauge) | The amount of disk used Shown as byte |
kubernetes.filesystem.usage_pct (gauge) | The percentage of disk used Shown as fraction |
kubernetes.io.read_bytes (gauge) | The amount of bytes read from the disk Shown as byte |
kubernetes.io.write_bytes (gauge) | The amount of bytes written to the disk Shown as byte |
kubernetes.memory.capacity (gauge) | The amount of memory (in bytes) in this machine (available until kubernetes v1.18) Shown as byte |
kubernetes.memory.limits (gauge) | The limit of memory set Shown as byte |
kubernetes.memory.sw_limit (gauge) | The limit of swap space set Shown as byte |
kubernetes.memory.requests (gauge) | The requested memory Shown as byte |
kubernetes.memory.usage (gauge) | Current memory usage in bytes including all memory regardless of when it was accessed Shown as byte |
kubernetes.memory.working_set (gauge) | Current working set in bytes - this is what the OOM killer is watching for Shown as byte |
kubernetes.memory.cache (gauge) | The amount of memory that is being used to cache data from disk (e.g. memory contents that can be associated precisely with a block on a block device) Shown as byte |
kubernetes.memory.rss (gauge) | Size of RSS in bytes Shown as byte |
kubernetes.memory.swap (gauge) | The amount of swap currently used by processes in this cgroup Shown as byte |
kubernetes.memory.usage_pct (gauge) | The percentage of memory used Shown as fraction |
kubernetes.memory.sw_in_use (gauge) | The percentage of swap space used Shown as fraction |
kubernetes.network.rx_bytes (gauge) | The amount of bytes per second received Shown as byte |
kubernetes.network.rx_dropped (gauge) | The amount of rx packets dropped per second Shown as packet |
kubernetes.network.rx_errors (gauge) | The amount of rx errors per second Shown as error |
kubernetes.network.tx_bytes (gauge) | The amount of bytes per second transmitted Shown as byte |
kubernetes.network.tx_dropped (gauge) | The amount of tx packets dropped per second Shown as packet |
kubernetes.network.tx_errors (gauge) | The amount of tx errors per second Shown as error |
kubernetes.diskio.io_service_bytes.stats.total (gauge) | The amount of disk space the container uses Shown as byte |
kubernetes.apiserver.certificate.expiration.count (gauge) | The count of remaining lifetime on the certificate used to authenticate a request Shown as second |
kubernetes.apiserver.certificate.expiration.sum (gauge) | The sum of remaining lifetime on the certificate used to authenticate a request Shown as second |
kubernetes.rest.client.requests (gauge) | The number of HTTP requests Shown as operation |
kubernetes.rest.client.latency.count (gauge) | The count of request latency in seconds broken down by verb and URL |
kubernetes.rest.client.latency.sum (gauge) | The sum of request latency in seconds broken down by verb and URL Shown as second |
kubernetes.kubelet.runtime.operations (gauge) | The number of runtime operations Shown as operation |
kubernetes.kubelet.runtime.errors (gauge) | The number of runtime operations errors Shown as operation |
kubernetes.kubelet.network_plugin.latency.sum (gauge) | The sum of latency in microseconds of network plugin operations Shown as microsecond |
kubernetes.kubelet.network_plugin.latency.count (gauge) | The count of network plugin operations by latency |
kubernetes.kubelet.network_plugin.latency.quantile (gauge) | The quantiles of network plugin operations by latency |
kubernetes.kubelet.volume.stats.available_bytes (gauge) | The number of available bytes in the volume Shown as byte |
kubernetes.kubelet.volume.stats.capacity_bytes (gauge) | The capacity in bytes of the volume Shown as byte |
kubernetes.kubelet.volume.stats.used_bytes (gauge) | The number of used bytes in the volume Shown as byte |
kubernetes.kubelet.volume.stats.inodes (gauge) | The maximum number of inodes in the volume Shown as inode |
kubernetes.kubelet.volume.stats.inodes_free (gauge) | The number of free inodes in the volume Shown as inode |
kubernetes.kubelet.volume.stats.inodes_used (gauge) | The number of used inodes in the volume Shown as inode |
kubernetes.ephemeral_storage.limits (gauge) | Ephemeral storage limit of the container (requires kubernetes v1.8+) Shown as byte |
kubernetes.ephemeral_storage.requests (gauge) | Ephemeral storage request of the container (requires kubernetes v1.8+) Shown as byte |
kubernetes.ephemeral_storage.usage (gauge) | Ephemeral storage usage of the pod Shown as byte |
kubernetes.kubelet.evictions (count) | The number of pods that have been evicted from the kubelet (ALPHA in kubernetes v1.16) |
kubernetes.kubelet.cpu.usage (gauge) | The number of cores used by kubelet Shown as nanocore |
kubernetes.kubelet.memory.rss (gauge) | Size of kubelet RSS in bytes Shown as byte |
kubernetes.runtime.cpu.usage (gauge) | The number of cores used by the runtime Shown as nanocore |
kubernetes.runtime.memory.rss (gauge) | Size of runtime RSS in bytes Shown as byte |
kubernetes.kubelet.container.log_filesystem.used_bytes (gauge) | Bytes used by the container's logs on the filesystem (requires kubernetes 1.14+) Shown as byte |
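The table above reports kubernetes.cpu.usage.total in nanocores, while kubernetes.cpu.limits and kubernetes.cpu.requests are in cores, so the values need a unit conversion before they can be compared. The following is a minimal sketch of that arithmetic in Python; the function names are hypothetical and are not part of the Datadog Agent or any Datadog library.

```python
from typing import Optional

# One core equals 10^9 nanocores.
NANOCORES_PER_CORE = 1_000_000_000


def nanocores_to_cores(nanocores: float) -> float:
    """Convert a value reported in nanocores to cores."""
    return nanocores / NANOCORES_PER_CORE


def cpu_limit_utilization(usage_nanocores: float, limit_cores: float) -> Optional[float]:
    """Return usage as a fraction of the CPU limit, or None if no limit is set.

    usage_nanocores: a kubernetes.cpu.usage.total value (nanocores).
    limit_cores: a kubernetes.cpu.limits value (cores).
    """
    if limit_cores <= 0:
        return None
    return nanocores_to_cores(usage_nanocores) / limit_cores


if __name__ == "__main__":
    # A container using 250,000,000 nanocores against a 0.5 core limit.
    print(cpu_limit_utilization(250_000_000, 0.5))
```

With the sample values shown, the container is using a quarter of a core, which is half of its limit.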
kubernetes_state.* metrics are gathered from the kube-state-metrics API.
kubernetes_state.container.ready (gauge) | Whether the container's readiness check succeeded |
kubernetes_state.container.running (gauge) | Whether the container is currently in running state |
kubernetes_state.container.terminated (gauge) | Whether the container is currently in terminated state |
kubernetes_state.container.status_report.count.terminated (gauge) | Count of the containers currently reporting a terminated state, with the reason as a tag |
kubernetes_state.container.waiting (gauge) | Whether the container is currently in waiting state |
kubernetes_state.container.status_report.count.waiting (gauge) | Count of the containers currently reporting a waiting state, with the reason as a tag |
kubernetes_state.container.gpu.request (gauge) | The number of requested gpu devices by a container |
kubernetes_state.container.gpu.limit (gauge) | The limit on gpu devices to be used by a container |
kubernetes_state.container.restarts (gauge) | The number of restarts per container |
kubernetes_state.container.cpu_requested (gauge) | The number of requested cpu cores by a container Shown as cpu |
kubernetes_state.container.memory_requested (gauge) | The number of requested memory bytes by a container Shown as byte |
kubernetes_state.container.cpu_limit (gauge) | The limit on cpu cores to be used by a container Shown as cpu |
kubernetes_state.container.memory_limit (gauge) | The limit on memory to be used by a container Shown as byte |
kubernetes_state.daemonset.scheduled (gauge) | The number of nodes that are running at least one daemon pod and are supposed to |
kubernetes_state.daemonset.misscheduled (gauge) | The number of nodes that are running a daemon pod but are not supposed to |
kubernetes_state.daemonset.desired (gauge) | The number of nodes that should be running the daemon pod |
kubernetes_state.daemonset.ready (gauge) | The number of nodes that should be running the daemon pod and have one or more running and ready |
kubernetes_state.daemonset.updated (gauge) | The number of nodes that run the updated daemon pod spec |
kubernetes_state.deployment.count (gauge) | The number of deployments |
kubernetes_state.deployment.replicas (gauge) | The number of replicas per deployment |
kubernetes_state.deployment.replicas_available (gauge) | The number of available replicas per deployment |
kubernetes_state.deployment.replicas_unavailable (gauge) | The number of unavailable replicas per deployment |
kubernetes_state.deployment.replicas_updated (gauge) | The number of updated replicas per deployment |
kubernetes_state.deployment.replicas_desired (gauge) | The number of desired replicas per deployment |
kubernetes_state.deployment.paused (gauge) | Whether a deployment is paused |
kubernetes_state.deployment.rollingupdate.max_unavailable (gauge) | Maximum number of unavailable replicas during a rolling update |
kubernetes_state.endpoint.address_available (gauge) | Number of addresses available in endpoint |
kubernetes_state.endpoint.address_not_ready (gauge) | Number of addresses not ready in endpoint |
kubernetes_state.endpoint.created (gauge) | Unix creation timestamp |
kubernetes_state.job.count (gauge) | The number of jobs |
kubernetes_state.job.failed (count) | Observed number of failed pods in a job |
kubernetes_state.job.succeeded (count) | Observed number of succeeded pods in a job |
kubernetes_state.limitrange.cpu.min (gauge) | Minimum CPU request for this type |
kubernetes_state.limitrange.cpu.max (gauge) | Maximum CPU limit for this type |
kubernetes_state.limitrange.cpu.default (gauge) | Default CPU limit if not specified |
kubernetes_state.limitrange.cpu.default_request (gauge) | Default CPU request if not specified |
kubernetes_state.limitrange.cpu.max_limit_request_ratio (gauge) | Maximum CPU limit / request ratio |
kubernetes_state.limitrange.memory.min (gauge) | Minimum memory request for this type |
kubernetes_state.limitrange.memory.max (gauge) | Maximum memory limit for this type |
kubernetes_state.limitrange.memory.default (gauge) | Default memory limit if not specified |
kubernetes_state.limitrange.memory.default_request (gauge) | Default memory request if not specified |
kubernetes_state.limitrange.memory.max_limit_request_ratio (gauge) | Maximum memory limit / request ratio |
kubernetes_state.node.count (count) | The number of nodes Shown as node |
kubernetes_state.node.cpu_capacity (gauge) | The total CPU resources of the node Shown as cpu |
kubernetes_state.node.memory_capacity (gauge) | The total memory resources of the node Shown as byte |
kubernetes_state.node.pods_capacity (gauge) | The total pod resources of the node |
kubernetes_state.node.gpu.cards_allocatable (gauge) | The GPU resources of a node that are available for scheduling |
kubernetes_state.node.gpu.cards_capacity (gauge) | The total GPU resources of the node |
kubernetes_state.persistentvolumeclaim.status (gauge) | The phase the persistent volume claim is currently in |
kubernetes_state.persistentvolumeclaim.request_storage (gauge) | Storage space request for a given pvc Shown as byte |
kubernetes_state.persistentvolume.by_phase (gauge) | Number of persistent volumes to sum by phase and storageclass |
kubernetes_state.namespace.count (gauge) | The number of namespaces |
kubernetes_state.node.cpu_allocatable (gauge) | The CPU resources of a node that are available for scheduling Shown as cpu |
kubernetes_state.node.memory_allocatable (gauge) | The memory resources of a node that are available for scheduling Shown as byte |
kubernetes_state.node.pods_allocatable (gauge) | The pod resources of a node that are available for scheduling |
kubernetes_state.node.status (gauge) | Submitted with a value of 1 for each node and tagged either 'status:schedulable' or 'status:unschedulable'; Sum this metric by either status to get the number of nodes in that status. |
kubernetes_state.nodes.by_condition (gauge) | Sum by `condition` and `status` to get the number of nodes in a given condition. |
kubernetes_state.hpa.min_replicas (gauge) | Lower limit for the number of pods that can be set by the autoscaler |
kubernetes_state.hpa.max_replicas (gauge) | Upper limit for the number of pods that can be set by the autoscaler |
kubernetes_state.hpa.desired_replicas (gauge) | Desired number of replicas of pods managed by this autoscaler |
kubernetes_state.hpa.condition (gauge) | Observed condition of autoscalers to sum by condition and status |
kubernetes_state.pdb.pods_desired (gauge) | Minimum desired number of healthy pods |
kubernetes_state.pdb.disruptions_allowed (gauge) | Number of pod disruptions that are currently allowed |
kubernetes_state.pdb.pods_healthy (gauge) | Current number of healthy pods |
kubernetes_state.pdb.pods_total (gauge) | Total number of pods counted by this disruption budget |
kubernetes_state.pod.ready (gauge) | In association with the `condition` tag, whether the pod is ready to serve requests, e.g. `condition:true` keeps the pods that are in a ready state |
kubernetes_state.pod.scheduled (gauge) | Reports the status of the scheduling process for the pod with its tags |
kubernetes_state.pod.unschedulable (gauge) | Reports pods that the Kubernetes scheduler cannot schedule on any node |
kubernetes_state.pod.status_phase (gauge) | Sum by `phase` to get the number of pods in a given phase, and by `namespace` to break this down by namespace |
kubernetes_state.replicaset.count (gauge) | The number of replicasets |
kubernetes_state.replicaset.replicas (gauge) | The number of replicas per ReplicaSet |
kubernetes_state.replicaset.fully_labeled_replicas (gauge) | The number of fully labeled replicas per ReplicaSet |
kubernetes_state.replicaset.replicas_ready (gauge) | The number of ready replicas per ReplicaSet |
kubernetes_state.replicaset.replicas_desired (gauge) | Number of desired pods for a ReplicaSet |
kubernetes_state.replicationcontroller.replicas (gauge) | The number of replicas per ReplicationController |
kubernetes_state.replicationcontroller.fully_labeled_replicas (gauge) | The number of fully labeled replicas per ReplicationController |
kubernetes_state.replicationcontroller.replicas_ready (gauge) | The number of ready replicas per ReplicationController |
kubernetes_state.replicationcontroller.replicas_desired (gauge) | Number of desired replicas for a ReplicationController |
kubernetes_state.replicationcontroller.replicas_available (gauge) | The number of available replicas per ReplicationController |
kubernetes_state.resourcequota.pods.used (gauge) | Observed number of pods used for a resource quota |
kubernetes_state.resourcequota.services.used (gauge) | Observed number of services used for a resource quota |
kubernetes_state.resourcequota.persistentvolumeclaims.used (gauge) | Observed number of persistent volume claims used for a resource quota |
kubernetes_state.resourcequota.services.nodeports.used (gauge) | Observed number of node ports used for a resource quota |
kubernetes_state.resourcequota.services.loadbalancers.used (gauge) | Observed number of loadbalancers used for a resource quota |
kubernetes_state.resourcequota.requests.cpu.used (gauge) | Observed sum of CPU cores requested for a resource quota Shown as cpu |
kubernetes_state.resourcequota.requests.memory.used (gauge) | Observed sum of memory bytes requested for a resource quota Shown as byte |
kubernetes_state.resourcequota.requests.storage.used (gauge) | Observed sum of storage bytes requested for a resource quota Shown as byte |
kubernetes_state.resourcequota.limits.cpu.used (gauge) | Observed sum of limits for CPU cores for a resource quota Shown as cpu |
kubernetes_state.resourcequota.limits.memory.used (gauge) | Observed sum of limits for memory bytes for a resource quota Shown as byte |
kubernetes_state.resourcequota.pods.limit (gauge) | Hard limit of the number of pods for a resource quota |
kubernetes_state.resourcequota.services.limit (gauge) | Hard limit of the number of services for a resource quota |
kubernetes_state.resourcequota.persistentvolumeclaims.limit (gauge) | Hard limit of the number of PVC for a resource quota |
kubernetes_state.resourcequota.services.nodeports.limit (gauge) | Hard limit of the number of node ports for a resource quota |
kubernetes_state.resourcequota.services.loadbalancers.limit (gauge) | Hard limit of the number of loadbalancers for a resource quota |
kubernetes_state.resourcequota.requests.cpu.limit (gauge) | Hard limit on the total of CPU core requested for a resource quota Shown as cpu |
kubernetes_state.resourcequota.requests.memory.limit (gauge) | Hard limit on the total of memory bytes requested for a resource quota Shown as byte |
kubernetes_state.resourcequota.requests.storage.limit (gauge) | Hard limit on the total of storage bytes requested for a resource quota Shown as byte |
kubernetes_state.resourcequota.limits.cpu.limit (gauge) | Hard limit on the sum of CPU core limits for a resource quota Shown as cpu |
kubernetes_state.resourcequota.limits.memory.limit (gauge) | Hard limit on the sum of memory bytes limits for a resource quota Shown as byte |
kubernetes_state.service.count (gauge) | Sum by namespace and type to count active services |
kubernetes_state.statefulset.replicas (gauge) | The number of replicas per statefulset |
kubernetes_state.statefulset.replicas_desired (gauge) | The number of desired replicas per statefulset |
kubernetes_state.statefulset.replicas_current (gauge) | The number of current replicas per StatefulSet |
kubernetes_state.statefulset.replicas_ready (gauge) | The number of ready replicas per StatefulSet |
kubernetes_state.statefulset.replicas_updated (gauge) | The number of updated replicas per StatefulSet |
kubernetes_state.telemetry.payload.size (gauge) | The message size received from kube-state-metrics Shown as byte |
kubernetes_state.telemetry.metrics.processed.count (count) | The number of metrics processed |
kubernetes_state.telemetry.metrics.input.count (count) | The number of metrics received |
kubernetes_state.telemetry.metrics.blacklist.count (count) | The number of metrics blacklisted by the check |
kubernetes_state.telemetry.metrics.ignored.count (count) | The number of metrics ignored by the check |
kubernetes_state.telemetry.collector.metrics.count (count) | The number of metrics by collector (kubernetes object kind) by kubernetes namespaces |
kubernetes_state.vpa.lower_bound (gauge) | The vpa lower bound recommendation |
kubernetes_state.vpa.target (gauge) | The vpa target recommendation |
kubernetes_state.vpa.uncapped_target (gauge) | The vpa uncapped target recommendation |
kubernetes_state.vpa.upperbound (gauge) | The vpa upper bound recommendation |
kubernetes_state.vpa.update_mode (gauge) | The vpa update mode |
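Several kubernetes_state.* metrics, such as kubernetes_state.node.status, are submitted with a value of 1 per object and are meant to be summed by a tag. The sketch below illustrates that aggregation in plain Python on made-up sample points. The `sum_by_tag` helper and the `(value, tags)` data shape are assumptions for illustration only, and do not reflect how Datadog stores or queries metrics.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical shape of one metric point: (value, ["key:value", ...]).
Point = Tuple[float, List[str]]


def sum_by_tag(points: Iterable[Point], tag_key: str) -> Dict[str, float]:
    """Sum point values grouped by the value of the given tag key."""
    prefix = tag_key + ":"
    totals: Dict[str, float] = defaultdict(float)
    for value, tags in points:
        for tag in tags:
            if tag.startswith(prefix):
                totals[tag[len(prefix):]] += value
    return dict(totals)


if __name__ == "__main__":
    # Made-up points: one per node, each with a value of 1.
    node_status = [
        (1, ["node:a", "status:schedulable"]),
        (1, ["node:b", "status:schedulable"]),
        (1, ["node:c", "status:unschedulable"]),
    ]
    print(sum_by_tag(node_status, "status"))
```

The same grouping applies to kubernetes_state.nodes.by_condition (by `condition` and `status`) and kubernetes_state.pod.status_phase (by `phase`).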
kubedns.response_size.bytes.sum (gauge) | Size of the returned response in bytes. Shown as byte |
kubedns.response_size.bytes.count (gauge) | Number of responses on which the kubedns.response_size.bytes.sum metric is evaluated. Shown as response |
kubedns.request_duration.seconds.sum (gauge) | Time (in seconds) each request took to resolve. Shown as second |
kubedns.request_duration.seconds.count (gauge) | Number of requests on which the kubedns.request_duration.seconds.sum metric is evaluated. Shown as request |
kubedns.request_count (gauge) | Total number of DNS requests made. Shown as request |
kubedns.request_count.count (count) | Instant number of DNS requests made. Shown as request |
kubedns.error_count (gauge) | Number of DNS requests resulting in an error. Shown as error |
kubedns.error_count.count (count) | Instant number of DNS requests made resulting in an error. Shown as error |
kubedns.cachemiss_count (gauge) | Number of DNS requests resulting in a cache miss. Shown as request |
kubedns.cachemiss_count.count (count) | Instant number of DNS requests made resulting in a cache miss. Shown as request |
kubeproxy.cpu.time (gauge) | Total user and system CPU time spent in seconds Shown as second |
kubeproxy.mem.resident (gauge) | Resident memory size in bytes Shown as byte |
kubeproxy.mem.virtual (gauge) | Virtual memory size in bytes Shown as byte |
kubeproxy.client.http.requests (gauge) | Number of HTTP requests partitioned by status code method and host Shown as request |
kubeproxy.sync_rules.latency.count (gauge) | SyncProxyRules latency count |
kubeproxy.sync_rules.latency.sum (gauge) | SyncProxyRules latency sum Shown as microsecond |
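Many of the metrics above come in `.sum` and `.count` pairs, for example kubedns.request_duration.seconds.sum and kubedns.request_duration.seconds.count. Dividing the change in the sum by the change in the count gives a mean over an interval. The sketch below assumes both values are cumulative totals sampled at the same two moments, which is typical for Prometheus-style summaries but is not stated in this page. The helper name is hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical sample: (sum_value, count_value) read at the same moment.
Sample = Tuple[float, float]


def interval_mean(previous: Sample, current: Sample) -> Optional[float]:
    """Mean per-event value between two samples of a .sum/.count pair.

    Returns None when no new events were counted (or a counter was reset),
    since the mean is undefined in that case.
    """
    delta_sum = current[0] - previous[0]
    delta_count = current[1] - previous[1]
    if delta_count <= 0:
        return None
    return delta_sum / delta_count


if __name__ == "__main__":
    # Made-up values: 3 seconds of request time spread over 6 new requests.
    print(interval_mean((2.0, 4.0), (5.0, 10.0)))
```

The result carries the unit of the `.sum` metric, so kubeproxy.sync_rules.latency.sum yields microseconds while the kubedns duration yields seconds.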
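As of the 5.17.0 release, the Datadog Agent supports a built-in [leader election option][9] for the Kubernetes event collector. Once it is enabled, you no longer need to deploy an additional event collection container to your cluster. Instead, the Agents coordinate so that only one Agent instance collects events at a time. The following events are available: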
The Kubernetes check includes the following service checks:
kubernetes.kubelet.check: If CRITICAL, either kubernetes.kubelet.check.ping or kubernetes.kubelet.check.syncloop is in a CRITICAL or NO DATA state.
kubernetes.kubelet.check.ping: If CRITICAL or NO DATA, the Kubelet's API is not available.
kubernetes.kubelet.check.syncloop: If CRITICAL or NO DATA, the Kubelet's sync loop that updates containers is not working.
kubernetes_state.node.ready: Returns CRITICAL if a cluster node is not ready. Returns OK otherwise.
kubernetes_state.node.out_of_disk: Returns CRITICAL if a cluster node is out of disk space. Returns OK otherwise.
kubernetes_state.node.disk_pressure: Returns CRITICAL if a cluster node is in a disk pressure state. Returns OK otherwise.
kubernetes_state.node.memory_pressure: Returns CRITICAL if a cluster node is in a memory pressure state. Returns OK otherwise.
kubernetes_state.node.network_unavailable: Returns CRITICAL if a cluster node is in a network unavailable state. Returns OK otherwise.
kubernetes_state.cronjob.on_schedule_check: Returns CRITICAL if the scheduled time of a cron job is in the past. Returns OK otherwise.
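The relationship between kubernetes.kubelet.check and its two sub-checks can be sketched as a small function. This is an illustrative model only: it assumes the aggregate check is CRITICAL exactly when either sub-check is CRITICAL or NO DATA, and OK otherwise. The `Status` enum and function name are hypothetical and are not taken from the Agent's source.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    CRITICAL = "CRITICAL"
    NO_DATA = "NO DATA"


def kubelet_check_status(ping: Status, syncloop: Status) -> Status:
    """Derive an aggregate kubelet status from the ping and syncloop sub-checks.

    Assumption: a sub-check in CRITICAL or NO DATA makes the aggregate CRITICAL.
    """
    failing = {Status.CRITICAL, Status.NO_DATA}
    if ping in failing or syncloop in failing:
        return Status.CRITICAL
    return Status.OK


if __name__ == "__main__":
    # The Kubelet API answers, but the sync loop reports no data.
    print(kubelet_check_status(Status.OK, Status.NO_DATA).value)
```

When the aggregate is CRITICAL, inspecting the two sub-checks separately indicates whether the Kubelet API or the sync loop is the source of the problem.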