System Check

Supported OS: Linux, macOS, Windows

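Overview

Get metrics from your base system about CPU, I/O, load, memory, swap, and uptime. The System Core and System Swap checks, described later on this page, are also system-related.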

Setup

Installation

The System check is included in the Datadog Agent package, so no additional installation is needed on your server.

Data Collected

Metrics

system.cpu.context_switches
(count)
Count of the number of context switches
system.cpu.guest
(gauge)
The percent of time the CPU spent running the virtual processor. Only applies to hypervisors.
Shown as percent
system.cpu.idle
(gauge)
Percent of time the CPU spent in an idle state.
Shown as percent
system.cpu.interrupt
(gauge)
The percentage of time that the processor is spending on handling Interrupts.
Shown as percent
system.cpu.iowait
(gauge)
The percent of time the CPU spent waiting for IO operations to complete.
Shown as percent
system.cpu.stolen
(gauge)
The percent of time the virtual CPU spent waiting for the hypervisor to service another virtual CPU. Only applies to virtual machines.
Shown as percent
system.cpu.system
(gauge)
The percent of time the CPU spent running the kernel.
Shown as percent
system.cpu.user
(gauge)
The percent of time the CPU spent running user space processes.
Shown as percent
system.cpu.num_cores
(gauge)
The number of CPU cores
system.fs.file_handles.allocated
(gauge)
Number of allocated file handles over the system.
Shown as file
system.fs.file_handles.allocated_unused
(gauge)
Number of allocated file handles unused over the system.
Shown as file
system.fs.file_handles.in_use
(gauge)
The number of allocated file handles in use as a fraction of the system maximum.
Shown as fraction
system.fs.file_handles.max
(gauge)
The maximum number of allocated file handles on the system.
Shown as file
system.fs.file_handles.used
(gauge)
Number of allocated file handles used over the system.
Shown as file
system.fs.inodes.free
(gauge)
The number of free inodes.
Shown as inode
system.fs.inodes.in_use
(gauge)
The number of inodes in use as a fraction of the total.
Shown as fraction
system.fs.inodes.total
(gauge)
The total number of inodes.
Shown as inode
system.fs.inodes.used
(gauge)
The number of inodes in use.
Shown as inode
system.io.avg_q_sz
(gauge)
The average queue size of requests issued to the device.
Shown as request
system.io.avg_rq_sz
(gauge)
The average size of requests issued to the device (Linux only).
Shown as sector
system.io.await
(gauge)
The average time for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them (Linux only).
Shown as millisecond
system.io.block_in
(gauge)
The number of I/O blocks read per second.
Shown as block
system.io.block_out
(gauge)
The number of I/O blocks written per second.
Shown as block
system.io.bytes_per_s
(gauge)
Byte transfer rate for this device (Datadog Agent v5 on Darwin-Mac only).
Shown as byte
system.io.r_await
(gauge)
The average time for read requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them (Linux only).
Shown as millisecond
system.io.r_s
(gauge)
The number of read requests issued to the device per second.
Shown as request
system.io.rkb_s
(gauge)
The number of kibibytes read from the device per second.
Shown as kibibyte
system.io.rrqm_s
(gauge)
The number of read requests merged per second that were queued to the device (Linux only).
Shown as request
system.io.svctm
(gauge)
The average service time for requests issued to the device (Linux only).
Shown as millisecond
system.io.util
(gauge)
The percent of CPU time during which I/O requests were issued to the device (Linux only).
Shown as percent
system.io.w_await
(gauge)
The average time for write requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them (Linux only).
Shown as millisecond
system.io.w_s
(gauge)
The number of write requests issued to the device per second.
Shown as request
system.io.wkb_s
(gauge)
The number of kibibytes written to the device per second.
Shown as kibibyte
system.io.wrqm_s
(gauge)
The number of write requests merged per second that were queued to the device (Linux only).
Shown as request
system.load.1
(gauge)
The average system load over one minute. (Linux only)
system.load.15
(gauge)
The average system load over fifteen minutes. (Linux only)
system.load.5
(gauge)
The average system load over five minutes. (Linux only)
system.load.norm.1
(gauge)
The average system load over one minute normalized by the number of CPUs. (Linux only)
system.load.norm.15
(gauge)
The average system load over fifteen minutes normalized by the number of CPUs. (Linux only)
system.load.norm.5
(gauge)
The average system load over five minutes normalized by the number of CPUs. (Linux only)
system.mem.buffered
(gauge)
The amount of physical RAM used for file buffers.
Shown as byte
system.mem.cached
(gauge)
The amount of physical RAM used as cache memory.
Shown as byte
system.mem.commit_limit
(gauge)
The total amount of memory currently available to be allocated on the system, based on the overcommit ratio. (Linux only)
Shown as byte
system.mem.committed
(gauge)
The amount of physical memory for which space has been reserved on the disk paging file in case it must be written back to disk.
Shown as byte
system.mem.committed_as
(gauge)
The amount of memory presently allocated on the system, even if it has not been "used" by processes as of yet. (Linux only)
Shown as byte
system.mem.free
(gauge)
The amount of free RAM.
Shown as byte
system.mem.nonpaged
(gauge)
The amount of physical memory used by the OS for objects that cannot be written to disk, but must remain in physical memory as long as they are allocated.
Shown as byte
system.mem.page_free
(gauge)
The amount of the page file that's free. Reported by Windows Agents in versions < 5.12.
Shown as byte
system.mem.page_pct_free
(gauge)
The amount of the page file in use as a fraction of the total. Reported by Windows Agents in versions < 5.12.
Shown as fraction
system.mem.page_tables
(gauge)
The amount of memory dedicated to the lowest page table level. Reported by Windows Agents in versions < 5.12.
Shown as byte
system.mem.page_total
(gauge)
The total size of the page file. Reported by Windows Agents in versions < 5.12.
Shown as byte
system.mem.page_used
(gauge)
The amount of the page file in use. Reported by Windows Agents in versions < 5.12.
Shown as byte
system.mem.paged
(gauge)
The amount of physical memory used by the OS for objects that can be written to disk when they are not in use.
Shown as byte
system.mem.pagefile.free
(gauge)
The maximum amount of memory the Agent process can commit, in bytes. This value is equal to or smaller than the system-wide available commit value. See MEMORYSTATUSEX::ullAvailPageFile. Reported by Windows Agents from version 5.12 to 6.0 and 6/7.14 onwards.
Shown as byte
system.mem.pagefile.pct_free
(gauge)
The maximum amount of memory the Agent process can commit as a fraction of the current committed memory limit. Reported by Windows Agents from version 5.12 to 6.0 and 6/7.14 onwards.
Shown as fraction
system.mem.pagefile.total
(gauge)
The current committed memory limit for the system or the Agent process, whichever is smaller, in bytes. See MEMORYSTATUSEX::ullTotalPageFile. Reported by Windows Agents from version 5.12 to 6.0 and 6/7.14 onwards.
Shown as byte
system.mem.pagefile.used
(gauge)
The current committed memory limit minus the maximum amount of memory the Agent process can commit. Reported by Windows Agents from version 5.12 to 6.0 and 6/7.14 onwards.
Shown as byte
system.mem.pct_usable
(gauge)
The amount of usable physical RAM as a fraction of the total.
Shown as fraction
system.mem.shared
(gauge)
The amount of physical RAM used as shared memory.
Shown as byte
system.mem.slab
(gauge)
The amount of memory used by the kernel to cache data structures for its own use.
Shown as byte
system.mem.slab_reclaimable
(gauge)
The part of slab memory that might be reclaimed (i.e. caches)
Shown as byte
system.mem.total
(gauge)
The total amount of physical RAM.
Shown as byte
system.mem.usable
(gauge)
Value of MemAvailable from /proc/meminfo if present, but falls back to adding free + buffered + cached memory if not.
Shown as byte
system.mem.used
(gauge)
The amount of RAM in use.
Shown as byte
system.proc.count
(gauge)
The number of processes (Windows only).
Shown as process
system.proc.queue_length
(gauge)
The number of threads that are observed as delayed in the processor ready queue and are waiting to be executed (Windows only).
Shown as thread
system.swap.cached
(gauge)
The amount of swap used as cache memory.
Shown as byte
system.swap.free
(gauge)
The amount of free swap space.
Shown as byte
system.swap.pct_free
(gauge)
The amount of swap space not in use as a fraction of the total.
Shown as fraction
system.swap.swapped_in
(gauge)
Bytes of memory swapped in
system.swap.swapped_out
(gauge)
Bytes of memory swapped out
system.swap.total
(gauge)
The total amount of swap space.
Shown as byte
system.swap.used
(gauge)
The amount of swap space in use.
Shown as byte
system.swap.swap_in
(gauge)
The amount of memory swapped in.
Shown as byte
system.swap.swap_out
(gauge)
The amount of memory swapped out.
Shown as byte
system.uptime
(gauge)
The amount of time the system has been working and available.
Shown as second
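
Two of the metrics above are derived from other values: system.load.norm.* divides the load average by the number of CPUs, and system.mem.usable falls back to free + buffered + cached memory when MemAvailable is absent. The sketch below illustrates both rules as they are described in this list. The function names and the /proc/meminfo-style input are hypothetical, and this is not the Agent's actual implementation.

```python
def normalized_load(load: float, cpu_count: int) -> float:
    """Load average divided by the number of CPUs (as in system.load.norm.*)."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load / cpu_count


def usable_bytes(meminfo_text: str) -> int:
    """Usable memory in bytes, computed from /proc/meminfo-style text.

    Uses MemAvailable when present; otherwise falls back to
    MemFree + Buffers + Cached, as the system.mem.usable description states.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts:
            # The fields used below are reported in kB; convert to bytes.
            fields[name.strip()] = int(parts[0]) * 1024
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    return (
        fields.get("MemFree", 0)
        + fields.get("Buffers", 0)
        + fields.get("Cached", 0)
    )
```

On a Linux host, the input text could come from reading /proc/meminfo, and the load average from os.getloadavg().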

Events

The System check does not include any events.

Service Checks

The System check does not include any service checks.

Tags

All system metrics are automatically tagged with host:<HOST_NAME>. In addition, the following namespaces are tagged with device:<DEVICE_NAME>:

  • system.disk.*
  • system.fs.inodes.*
  • system.io.*
  • system.net.*
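
The tagging rule above can be sketched as a small lookup: every metric gets a host tag, and metrics in the four listed namespaces also get a device tag. The helper name expected_tags and the sample host and device values are hypothetical and exist only for illustration.

```python
from fnmatch import fnmatchcase
from typing import List, Optional

# Namespaces listed above as carrying the device tag.
DEVICE_TAGGED = (
    "system.disk.*",
    "system.fs.inodes.*",
    "system.io.*",
    "system.net.*",
)


def expected_tags(metric: str, host: str, device: Optional[str] = None) -> List[str]:
    """Return the automatic tags a system metric is expected to carry."""
    tags = [f"host:{host}"]
    in_device_namespace = any(fnmatchcase(metric, p) for p in DEVICE_TAGGED)
    if device is not None and in_device_namespace:
        tags.append(f"device:{device}")
    return tags
```

For example, expected_tags("system.io.await", "web-1", "sda") includes a device tag, while expected_tags("system.cpu.idle", "web-1", "sda") does not.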

System Core

This check collects the number of CPU cores on a host and CPU times such as system, user, and idle.

Setup

Installation

The System Core check is included in the Datadog Agent package, so no additional installation is needed on your server.

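Configuration

  1. Edit the system_core.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample system_core.d/conf.yaml for all available configuration options. Note: At least one entry is required under instances to enable the check. For example: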

    init_config:
    instances:
        # Placeholder entry; at least one item under instances enables the check.
        - foo: bar
          tags:
              - key:value
    
  2. Restart the Agent.
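
The note in step 1 (at least one entry under instances) can be expressed as a quick pre-flight check on the parsed configuration. This is a minimal sketch, assuming the YAML file has already been loaded into a Python dict; has_instance is a hypothetical helper, not part of the Agent.

```python
def has_instance(config: dict) -> bool:
    """Return True when the config has at least one entry under 'instances'."""
    instances = config.get("instances")
    return isinstance(instances, list) and len(instances) > 0


# Dict equivalent of the sample configuration shown in step 1.
example = {
    "init_config": None,
    "instances": [{"foo": "bar", "tags": ["key:value"]}],
}
```

With this helper, has_instance(example) is true, while a config whose instances key is empty or missing fails the check.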

Validation

Run the Agent's status subcommand and look for system_core under the Checks section.

Data Collected

Metrics

system.core.count
(gauge)
The number of CPU cores on the host
Shown as core
system.core.user
(gauge)
The percentage of time a given CPU core has spent in user mode
Shown as percent
system.core.user.total
(gauge)
The percentage of time the whole CPU has spent in user mode
Shown as percent
system.core.system
(gauge)
The percentage of time a given CPU core has spent in kernel mode
Shown as percent
system.core.system.total
(gauge)
The percentage of time the whole CPU has spent in kernel mode
Shown as percent
system.core.idle
(gauge)
The percentage of time a given CPU core has spent idle
Shown as percent
system.core.idle.total
(gauge)
The percentage of time the whole CPU has spent idle
Shown as percent
system.core.nice
(gauge)
[Unix] The percentage of time a given CPU core has spent in niced (prioritized) processes in user mode
Shown as percent
system.core.nice.total
(gauge)
[Unix] The percentage of time the whole CPU has spent in niced (prioritized) processes in user mode
Shown as percent
system.core.guest
(gauge)
[Linux] The percentage of time a given CPU core has spent running a virtual CPU for guest operating systems under the control of the Linux kernel
Shown as percent
system.core.guest.total
(gauge)
[Linux] The percentage of time the whole CPU has spent running a virtual CPU for guest operating systems under the control of the Linux kernel
Shown as percent
system.core.iowait
(gauge)
[Linux] The percentage of time a given CPU core has spent waiting for I/O to complete
Shown as percent
system.core.iowait.total
(gauge)
[Linux] The percentage of time the whole CPU has spent waiting for I/O to complete
Shown as percent
system.core.irq
(gauge)
[Linux, BSD] The percentage of time a given CPU core has spent servicing hardware interrupts
Shown as percent
system.core.irq.total
(gauge)
[Linux, BSD] The percentage of time the whole CPU has spent servicing hardware interrupts
Shown as percent
system.core.softirq
(gauge)
[Linux, BSD] The percentage of time a given CPU core has spent servicing software interrupts
Shown as percent
system.core.softirq.total
(gauge)
[Linux, BSD] The percentage of time the whole CPU has spent servicing software interrupts
Shown as percent
system.core.guest_nice
(gauge)
[Linux] The percentage of time a given CPU core has spent running a niced guest
Shown as percent
system.core.guest_nice.total
(gauge)
[Linux] The percentage of time the whole CPU has spent running a niced guest
Shown as percent
system.core.steal
(gauge)
[Linux] The percentage of time a given virtual CPU core has spent waiting for the hypervisor to service another virtual CPU
Shown as percent
system.core.steal.total
(gauge)
[Linux] The percentage of time the whole virtual CPU has spent waiting for the hypervisor to service another virtual CPU
Shown as percent
system.core.interrupt
(gauge)
[Windows] The percentage of time a given CPU core has spent servicing hardware interrupts
Shown as percent
system.core.interrupt.total
(gauge)
[Windows] The percentage of time the whole CPU has spent servicing hardware interrupts
Shown as percent
system.core.dpc
(gauge)
[Windows] The percentage of time a given CPU core has spent servicing deferred procedure calls (DPCs)
Shown as percent
system.core.dpc.total
(gauge)
[Windows] The percentage of time the whole CPU has spent servicing deferred procedure calls (DPCs)
Shown as percent
system.core.frequency
(gauge)
The frequency or clock speed of a given CPU
Shown as megahertz

Depending on the platform, the check may collect other CPU time metrics, such as system.core.interrupt on Windows and system.core.iowait on Linux.
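
The per-core metrics above are percentages of time spent in each mode. One common way to derive such percentages is to take two snapshots of cumulative CPU time counters and divide each mode's delta by the total delta. The sketch below shows that arithmetic. The function name and sample values are hypothetical, and whether the Agent computes its values exactly this way is an assumption.

```python
from typing import Dict


def core_percentages(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, float]:
    """Percentage of elapsed CPU time spent in each mode between two snapshots.

    prev and curr map a mode name (user, system, idle, ...) to cumulative
    time for one core, in any consistent unit.
    """
    deltas = {mode: curr[mode] - prev.get(mode, 0.0) for mode in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between snapshots; report zero for every mode.
        return {mode: 0.0 for mode in curr}
    return {mode: 100.0 * delta / total for mode, delta in deltas.items()}
```

Platform-specific modes such as iowait or interrupt fit the same calculation as extra keys in the snapshots.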

Events

The System Core check does not include any events.

Service Checks

datadog.agent.up
Returns OK if the Agent is running properly. Alerts are created if the host does not respond.
Statuses: ok, critical

System Swap

This check monitors the number of bytes a host has swapped in and out.

Setup

Installation

The System Swap check is included in the Datadog Agent package, so no additional installation is needed on your server.

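Configuration

  1. Edit the system_swap.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample system_swap.d/conf.yaml for all available configuration options. Note: This check takes no initial configuration.

  2. Restart the Agent.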

Validation

Run the Agent's status subcommand and look for system_swap under the Checks section.

Data Collected

Metrics

system.swap.swapped_in
(gauge)
Bytes of memory swapped in
system.swap.swapped_out
(gauge)
Bytes of memory swapped out
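
On Linux, the kernel exposes cumulative swap activity in /proc/vmstat as the pswpin and pswpout counters, measured in pages. The sketch below converts those counters to bytes, which is the unit the two metrics above report. The function name and the default page size are hypothetical choices for illustration, and it is an assumption that the Agent derives its values from the same counters.

```python
from typing import Tuple


def swapped_bytes(vmstat_text: str, page_size: int = 4096) -> Tuple[int, int]:
    """Return (bytes swapped in, bytes swapped out) from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each line is "name value"; keep only the two swap page counters.
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    pages_in = counters.get("pswpin", 0)
    pages_out = counters.get("pswpout", 0)
    return pages_in * page_size, pages_out * page_size
```

On a real host, the page size could be obtained from os.sysconf("SC_PAGE_SIZE") instead of relying on the default.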

Events

The System Swap check does not include any events.

Service Checks

The System Swap check does not include any service checks.