Collect Etcd metrics to do the following:
The Etcd check is included in the Datadog Agent package, so you don't need to install anything else on your Etcd instances.
To configure this check for an Agent running on a host:
Edit the etcd.d/conf.yaml file in the conf.d/ folder to start collecting your Etcd performance data. See the sample etcd.d/conf.yaml for all available configuration options.
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
logs_enabled: true
Uncomment and edit this configuration block at the bottom of your etcd.d/conf.yaml:
logs:
  - type: file
    path: "<LOG_FILE_PATH>"
    source: etcd
    service: "<SERVICE_NAME>"
Change the path and service parameter values to match your environment. See the sample etcd.d/conf.yaml for all available configuration options.
Restart the Agent.
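Putting the host steps above together, a minimal etcd.d/conf.yaml might look like the sketch below. The `prometheus_url` key is the one used in the Autodiscovery template for this integration; the `localhost:2379` address is an assumption for an etcd member running on the same host as the Agent, and the `<LOG_FILE_PATH>` and `<SERVICE_NAME>` placeholders still need real values.

```yaml
# etcd.d/conf.yaml (sketch; adjust to your environment)
init_config:

instances:
    # Assumed endpoint: etcd serving metrics locally on its default client port.
  - prometheus_url: http://localhost:2379/metrics

# Only read when logs_enabled: true is set in datadog.yaml.
logs:
  - type: file
    path: "<LOG_FILE_PATH>"
    source: etcd
    service: "<SERVICE_NAME>"
```

Restart the Agent after saving the file so the new configuration is loaded.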
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
Parameter | Value |
---|---|
<INTEGRATION_NAME> | etcd |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"prometheus_url": "http://%%host%%:2379/metrics"} |
Log collection is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "etcd", "service": "<SERVICE_NAME>"} |
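As an illustration of how the parameters in the two tables above could be applied, the following sketch uses Kubernetes pod annotations in the `ad.datadoghq.com/<CONTAINER_NAME>.*` form. The pod name, the container name `etcd`, and the `<ETCD_IMAGE>` placeholder are assumptions for this example; the container name in each annotation key must match the name of the container in the pod spec.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd            # hypothetical pod name
  annotations:
    # <INTEGRATION_NAME>
    ad.datadoghq.com/etcd.check_names: '["etcd"]'
    # <INIT_CONFIG>
    ad.datadoghq.com/etcd.init_configs: '[{}]'
    # <INSTANCE_CONFIG>
    ad.datadoghq.com/etcd.instances: '[{"prometheus_url": "http://%%host%%:2379/metrics"}]'
    # <LOG_CONFIG>
    ad.datadoghq.com/etcd.logs: '[{"source": "etcd", "service": "<SERVICE_NAME>"}]'
spec:
  containers:
    - name: etcd        # must match the identifier used in the annotation keys
      image: "<ETCD_IMAGE>"
```

The `%%host%%` template variable is resolved by the Agent to the container's address at runtime, so it is left unquoted inside the URL.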
Run the Agent's status subcommand and look for etcd under the Checks section.
Metric | Description |
---|---|
etcd.debugging.mvcc.db.compaction.keys.total (count) | Total number of db keys compacted. Shown as key |
etcd.debugging.mvcc.db.compaction.pause.duration.milliseconds (gauge) | Bucketed histogram of db compaction pause duration. Shown as millisecond |
etcd.debugging.mvcc.db.compaction.total.duration.milliseconds (gauge) | Bucketed histogram of db compaction total duration. Shown as millisecond |
etcd.debugging.mvcc.db.total.size.in_bytes (gauge) | Total size of the underlying database in bytes. Shown as byte |
etcd.debugging.mvcc.delete.total (count) | Total number of deletes seen by this member. Shown as query |
etcd.debugging.mvcc.events.total (count) | Total number of events sent by this member. Shown as event |
etcd.debugging.mvcc.index.compaction.pause.duration.milliseconds (gauge) | Bucketed histogram of index compaction pause duration. Shown as millisecond |
etcd.debugging.mvcc.keys.total (gauge) | Total number of keys. Shown as key |
etcd.debugging.mvcc.pending.events.total (gauge) | Total number of pending events to be sent. Shown as event |
etcd.debugging.mvcc.put.total (count) | Total number of puts seen by this member. Shown as query |
etcd.debugging.mvcc.range.total (count) | Total number of ranges seen by this member. Shown as query |
etcd.debugging.mvcc.slow_watcher.total (gauge) | Total number of unsynced slow watchers. Shown as connection |
etcd.debugging.mvcc.txn.total (count) | Total number of txns seen by this member. Shown as transaction |
etcd.debugging.mvcc.watch_stream.total (gauge) | Total number of watch streams. Shown as connection |
etcd.debugging.mvcc.watcher.total (gauge) | Total number of watchers. Shown as connection |
etcd.debugging.server.lease.expired.total (count) | The total number of expired leases. Shown as item |
etcd.debugging.snap.save.marshalling.duration.seconds (gauge) | The marshalling cost distributions of save called by snapshot. Shown as second |
etcd.debugging.snap.save.total.duration.seconds (gauge) | The total latency distributions of save called by snapshot. Shown as second |
etcd.debugging.store.expires.total (count) | Total number of expired keys. Shown as key |
etcd.debugging.store.reads.total (count) | Total number of reads action by (get/getRecursive), local to this member. Shown as read |
etcd.debugging.store.watch.requests.total (count) | Total number of incoming watch requests (new or reestablished). Shown as request |
etcd.debugging.store.watchers (gauge) | Count of currently active watchers. Shown as connection |
etcd.debugging.store.writes.total (count) | Total number of writes (e.g. set/compareAndDelete) seen by this member. Shown as write |
etcd.disk.backend.commit.duration.seconds (gauge) | The latency distributions of commit called by backend. Shown as second |
etcd.disk.backend.snapshot.duration.seconds (gauge) | The latency distribution of backend snapshots. Shown as second |
etcd.disk.wal.fsync.duration.seconds.count (count) | The count of latency distributions of fsync called by wal. Shown as second |
etcd.disk.wal.fsync.duration.seconds.sum (gauge) | The sum of latency distributions of fsync called by wal. Shown as second |
etcd.disk.wal.write.bytes.total (gauge) | Total number of bytes written in WAL. Shown as byte |
etcd.etcd.server.client.requests.total (count) | The total number of client requests per client version. Shown as request |
etcd.go.gc.duration.seconds (gauge) | A summary of the GC invocation durations. Shown as second |
etcd.go.goroutines (gauge) | Number of goroutines that currently exist. Shown as thread |
etcd.go.info (gauge) | Information about the Go environment. Shown as item |
etcd.go.memstats.alloc.bytes (gauge) | Number of bytes allocated and still in use. Shown as byte |
etcd.go.memstats.alloc.bytes.total (count) | Total number of bytes allocated, even if freed. Shown as byte |
etcd.go.memstats.buck.hash.sys.bytes (gauge) | Number of bytes used by the profiling bucket hash table. Shown as byte |
etcd.go.memstats.frees.total (count) | Total number of frees. Shown as occurrence |
etcd.go.memstats.gc.cpu.fraction (gauge) | The fraction of this program's available CPU time used by the GC since the program started. Shown as cpu |
etcd.go.memstats.gc.sys.bytes (gauge) | Number of bytes used for garbage collection system metadata. Shown as byte |
etcd.go.memstats.heap.alloc.bytes (gauge) | Number of heap bytes allocated and still in use. Shown as byte |
etcd.go.memstats.heap.idle.bytes (gauge) | Number of heap bytes waiting to be used. Shown as byte |
etcd.go.memstats.heap.inuse.bytes (gauge) | Number of heap bytes that are in use. Shown as byte |
etcd.go.memstats.heap.objects (gauge) | Number of allocated objects. Shown as item |
etcd.go.memstats.heap.released.bytes (gauge) | Number of heap bytes released to OS. Shown as byte |
etcd.go.memstats.heap.sys.bytes (gauge) | Number of heap bytes obtained from system. Shown as byte |
etcd.go.memstats.last.gc.time.seconds (gauge) | Number of seconds since 1970 of last garbage collection. Shown as second |
etcd.go.memstats.lookups.total (count) | Total number of pointer lookups. Shown as occurrence |
etcd.go.memstats.mallocs.total (count) | Total number of mallocs. Shown as occurrence |
etcd.go.memstats.mcache.inuse.bytes (gauge) | Number of bytes in use by mcache structures. Shown as byte |
etcd.go.memstats.mcache.sys.bytes (gauge) | Number of bytes used for mcache structures obtained from system. Shown as byte |
etcd.go.memstats.mspan.inuse.bytes (gauge) | Number of bytes in use by mspan structures. Shown as byte |
etcd.go.memstats.mspan.sys.bytes (gauge) | Number of bytes used for mspan structures obtained from system. Shown as byte |
etcd.go.memstats.next.gc.bytes (gauge) | Number of heap bytes when next garbage collection will take place. Shown as byte |
etcd.go.memstats.other.sys.bytes (gauge) | Number of bytes used for other system allocations. Shown as byte |
etcd.go.memstats.stack.inuse.bytes (gauge) | Number of bytes in use by the stack allocator. Shown as byte |
etcd.go.memstats.stack.sys.bytes (gauge) | Number of bytes obtained from system for stack allocator. Shown as byte |
etcd.go.memstats.sys.bytes (gauge) | Number of bytes obtained from system. Shown as byte |
etcd.go.threads (gauge) | Number of OS threads created. Shown as thread |
etcd.grpc.proxy.cache.hits.total (gauge) | Total number of cache hits. Shown as occurrence |
etcd.grpc.proxy.cache.keys.total (gauge) | Total number of keys/ranges cached. Shown as item |
etcd.grpc.proxy.cache.misses.total (gauge) | Total number of cache misses. Shown as occurrence |
etcd.grpc.proxy.events.coalescing.total (count) | Total number of events coalescing. Shown as event |
etcd.grpc.proxy.watchers.coalescing.total (gauge) | Total number of current watchers coalescing. Shown as connection |
etcd.grpc.server.handled.total (count) | Total number of RPCs completed on the server, regardless of success or failure. Shown as operation |
etcd.grpc.server.msg.received.total (count) | Total number of RPC stream messages received on the server. Shown as operation |
etcd.grpc.server.msg.sent.total (count) | Total number of gRPC stream messages sent by the server. Shown as operation |
etcd.grpc.server.started.total (count) | Total number of RPCs started on the server. Shown as operation |
etcd.leader.counts.fail (gauge) | Rate of failed Raft RPC requests (ETCD API V2 only). Shown as request |
etcd.leader.counts.success (gauge) | Rate of successful Raft RPC requests (ETCD API V2 only). Shown as request |
etcd.leader.latency.avg (gauge) | Average latency to each peer in the cluster (ETCD API V2 only). Shown as millisecond |
etcd.leader.latency.current (gauge) | Current latency to each peer in the cluster (ETCD API V2 only). Shown as millisecond |
etcd.leader.latency.max (gauge) | Maximum latency to each peer in the cluster (ETCD API V2 only). Shown as millisecond |
etcd.leader.latency.min (gauge) | Minimum latency to each peer in the cluster (ETCD API V2 only). Shown as millisecond |
etcd.leader.latency.stddev (gauge) | Standard deviation of latency to each peer in the cluster (ETCD API V2 only). Shown as millisecond |
etcd.mvcc.db.total.size.in_use.bytes (gauge) | Total size of the underlying database logically in use. Shown as byte |
etcd.network.active_peers (gauge) | The current number of active peer connections. Shown as connection |
etcd.network.client.grpc.received.bytes.total (count) | The total number of bytes received from grpc clients. Shown as byte |
etcd.network.client.grpc.sent.bytes.total (count) | The total number of bytes sent to grpc clients. Shown as byte |
etcd.network.disconnected_peers.total (count) | The total number of disconnected peers. Shown as connection |
etcd.network.peer.received.bytes.total (count) | The total number of bytes received from peers. Shown as byte |
etcd.network.peer.received.failures.total (count) | The total number of receive failures from peers. Shown as event |
etcd.network.peer.round_trip_time.seconds (gauge) | Round-Trip-Time histogram between peers. Shown as second |
etcd.network.peer.sent.bytes.total (count) | The total number of bytes sent to peers. Shown as byte |
etcd.network.peer.sent.failures.total (count) | The total number of send failures from peers. Shown as event |
etcd.network.snapshot.receive.failures.total (count) | Total number of snapshot receive failures. Shown as event |
etcd.network.snapshot.receive.inflights.total (gauge) | Total number of inflight snapshot receives. Shown as event |
etcd.network.snapshot.receive.success.total (count) | Total number of successful snapshot receives. Shown as event |
etcd.network.snapshot.receive.total.duration.seconds.count (gauge) | Total latency distributions of v3 snapshot receives. Shown as second |
etcd.network.snapshot.receive.total.duration.seconds.sum (gauge) | Total latency distributions of v3 snapshot receives. Shown as second |
etcd.network.snapshot.send.failures.total (count) | Total number of snapshot send failures. Shown as event |
etcd.network.snapshot.send.inflights.total (gauge) | Total number of inflight snapshot sends. Shown as event |
etcd.network.snapshot.send.sucess.total (count) | Total number of successful snapshot sends. Shown as event |
etcd.network.snapshot.send.total.duration.seconds.count (gauge) | Total latency distributions of v3 snapshot sends. Shown as second |
etcd.network.snapshot.send.total.duration.seconds.sum (gauge) | Total latency distributions of v3 snapshot sends. Shown as second |
etcd.os.fd.limit (gauge) | The file descriptor limit. Shown as object |
etcd.os.fd.used (gauge) | The number of used file descriptors. Shown as object |
etcd.process.cpu.seconds.total (count) | Total user and system CPU time spent in seconds. Shown as cpu |
etcd.process.max.fds (gauge) | Maximum number of open file descriptors. Shown as item |
etcd.process.open.fds (gauge) | Number of open file descriptors. Shown as item |
etcd.process.resident.memory.bytes (gauge) | Resident memory size in bytes. Shown as byte |
etcd.process.start.time.seconds (gauge) | Start time of the process since unix epoch in seconds. Shown as second |
etcd.process.virtual.memory.bytes (gauge) | Virtual memory size in bytes. Shown as byte |
etcd.self.recv.appendrequest.count (gauge) | Rate of append requests this node has processed (ETCD API V2 only). Shown as request |
etcd.self.recv.bandwidthrate (gauge) | Rate of bytes received (ETCD API V2 only). Shown as byte |
etcd.self.recv.pkgrate (gauge) | Rate of packets received (ETCD API V2 only). Shown as packet |
etcd.self.send.appendrequest.count (gauge) | Rate of append requests this node has sent (ETCD API V2 only). Shown as request |
etcd.self.send.bandwidthrate (gauge) | Rate of bytes sent (ETCD API V2 only). Shown as byte |
etcd.self.send.pkgrate (gauge) | Rate of packets sent (ETCD API V2 only). Shown as packet |
etcd.server.apply.slow.total (count) | The total number of slow apply requests (likely overloaded from slow disk). Shown as request |
etcd.server.go_version (gauge) | Which Go version the server is running with. Always 1, with a label carrying the current version. Shown as unit |
etcd.server.has_leader (gauge) | Whether or not a leader exists. 1 if a leader exists, 0 otherwise. Shown as check |
etcd.server.health.failures.total (count) | The total number of failed health checks. Shown as event |
etcd.server.health.success.total (count) | The total number of successful health checks. Shown as event |
etcd.server.heartbeat.send.failures.total (count) | The total number of leader heartbeat send failures (likely overloaded from slow disk). Shown as event |
etcd.server.is_leader (gauge) | Whether or not this member is a leader. 1 if is, 0 otherwise. Shown as check |
etcd.server.leader.changes.seen.total (count) | The number of leader changes seen. Shown as event |
etcd.server.lease.expired.total (count) | The total number of expired leases. Shown as occurrence |
etcd.server.proposals.applied.total (gauge) | The total number of consensus proposals applied. Shown as occurrence |
etcd.server.proposals.committed.total (gauge) | The total number of consensus proposals committed. Shown as occurrence |
etcd.server.proposals.failed.total (count) | The total number of failed proposals seen. Shown as occurrence |
etcd.server.proposals.pending (gauge) | The current number of pending proposals to commit. Shown as occurrence |
etcd.server.quota.backend.bytes (gauge) | Current backend storage quota size in bytes. Shown as byte |
etcd.server.read_indexes.failed.total (count) | The total number of failed read indexes seen. Shown as event |
etcd.server.read_indexes.slow.total (count) | The total number of pending read indexes not in sync with leader or timed out read index requests. Shown as event |
etcd.server.version (gauge) | Which version is running. 1 for 'server_version' label with current version. Shown as item |
etcd.snap.db.fsync.duration.seconds.count (gauge) | The latency distributions of fsyncing .snap.db file. Shown as second |
etcd.snap.db.fsync.duration.seconds.sum (gauge) | The latency distributions of fsyncing .snap.db file. Shown as second |
etcd.snap.db.save.total.duration.seconds.count (gauge) | The total latency distributions of v3 snapshot save. Shown as second |
etcd.snap.db.save.total.duration.seconds.sum (gauge) | The total latency distributions of v3 snapshot save. Shown as second |
etcd.snap.fsync.duration.seconds.count (gauge) | The latency distributions of fsync called by snap. Shown as second |
etcd.snap.fsync.duration.seconds.sum (gauge) | The latency distributions of fsync called by snap. Shown as second |
etcd.store.compareanddelete.fail (gauge) | Rate of failed compare and delete requests (ETCD API V2 only). Shown as request |
etcd.store.compareanddelete.success (gauge) | Rate of successful compare and delete requests (ETCD API V2 only). Shown as request |
etcd.store.compareandswap.fail (gauge) | Rate of failed compare and swap requests (ETCD API V2 only). Shown as request |
etcd.store.compareandswap.success (gauge) | Rate of successful compare and swap requests (ETCD API V2 only). Shown as request |
etcd.store.create.fail (gauge) | Rate of failed create requests (ETCD API V2 only). Shown as request |
etcd.store.create.success (gauge) | Rate of successful create requests (ETCD API V2 only). Shown as request |
etcd.store.delete.fail (gauge) | Rate of failed delete requests (ETCD API V2 only). Shown as request |
etcd.store.delete.success (gauge) | Rate of successful delete requests (ETCD API V2 only). Shown as request |
etcd.store.expire.count (gauge) | Rate of expired keys (ETCD API V2 only). Shown as eviction |
etcd.store.gets.fail (gauge) | Rate of failed get requests (ETCD API V2 only). Shown as request |
etcd.store.gets.success (gauge) | Rate of successful get requests (ETCD API V2 only). Shown as request |
etcd.store.sets.fail (gauge) | Rate of failed set requests (ETCD API V2 only). Shown as request |
etcd.store.sets.success (gauge) | Rate of successful set requests (ETCD API V2 only). Shown as request |
etcd.store.update.fail (gauge) | Rate of failed update requests (ETCD API V2 only). Shown as request |
etcd.store.update.success (gauge) | Rate of successful update requests (ETCD API V2 only). Shown as request |
etcd.store.watchers (gauge) | Rate of watchers (ETCD API V2 only). |
Etcd metrics are tagged with etcd_state:leader or etcd_state:follower, depending on the node status, so you can easily aggregate metrics by status.
The Etcd check does not include any events.
etcd.can_connect
Returns CRITICAL if unable to get metrics from etcd (timeout or non-200 HTTP code). This service check is only available on the legacy version of the etcd check.
Statuses: ok, critical
etcd.healthy
Returns CRITICAL when a member is unhealthy. This service check is only available on the legacy version of the etcd check.
Statuses: ok, critical, unknown
etcd.prometheus.health
Returns CRITICAL if the check cannot access a metrics endpoint. Otherwise, returns OK. This service check is only available when use_preview is enabled.
Statuses: ok, critical
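Because etcd.prometheus.health depends on use_preview, a configuration that reports it might look like the sketch below. This assumes use_preview is a boolean set on the instance in etcd.d/conf.yaml and that etcd exposes metrics on `localhost:2379`; check the sample etcd.d/conf.yaml for the option's actual placement and default before relying on it.

```yaml
# etcd.d/conf.yaml (sketch)
init_config:

instances:
  - prometheus_url: http://localhost:2379/metrics   # assumed local endpoint
    # Assumption: instance-level flag that selects the non-legacy check,
    # which is the version that emits etcd.prometheus.health.
    use_preview: true
```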
Need help? Contact Datadog support.