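Collect metrics from Cassandra in real time.
The Cassandra check is included in the Datadog Agent package, so you don't need to install anything else on your Cassandra nodes. Using Oracle's JDK is recommended for this integration.
Note: This check has a limit of 350 metrics per instance. The number of returned metrics is shown on the status page. You can specify the metrics you are interested in by editing the configuration below. To learn how to customize the collected metrics, see the JMX documentation for detailed instructions. If you need to monitor more metrics, contact Datadog support.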
The default configuration in the cassandra.d/conf.yaml file enables collection of Cassandra metrics. See the sample cassandra.d/conf.yaml for all available configuration options.
Restart the Agent.
Available for Agent versions 6.0 and later
For containerized environments, follow the instructions on the Kubernetes Log Collection or Docker Log Collection pages.
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
logs_enabled: true
Add this configuration block to your cassandra.d/conf.yaml file to start collecting your Cassandra logs:
logs:
  - type: file
    path: /var/log/cassandra/*.log
    source: cassandra
    service: myapplication
    log_processing_rules:
      - type: multi_line
        name: log_start_with_date
        # pattern to match: DEBUG [ScheduledTasks:1] 2019-12-30
        pattern: '[A-Z]+ +\[[^\]]+\] +\d{4}-\d{2}-\d{2}'
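Change the path and service parameter values to match your environment. See the sample cassandra.d/conf.yaml for all available configuration options.
To make sure that stack traces are properly aggregated into a single log, you can add a multi-line processing rule.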
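To illustrate what a multi-line rule of this kind does, the following Python sketch applies the pattern from the configuration above to a few lines. It is only an approximation of the behavior, not the Agent's implementation: it assumes the pattern is tested at the beginning of each line, and the sample log lines and the aggregate_multiline helper are hypothetical.

```python
import re

# Same pattern as in the log configuration; it matches a prefix such as
# "DEBUG [ScheduledTasks:1] 2019-12-30".
LOG_START = re.compile(r'[A-Z]+ +\[[^\]]+\] +\d{4}-\d{2}-\d{2}')


def aggregate_multiline(lines):
    """Group raw lines into entries; a line matching LOG_START opens a new entry."""
    entries = []
    for line in lines:
        if LOG_START.match(line) or not entries:
            entries.append(line)
        else:
            # Continuation line (for example a stack-trace frame):
            # attach it to the entry that is currently open.
            entries[-1] += "\n" + line
    return entries


# Hypothetical log lines, made up for this illustration.
sample = [
    "ERROR [ReadStage-2] 2019-12-30 10:15:01,123 Something failed",
    "java.lang.RuntimeException: boom",
    "\tat org.example.Foo.bar(Foo.java:42)",
    "DEBUG [ScheduledTasks:1] 2019-12-30 10:15:02,000 Heartbeat",
]

entries = aggregate_multiline(sample)
print(len(entries))  # 2: the ERROR entry keeps its stack trace, DEBUG stands alone
```

Without the rule, each of the four lines would be treated as a separate log, and the stack-trace frames would lose their connection to the ERROR line.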
Restart the Agent.
Run the Agent's status subcommand and look for cassandra under the Checks section.
cassandra.active_tasks (gauge) | The number of tasks that the thread pool is actively executing. Shown as task |
cassandra.bloom_filter_false_ratio (gauge) | The ratio of Bloom filter false positives to total checks. Shown as fraction |
cassandra.bytes_flushed.count (gauge) | The amount of data that was flushed since (re)start. Shown as byte |
cassandra.cas_commit_latency.75th_percentile (gauge) | The latency of paxos commit round - p75. Shown as microsecond |
cassandra.cas_commit_latency.95th_percentile (gauge) | The latency of paxos commit round - p95. Shown as microsecond |
cassandra.cas_commit_latency.one_minute_rate (gauge) | The number of paxos commit round per second. Shown as operation |
cassandra.cas_prepare_latency.75th_percentile (gauge) | The latency of paxos prepare round - p75. Shown as microsecond |
cassandra.cas_prepare_latency.95th_percentile (gauge) | The latency of paxos prepare round - p95. Shown as microsecond |
cassandra.cas_prepare_latency.one_minute_rate (gauge) | The number of paxos prepare round per second. Shown as operation |
cassandra.cas_propose_latency.75th_percentile (gauge) | The latency of paxos propose round - p75. Shown as microsecond |
cassandra.cas_propose_latency.95th_percentile (gauge) | The latency of paxos propose round - p95. Shown as microsecond |
cassandra.cas_propose_latency.one_minute_rate (gauge) | The number of paxos propose round per second. Shown as operation |
cassandra.col_update_time_delta_histogram.75th_percentile (gauge) | The column update time delta - p75. Shown as microsecond |
cassandra.col_update_time_delta_histogram.95th_percentile (gauge) | The column update time delta - p95. Shown as microsecond |
cassandra.col_update_time_delta_histogram.min (gauge) | The column update time delta - min. Shown as microsecond |
cassandra.compaction_bytes_written.count (gauge) | The amount of data that was compacted since (re)start. Shown as byte |
cassandra.compression_ratio (gauge) | The compression ratio for all SSTables. /!\ A low value means a high compression contrary to what the name suggests. Formula used is: 'size of the compressed SSTable / size of original' Shown as fraction |
cassandra.currently_blocked_tasks (gauge) | The number of currently blocked tasks for the thread pool. Shown as task |
cassandra.currently_blocked_tasks.count (gauge) | The number of currently blocked tasks for the thread pool. Shown as task |
cassandra.db.droppable_tombstone_ratio (gauge) | The estimate of the droppable tombstone ratio. Shown as fraction |
cassandra.dropped.one_minute_rate (gauge) | The tasks dropped during execution for the thread pool. Shown as thread |
cassandra.exceptions.count (gauge) | The number of exceptions thrown from 'Storage' metrics. Shown as error |
cassandra.key_cache_hit_rate (gauge) | The key cache hit rate. Shown as fraction |
cassandra.latency.75th_percentile (gauge) | The client request latency - p75. Shown as microsecond |
cassandra.latency.95th_percentile (gauge) | The client request latency - p95. Shown as microsecond |
cassandra.latency.one_minute_rate (gauge) | The number of client requests. Shown as request |
cassandra.live_disk_space_used.count (gauge) | The disk space used by "live" SSTables (only counts in use files). Shown as byte |
cassandra.live_ss_table_count (gauge) | Number of "live" (in use) SSTables. Shown as file |
cassandra.load.count (gauge) | The disk space used by live data on a node. Shown as byte |
cassandra.max_partition_size (gauge) | The size of the largest compacted partition. Shown as byte |
cassandra.max_row_size (gauge) | The size of the largest compacted row. Shown as byte |
cassandra.mean_partition_size (gauge) | The average size of compacted partition. Shown as byte |
cassandra.mean_row_size (gauge) | The average size of compacted rows. Shown as byte |
cassandra.net.down_endpoint_count (gauge) | The number of unhealthy nodes in the cluster. They represent each individual node's view of the cluster and thus should not be summed across reporting nodes. Shown as node |
cassandra.net.up_endpoint_count (gauge) | The number of healthy nodes in the cluster. They represent each individual node's view of the cluster and thus should not be summed across reporting nodes. Shown as node |
cassandra.pending_compactions (gauge) | The number of pending compactions. Shown as task |
cassandra.pending_flushes.count (gauge) | The number of pending flushes. Shown as flush |
cassandra.pending_tasks (gauge) | The number of pending tasks for the thread pool. Shown as task |
cassandra.range_latency.75th_percentile (gauge) | The local range request latency - p75. Shown as microsecond |
cassandra.range_latency.95th_percentile (gauge) | The local range request latency - p95. Shown as microsecond |
cassandra.range_latency.one_minute_rate (gauge) | The number of local range requests. Shown as request |
cassandra.read_latency.75th_percentile (gauge) | The local read latency - p75. Shown as microsecond |
cassandra.read_latency.95th_percentile (gauge) | The local read latency - p95. Shown as microsecond |
cassandra.read_latency.99th_percentile (gauge) | The local read latency - p99. Shown as microsecond |
cassandra.read_latency.one_minute_rate (gauge) | The number of local read requests. Shown as read |
cassandra.row_cache_hit.count (gauge) | The number of row cache hits. Shown as hit |
cassandra.row_cache_hit_out_of_range.count (gauge) | The number of row cache hits that do not satisfy the query filter and went to disk. Shown as hit |
cassandra.row_cache_miss.count (gauge) | The number of table row cache misses. Shown as miss |
cassandra.snapshots_size (gauge) | The disk space truly used by snapshots. Shown as byte |
cassandra.ss_tables_per_read_histogram.75th_percentile (gauge) | The number of SSTable data files accessed per read - p75. Shown as file |
cassandra.ss_tables_per_read_histogram.95th_percentile (gauge) | The number of SSTable data files accessed per read - p95. Shown as file |
cassandra.timeouts.count (gauge) | Count of requests not acknowledged within configurable timeout window. Shown as timeout |
cassandra.timeouts.one_minute_rate (gauge) | Recent timeout rate, as an exponentially weighted moving average over a one-minute interval. Shown as timeout |
cassandra.tombstone_scanned_histogram.75th_percentile (gauge) | Number of tombstones scanned per read - p75. Shown as record |
cassandra.tombstone_scanned_histogram.95th_percentile (gauge) | Number of tombstones scanned per read - p95. Shown as record |
cassandra.total_blocked_tasks (gauge) | Total blocked tasks. Shown as task |
cassandra.total_blocked_tasks.count (count) | Total count of blocked tasks. Shown as task |
cassandra.total_commit_log_size (gauge) | The size used on disk by commit logs. Shown as byte |
cassandra.total_disk_space_used.count (gauge) | Total disk space used by SSTables including obsolete ones waiting to be GC'd. Shown as byte |
cassandra.view_lock_acquire_time.75th_percentile (gauge) | The time taken acquiring a partition lock for materialized view updates - p75. Shown as microsecond |
cassandra.view_lock_acquire_time.95th_percentile (gauge) | The time taken acquiring a partition lock for materialized view updates - p95. Shown as microsecond |
cassandra.view_lock_acquire_time.one_minute_rate (gauge) | The number of requests to acquire a partition lock for materialized view updates. Shown as request |
cassandra.view_read_time.75th_percentile (gauge) | The time taken during the local read of a materialized view update - p75. Shown as microsecond |
cassandra.view_read_time.95th_percentile (gauge) | The time taken during the local read of a materialized view update - p95. Shown as microsecond |
cassandra.view_read_time.one_minute_rate (gauge) | The number of local reads for materialized view updates. Shown as request |
cassandra.waiting_on_free_memtable_space.75th_percentile (gauge) | The time spent waiting for free memtable space either on- or off-heap - p75. Shown as microsecond |
cassandra.waiting_on_free_memtable_space.95th_percentile (gauge) | The time spent waiting for free memtable space either on- or off-heap - p95. Shown as microsecond |
cassandra.write_latency.75th_percentile (gauge) | The local write latency - p75. Shown as microsecond |
cassandra.write_latency.95th_percentile (gauge) | The local write latency - p95. Shown as microsecond |
cassandra.write_latency.99th_percentile (gauge) | The local write latency - p99. Shown as microsecond |
cassandra.write_latency.one_minute_rate (gauge) | The number of local write requests. Shown as write |
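The cassandra.compression_ratio entry is easy to misread, so here is a small worked example of the formula quoted in its description (size of the compressed SSTable divided by the size of the original). The byte counts and the compression_ratio helper are hypothetical and serve only to show which direction the value moves.

```python
def compression_ratio(compressed_bytes, original_bytes):
    """Return the compressed SSTable size divided by the original data size."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return compressed_bytes / original_bytes


# Hypothetical sizes: 100 MB of original data in both cases.
well_compressed = compression_ratio(25_000_000, 100_000_000)
poorly_compressed = compression_ratio(90_000_000, 100_000_000)

print(well_compressed, poorly_compressed)  # 0.25 0.9
```

The lower value (0.25) corresponds to the data that compressed better, which is why a low reading on this metric means high compression.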
The Cassandra check does not include any events.
cassandra.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Cassandra instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning
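The decision rule described for cassandra.can_connect can be sketched as a small function. This is a reading of the description above, not the Agent's actual code; the function name and its two parameters are assumptions made for illustration.

```python
def can_connect_status(connected, metrics_collected):
    """Map a connection result and a metric count to a service check status."""
    if not connected:
        # The Agent could not connect to the monitored Cassandra instance.
        return "critical"
    if metrics_collected == 0:
        # The connection worked, but no metrics came back.
        return "warning"
    return "ok"


print(can_connect_status(False, 0))   # critical
print(can_connect_status(True, 0))    # warning
print(can_connect_status(True, 120))  # ok
```

A WARNING therefore points at the metric configuration rather than at connectivity, which is a useful distinction when building a monitor on this check.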
Need help? Contact Datadog support.
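This check collects metrics for your Cassandra cluster that are not available through the JMX integration. It uses the nodetool utility to collect them.
The Cassandra Nodetool check is included in the Datadog Agent package, so you don't need to install anything else on your Cassandra nodes.
Follow the instructions below to configure this check for an Agent running on a host. For containerized environments, see the Containerized section.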
Edit the cassandra_nodetool.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample cassandra_nodetool.d/conf.yaml for all available configuration options:
init_config:

instances:
  ## @param keyspaces - list of string - required
  ## The list of keyspaces to monitor.
  ## An empty list results in no metrics being sent.
  #
  - keyspaces:
      - "<KEYSPACE_1>"
      - "<KEYSPACE_2>"
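Restart the Agent.
Cassandra Nodetool logs are collected by the Cassandra integration. See the log collection instructions for Cassandra.
For containerized environments, use the official Prometheus exporter in the pod, then use the Agent's Autodiscovery to find the pod and query the endpoint.
Run the Agent's status subcommand and look for cassandra_nodetool under the Checks section.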
cassandra.nodetool.status.load (gauge) | Amount of file system data under the cassandra data directory without snapshot content. Shown as byte |
cassandra.nodetool.status.owns (gauge) | Percentage of the data owned by the node per datacenter times the replication factor. Shown as percent |
cassandra.nodetool.status.replication_availability (gauge) | Percentage of data available per keyspace times replication factor. Shown as percent |
cassandra.nodetool.status.replication_factor (gauge) | Replication factor per keyspace |
cassandra.nodetool.status.status (gauge) | Node status: up (1) or down (0) |
The cassandra_nodetool check does not include any events.
cassandra.nodetool.node_up
The agent sends this service check for each node of the monitored cluster. Returns CRITICAL if the node is down, otherwise OK.
Statuses: ok, critical
Need help? Contact Datadog support.