The Cassandra check collects metrics from Cassandra in real time.
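The Cassandra check is included in the Datadog Agent package, so you don't need to install anything else on your Cassandra nodes. We recommend using Oracle's JDK for this integration.
Note: This check has a limit of 350 metrics per instance. The number of returned metrics is shown on the info page. You can specify the metrics you are interested in by editing the configuration described below. For detailed instructions on customizing the metrics to collect, see the JMX documentation. If you need to monitor more metrics than the limit allows, contact the Datadog support team.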
The default configuration of the cassandra.d/conf.yaml file enables the collection of Cassandra metrics. See the sample cassandra.d/conf.yaml for all available configuration options.
Available for Agent versions 6.0 and later
Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file as follows:
logs_enabled: true
To start collecting your Cassandra logs, add this configuration block to your cassandra.d/conf.yaml file:
logs:
  - type: file
    path: /var/log/cassandra/*.log
    source: cassandra
    service: myapplication
    log_processing_rules:
      - type: multi_line
        name: log_start_with_date
        # pattern to match: DEBUG [ScheduledTasks:1] 2019-12-30
        pattern: '[A-Z]+ +\[[^\]]+\] +\d{4}-\d{2}-\d{2}'
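Change the path and service parameter values to match your environment. See the sample cassandra.d/conf.yaml for all available configuration options.
To make sure that stack traces are properly aggregated into a single log, you can add a multi-line processing rule.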
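To see what the multi-line rule does, the following Python sketch applies the same pattern to a few lines and groups them into entries. It is only an illustration, not the Agent's implementation: it assumes the pattern is tested at the start of each line, and the function name `group_multiline` and the sample log lines are hypothetical.

```python
import re

# Same pattern as the log_processing_rules entry in the configuration.
LOG_START = re.compile(r"[A-Z]+ +\[[^\]]+\] +\d{4}-\d{2}-\d{2}")


def group_multiline(lines):
    """Group raw lines into log entries.

    A line that matches LOG_START at its beginning opens a new entry.
    Any other line (for example a stack-trace frame) is appended to the
    entry that precedes it.
    """
    entries = []
    for line in lines:
        if LOG_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


# Hypothetical sample: one error with a two-line stack trace, then a new entry.
sample = [
    "ERROR [ReadStage-2] 2019-12-30 10:15:02,123 Something failed",
    "java.lang.RuntimeException: boom",
    "\tat com.example.Foo.bar(Foo.java:42)",
    "DEBUG [ScheduledTasks:1] 2019-12-30 10:15:03,001 Next entry",
]

entries = group_multiline(sample)
print(len(entries))  # 2
```

The stack-trace lines do not begin with an uppercase level, a bracketed thread name, and a date, so they stay attached to the ERROR line instead of becoming separate logs.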
Run the Agent's status subcommand and look for cassandra under the Checks section.
cassandra.active_tasks (gauge) | The number of tasks that the thread pool is actively executing. Shown as task |
cassandra.bloom_filter_false_ratio (gauge) | The ratio of Bloom filter false positives to total checks. Shown as fraction |
cassandra.bytes_flushed.count (gauge) | The amount of data that was flushed since (re)start. Shown as byte |
cassandra.cas_commit_latency.75th_percentile (gauge) | The latency of paxos commit round - p75. Shown as microsecond |
cassandra.cas_commit_latency.95th_percentile (gauge) | The latency of paxos commit round - p95. Shown as microsecond |
cassandra.cas_commit_latency.one_minute_rate (gauge) | The number of paxos commit round per second. Shown as operation |
cassandra.cas_prepare_latency.75th_percentile (gauge) | The latency of paxos prepare round - p75. Shown as microsecond |
cassandra.cas_prepare_latency.95th_percentile (gauge) | The latency of paxos prepare round - p95. Shown as microsecond |
cassandra.cas_prepare_latency.one_minute_rate (gauge) | The number of paxos prepare round per second. Shown as operation |
cassandra.cas_propose_latency.75th_percentile (gauge) | The latency of paxos propose round - p75. Shown as microsecond |
cassandra.cas_propose_latency.95th_percentile (gauge) | The latency of paxos propose round - p95. Shown as microsecond |
cassandra.cas_propose_latency.one_minute_rate (gauge) | The number of paxos propose round per second. Shown as operation |
cassandra.col_update_time_delta_histogram.75th_percentile (gauge) | The column update time delta - p75. Shown as microsecond |
cassandra.col_update_time_delta_histogram.95th_percentile (gauge) | The column update time delta - p95. Shown as microsecond |
cassandra.col_update_time_delta_histogram.min (gauge) | The column update time delta - min. Shown as microsecond |
cassandra.compaction_bytes_written.count (gauge) | The amount of data that was compacted since (re)start. Shown as byte |
cassandra.compression_ratio (gauge) | The compression ratio for all SSTables. /!\ A low value means a high compression contrary to what the name suggests. Formula used is: 'size of the compressed SSTable / size of original' Shown as fraction |
cassandra.currently_blocked_tasks (gauge) | The number of currently blocked tasks for the thread pool. Shown as task |
cassandra.currently_blocked_tasks.count (gauge) | The number of currently blocked tasks for the thread pool. Shown as task |
cassandra.db.droppable_tombstone_ratio (gauge) | The estimate of the droppable tombstone ratio. Shown as fraction |
cassandra.dropped.one_minute_rate (gauge) | The tasks dropped during execution for the thread pool. Shown as thread |
cassandra.exceptions.count (gauge) | The number of exceptions thrown from 'Storage' metrics. Shown as error |
cassandra.key_cache_hit_rate (gauge) | The key cache hit rate. Shown as fraction |
cassandra.latency.75th_percentile (gauge) | The client request latency - p75. Shown as microsecond |
cassandra.latency.95th_percentile (gauge) | The client request latency - p95. Shown as microsecond |
cassandra.latency.one_minute_rate (gauge) | The number of client requests. Shown as request |
cassandra.live_disk_space_used.count (gauge) | The disk space used by "live" SSTables (only counts in use files). Shown as byte |
cassandra.live_ss_table_count (gauge) | Number of "live" (in use) SSTables. Shown as file |
cassandra.load.count (gauge) | The disk space used by live data on a node. Shown as byte |
cassandra.max_partition_size (gauge) | The size of the largest compacted partition. Shown as byte |
cassandra.max_row_size (gauge) | The size of the largest compacted row. Shown as byte |
cassandra.mean_partition_size (gauge) | The average size of compacted partition. Shown as byte |
cassandra.mean_row_size (gauge) | The average size of compacted rows. Shown as byte |
cassandra.net.down_endpoint_count (gauge) | The number of unhealthy nodes in the cluster. They represent each individual node's view of the cluster and thus should not be summed across reporting nodes. Shown as node |
cassandra.net.up_endpoint_count (gauge) | The number of healthy nodes in the cluster. They represent each individual node's view of the cluster and thus should not be summed across reporting nodes. Shown as node |
cassandra.pending_compactions (gauge) | The number of pending compactions. Shown as task |
cassandra.pending_flushes.count (gauge) | The number of pending flushes. Shown as flush |
cassandra.pending_tasks (gauge) | The number of pending tasks for the thread pool. Shown as task |
cassandra.range_latency.75th_percentile (gauge) | The local range request latency - p75. Shown as microsecond |
cassandra.range_latency.95th_percentile (gauge) | The local range request latency - p95. Shown as microsecond |
cassandra.range_latency.one_minute_rate (gauge) | The number of local range requests. Shown as request |
cassandra.read_latency.75th_percentile (gauge) | The local read latency - p75. Shown as microsecond |
cassandra.read_latency.95th_percentile (gauge) | The local read latency - p95. Shown as microsecond |
cassandra.read_latency.99th_percentile (gauge) | The local read latency - p99. Shown as microsecond |
cassandra.read_latency.one_minute_rate (gauge) | The number of local read requests. Shown as read |
cassandra.row_cache_hit_out_of_range.count (gauge) | The number of row cache hits that do not satisfy the query filter and went to disk. Shown as hit |
cassandra.row_cache_hit.count (gauge) | The number of row cache hits. Shown as hit |
cassandra.row_cache_miss.count (gauge) | The number of table row cache misses. Shown as miss |
cassandra.snapshots_size (gauge) | The disk space truly used by snapshots. Shown as byte |
cassandra.ss_tables_per_read_histogram.75th_percentile (gauge) | The number of SSTable data files accessed per read - p75. Shown as file |
cassandra.ss_tables_per_read_histogram.95th_percentile (gauge) | The number of SSTable data files accessed per read - p95. Shown as file |
cassandra.timeouts.count (gauge) | Count of requests not acknowledged within configurable timeout window. Shown as timeout |
cassandra.timeouts.one_minute_rate (gauge) | Recent timeout rate, as an exponentially weighted moving average over a one-minute interval. Shown as timeout |
cassandra.tombstone_scanned_histogram.75th_percentile (gauge) | Number of tombstones scanned per read - p75. Shown as record |
cassandra.tombstone_scanned_histogram.95th_percentile (gauge) | Number of tombstones scanned per read - p95. Shown as record |
cassandra.total_blocked_tasks (gauge) | Total blocked tasks. Shown as task |
cassandra.total_blocked_tasks.count (count) | Total count of blocked tasks. Shown as task |
cassandra.total_commit_log_size (gauge) | The size used on disk by commit logs. Shown as byte |
cassandra.total_disk_space_used.count (gauge) | Total disk space used by SSTables including obsolete ones waiting to be GC'd. Shown as byte |
cassandra.view_lock_acquire_time.75th_percentile (gauge) | The time taken acquiring a partition lock for materialized view updates - p75. Shown as microsecond |
cassandra.view_lock_acquire_time.95th_percentile (gauge) | The time taken acquiring a partition lock for materialized view updates - p95. Shown as microsecond |
cassandra.view_lock_acquire_time.one_minute_rate (gauge) | The number of requests to acquire a partition lock for materialized view updates. Shown as request |
cassandra.view_read_time.75th_percentile (gauge) | The time taken during the local read of a materialized view update - p75. Shown as microsecond |
cassandra.view_read_time.95th_percentile (gauge) | The time taken during the local read of a materialized view update - p95. Shown as microsecond |
cassandra.view_read_time.one_minute_rate (gauge) | The number of local reads for materialized view updates. Shown as request |
cassandra.waiting_on_free_memtable_space.75th_percentile (gauge) | The time spent waiting for free memtable space either on- or off-heap - p75. Shown as microsecond |
cassandra.waiting_on_free_memtable_space.95th_percentile (gauge) | The time spent waiting for free memtable space either on- or off-heap - p95. Shown as microsecond |
cassandra.write_latency.75th_percentile (gauge) | The local write latency - p75. Shown as microsecond |
cassandra.write_latency.95th_percentile (gauge) | The local write latency - p95. Shown as microsecond |
cassandra.write_latency.99th_percentile (gauge) | The local write latency - p99. Shown as microsecond |
cassandra.write_latency.one_minute_rate (gauge) | The number of local write requests. Shown as write |
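The description of cassandra.compression_ratio in the table above is easy to misread, so here is a small worked example of the formula it states (compressed SSTable size divided by original size). The function name and the byte counts are hypothetical and chosen only for illustration.

```python
def compression_ratio(compressed_bytes: int, original_bytes: int) -> float:
    """Return compressed size / original size, as stated in the metric description."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return compressed_bytes / original_bytes


# Hypothetical SSTable: 100 MB of data stored in 25 MB on disk.
ratio = compression_ratio(25_000_000, 100_000_000)
print(ratio)      # 0.25
print(1 - ratio)  # 0.75, the fraction of space saved
```

A value of 0.25 therefore means strong compression, while a value close to 1 means the data barely compresses.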
The Cassandra check does not include any events.
cassandra.can_connect
Returns CRITICAL if the Agent is unable to connect to the monitored Cassandra instance and collect metrics from it. Otherwise, returns OK.
Statuses: ok, critical
Need help? Contact the Datadog support team.
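This check collects metrics for your Cassandra cluster that cannot be collected through the jmx integration. It uses the nodetool utility to collect them.
The Cassandra Nodetool check is included in the Datadog Agent package, so you don't need to install anything else on your Cassandra nodes.
Follow the instructions below to configure this check for an Agent running on a host. For containerized environments, see the Containerized section.
Edit the cassandra_nodetool.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample cassandra_nodetool.d/conf.yaml for all available configuration options.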
init_config:

instances:

  ## @param keyspaces - list of string - required
  ## The list of keyspaces to monitor.
  ## An empty list results in no metrics being sent.
  #
  - keyspaces:
      - "<KEYSPACE_1>"
      - "<KEYSPACE_2>"
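Cassandra Nodetool logs are collected by the Cassandra integration. See the log collection instructions for Cassandra.
For containerized environments, use the official Prometheus exporter in the pod, then use Autodiscovery in the Agent to find the pod and query the endpoint.
Run the Agent's status subcommand and look for cassandra_nodetool under the Checks section.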
cassandra.nodetool.status.replication_availability (gauge) | Percentage of data available per keyspace times replication factor. Shown as percent |
cassandra.nodetool.status.replication_factor (gauge) | Replication factor per keyspace |
cassandra.nodetool.status.status (gauge) | Node status: up (1) or down (0) |
cassandra.nodetool.status.owns (gauge) | Percentage of the data owned by the node per datacenter times the replication factor. Shown as percent |
cassandra.nodetool.status.load (gauge) | Amount of file system data under the cassandra data directory without snapshot content. Shown as byte |
The Cassandra_nodetool check does not include any events.
cassandra.nodetool.node_up
The agent sends this service check for each node of the monitored cluster. Returns CRITICAL if the node is down, otherwise OK.
Statuses: ok, critical
Need help? Contact the Datadog support team.