Supported OS: Linux, macOS, Windows

Integration version 3.3.1


Overview

Use this check to track node, vnode, and ring performance metrics from RiakKV or RiakTS.

Setup

Installation

The Riak check is included in the Datadog Agent package. No additional installation is needed on your Riak servers.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection

  1. Edit the riak.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample riak.yaml for all available configuration options.

    init_config:
    
    instances:
      ## @param url - string - required
      ## Riak stats url to connect to.
      #
      - url: http://127.0.0.1:8098/stats
    
  2. Restart the Agent to start sending Riak metrics to Datadog.
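To get a feel for what the check reads from the URL configured above, here is a minimal sketch. It is not the Agent's implementation. It assumes the stats endpoint returns a flat JSON object whose keys match the metric names listed under Metrics without the `riak.` prefix (for example `node_gets`). The function names and the `WANTED` list are hypothetical.

```python
import json
from urllib.request import urlopen

# Hypothetical selection of stats keys; assumed to be metric names minus "riak."
WANTED = ("node_gets", "node_puts", "memory_total")


def extract_metrics(stats, wanted=WANTED):
    """Return {"riak.<key>": value} for each wanted key present in the stats dict."""
    return {f"riak.{key}": stats[key] for key in wanted if key in stats}


def fetch_stats(url="http://127.0.0.1:8098/stats", timeout=5):
    """Fetch and decode the JSON document served at the stats URL."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

On a host where Riak listens on the default URL, `extract_metrics(fetch_stats())` would return the selected values. Keys missing from the response are skipped.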

Log collection

Available for Agent versions 6.0 and later

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. To start collecting your Riak logs, add this configuration block to your riak.d/conf.yaml file:

      logs:
        - type: file
          path: /var/log/riak/console.log
          source: riak
          service: "<SERVICE_NAME>"
    
        - type: file
          path: /var/log/riak/error.log
          source: riak
          service: "<SERVICE_NAME>"
          log_processing_rules:
            - type: multi_line
              name: new_log_start_with_date
              pattern: \d{4}\-\d{2}\-\d{2}
    
        - type: file
          path: /var/log/riak/crash.log
          source: riak
          service: "<SERVICE_NAME>"
          log_processing_rules:
            - type: multi_line
              name: new_log_start_with_date
              pattern: \d{4}\-\d{2}\-\d{2}
    
  3. Restart the Agent.
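The `multi_line` rules above treat a line that begins with a date as the start of a new log entry and attach every other line to the entry before it. The sketch below illustrates that grouping under the assumption that the pattern is matched at the start of each line. It is not the Agent's code, and `group_multiline` is a hypothetical helper.

```python
import re

# Same pattern as in the log_processing_rules above.
NEW_LOG_START = re.compile(r"\d{4}\-\d{2}\-\d{2}")


def group_multiline(lines, pattern=NEW_LOG_START):
    """Merge continuation lines into the preceding line that starts with a date."""
    events = []
    for line in lines:
        # A dated line (or the very first line) opens a new event.
        if pattern.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events
```

With this grouping, a crash report spread over several lines is kept together as one entry instead of being split into one entry per line.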

Containerized

For containerized environments, see the Autodiscovery Integration Templates guide and apply the parameters below.

Metric collection

Parameter             Value
<INTEGRATION_NAME>    riak
<INIT_CONFIG>         blank or {}
<INSTANCE_CONFIG>     {"url":"http://%%host%%:8098/stats"}
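The `%%host%%` template variable in the instance configuration is replaced with the address of the discovered container before the check runs. As a rough illustration only, with a hypothetical `resolve_instance` helper and an invented IP address:

```python
import json

# Instance template from the parameters above.
INSTANCE_TEMPLATE = '{"url":"http://%%host%%:8098/stats"}'


def resolve_instance(template, host):
    """Replace the %%host%% template variable and parse the resulting JSON."""
    return json.loads(template.replace("%%host%%", host))


# Example with an invented container address.
instance = resolve_instance(INSTANCE_TEMPLATE, "10.0.0.5")
```

Here `instance["url"]` ends up pointing at the stats endpoint of that one container.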
Log collection

Available for Agent versions 6.0 and later

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes log collection.

Parameter       Value
<LOG_CONFIG>    {"source": "riak", "service": "riak", "log_processing_rules": [{"type": "multi_line", "name": "new_log_start_with_date", "pattern": "\\d{4}\\-\\d{2}\\-\\d{2}"}]}
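Because the `<LOG_CONFIG>` value is JSON, every backslash in the regular expression has to be escaped, which is easy to get wrong by hand. One way to avoid that is to generate the string. This is a minimal sketch that assumes the processing rules are supplied as a list, as in the host configuration. The variable names are hypothetical.

```python
import json

log_config = {
    "source": "riak",
    "service": "riak",
    "log_processing_rules": [
        {
            "type": "multi_line",
            "name": "new_log_start_with_date",
            # Raw string: the regex contains single backslashes.
            "pattern": r"\d{4}\-\d{2}\-\d{2}",
        }
    ],
}

# json.dumps doubles each backslash so the regex survives JSON parsing.
LOG_CONFIG_JSON = json.dumps(log_config)
```

Parsing `LOG_CONFIG_JSON` with any JSON parser yields the original pattern with single backslashes.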

Validation

Run the Agent's status subcommand and look for riak under the Checks section.
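If you want to script this verification, something like the following could work. It is a sketch under two assumptions: the Agent binary is available on the PATH as `datadog-agent` (it may need elevated privileges), and the status output lists each check on a line that begins with its name. For simplicity it searches the whole output rather than only the Checks section. Both function names are hypothetical.

```python
import subprocess


def check_is_listed(status_output, check_name="riak"):
    """Return True if any line of the status output starts with the check name."""
    return any(
        line.strip().startswith(check_name) for line in status_output.splitlines()
    )


def agent_status():
    """Run the Agent's status subcommand and return its standard output."""
    result = subprocess.run(
        ["datadog-agent", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a host with the Agent installed, `check_is_listed(agent_status())` reports whether the riak check appears in the output.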

Data Collected

Metrics

riak.memory_atom
(gauge)
Total amount of memory currently allocated for atom storage
Shown as byte
riak.memory_atom_used
(gauge)
Total amount of memory currently used for atom storage
Shown as byte
riak.memory_binary
(gauge)
Total amount of memory used for binaries
Shown as byte
riak.memory_code
(gauge)
Total amount of memory allocated for Erlang code
Shown as byte
riak.memory_ets
(gauge)
Total memory allocated for Erlang Term Storage
Shown as byte
riak.memory_processes
(gauge)
Total amount of memory allocated for Erlang processes
Shown as byte
riak.memory_processes_used
(gauge)
Total amount of memory used by Erlang processes
Shown as byte
riak.memory_total
(gauge)
Total allocated memory (sum of processes and system)
Shown as byte
riak.node_get_fsm_active_60s
(gauge)
Number of active GET FSMs
riak.node_get_fsm_in_rate
(gauge)
Average number of GET FSMs enqueued by Sidejob
riak.node_get_fsm_objsize_100
(gauge)
Object size encountered by this node
Shown as byte
riak.node_get_fsm_objsize_95
(gauge)
Object size encountered by this node
Shown as byte
riak.node_get_fsm_objsize_99
(gauge)
Object size encountered by this node
Shown as byte
riak.node_get_fsm_objsize_mean
(gauge)
Object size encountered by this node
Shown as byte
riak.node_get_fsm_objsize_median
(gauge)
Object size encountered by this node
Shown as byte
riak.node_get_fsm_out_rate
(gauge)
Average number of GET FSMs dequeued by Sidejob
riak.node_get_fsm_rejected_60s
(gauge)
Number of GET FSMs actively being rejected by Sidejob's overload protection
riak.node_get_fsm_siblings_100
(gauge)
Number of siblings encountered during all GET operations by this node
Shown as node
riak.node_get_fsm_siblings_95
(gauge)
Number of siblings encountered during all GET operations by this node
Shown as node
riak.node_get_fsm_siblings_99
(gauge)
Number of siblings encountered during all GET operations by this node
Shown as node
riak.node_get_fsm_siblings_mean
(gauge)
Number of siblings encountered during all GET operations by this node
Shown as node
riak.node_get_fsm_siblings_median
(gauge)
Number of siblings encountered during all GET operations by this node
Shown as node
riak.node_get_fsm_time_100
(gauge)
Time between reception of client GET request and subsequent response to client
Shown as microsecond
riak.node_get_fsm_time_95
(gauge)
Time between reception of client GET request and subsequent response to client
Shown as microsecond
riak.node_get_fsm_time_99
(gauge)
Time between reception of client GET request and subsequent response to client
Shown as microsecond
riak.node_get_fsm_time_mean
(gauge)
Time between reception of client GET request and subsequent response to client
Shown as microsecond
riak.node_get_fsm_time_median
(gauge)
Time between reception of client GET request and subsequent response to client
Shown as microsecond
riak.node_gets
(count)
Number of GETs coordinated by this node
Shown as operation
riak.node_put_fsm_active_60s
(gauge)
Number of active PUT FSMs
riak.node_put_fsm_in_rate
(gauge)
Average number of PUT FSMs enqueued by Sidejob
riak.node_put_fsm_out_rate
(gauge)
Average number of PUT FSMs dequeued by Sidejob
riak.node_put_fsm_rejected_60s
(gauge)
Number of PUT FSMs actively being rejected by Sidejob's overload protection
riak.node_put_fsm_time_100
(gauge)
Time between reception of client PUT request and subsequent response to client
Shown as microsecond
riak.node_put_fsm_time_95
(gauge)
Time between reception of client PUT request and subsequent response to client
Shown as microsecond
riak.node_put_fsm_time_99
(gauge)
Time between reception of client PUT request and subsequent response to client
Shown as microsecond
riak.node_put_fsm_time_mean
(gauge)
Time between reception of client PUT request and subsequent response to client
Shown as microsecond
riak.node_put_fsm_time_median
(gauge)
Time between reception of client PUT request and subsequent response to client
Shown as microsecond
riak.node_puts
(gauge)
Number of PUTs coordinated by this node
Shown as operation
riak.pbc_active
(gauge)
Number of active protocol buffers connections
Shown as connection
riak.pbc_connects
(gauge)
Number of protocol buffers connections
Shown as connection
riak.read_repairs
(gauge)
Number of read repair operations this node has coordinated in the last minute
Shown as operation
riak.search_index_fail_count
(gauge)
Total number of documents that have failed to index
Shown as object
riak.search_index_fail_one
(gauge)
Number of documents that have failed to index in the past one minute
Shown as object
riak.search_index_latency_95
(gauge)
Time between insertion of document and it being indexed: 95th percentile
Shown as microsecond
riak.search_index_latency_99
(gauge)
Time between insertion of document and it being indexed: 99th percentile
Shown as microsecond
riak.search_index_latency_999
(gauge)
Time between insertion of document and it being indexed: 99.9th percentile
Shown as microsecond
riak.search_index_latency_max
(gauge)
Time between insertion of document and it being indexed: max
Shown as microsecond
riak.search_index_latency_mean
(gauge)
Time between insertion of document and it being indexed: mean
Shown as microsecond
riak.search_index_latency_median
(gauge)
Time between insertion of document and it being indexed: median
Shown as microsecond
riak.search_index_latency_min
(gauge)
Time between insertion of document and it being indexed: min
Shown as microsecond
riak.search_index_throughput_count
(gauge)
Total number of documents that have been indexed
Shown as operation
riak.search_index_throughput_one
(gauge)
Number of documents that have been indexed in the last one minute
Shown as operation
riak.search_query_fail_count
(gauge)
Total number of queries that have failed
Shown as event
riak.search_query_fail_one
(gauge)
Number of queries that have failed in the last one minute
Shown as event
riak.search_query_latency_95
(gauge)
Time between reception of query and response: 95th percentile
Shown as microsecond
riak.search_query_latency_99
(gauge)
Time between reception of query and response: 99th percentile
Shown as microsecond
riak.search_query_latency_999
(gauge)
Time between reception of query and response: 99.9th percentile
Shown as microsecond
riak.search_query_latency_max
(gauge)
Time between reception of query and response: max
Shown as microsecond
riak.search_query_latency_mean
(gauge)
Time between reception of query and response: mean
Shown as microsecond
riak.search_query_latency_median
(gauge)
Time between reception of query and response: median
Shown as microsecond
riak.search_query_latency_min
(gauge)
Time between reception of query and response: min
Shown as microsecond
riak.search_query_throughput_count
(gauge)
Total number of queries that have been performed
Shown as operation
riak.search_query_throughput_one
(gauge)
Number of searches that have been performed in the last one minute
Shown as operation
riak.vnode_gets
(gauge)
Number of GET operations coordinated by vnodes on this node
Shown as operation
riak.vnode_index_deletes
(gauge)
Number of vnode index delete operations
Shown as operation
riak.vnode_index_reads
(gauge)
Number of vnode index read operations
Shown as read
riak.vnode_index_writes
(gauge)
Number of vnode index write operations
Shown as write
riak.vnode_puts
(count)
Number of PUT operations coordinated by vnodes on this node
Shown as operation

Events

The Riak check does not include any events.

Service Checks

riak.can_connect
Returns CRITICAL if the Agent cannot connect to the monitored Riak instance. Otherwise, returns OK.
Statuses: ok, critical
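The rule behind this service check can be sketched in a few lines. This illustrates the described behavior and is not the Agent's implementation. `can_connect_status` and its `fetch` argument (any callable that contacts the Riak instance and raises on failure) are hypothetical.

```python
OK, CRITICAL = "ok", "critical"


def can_connect_status(fetch):
    """Return CRITICAL if calling fetch() raises, OK otherwise."""
    try:
        fetch()
    except Exception:
        return CRITICAL
    return OK
```

Passing in a function that requests the stats URL gives a status you can compare with what the Agent reports.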

Troubleshooting

Need help? Contact Datadog support.