Consul

Supported OS Linux Windows

Integration v2.2.0

Consul dashboard

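Overview

The Datadog Agent collects many metrics from Consul nodes, including:

  • Total number of Consul peers
  • Service health - for a given service, the number of its nodes that are UP, PASSING, WARNING, or CRITICAL
  • Node health - for a given node, the number of its services that are UP, PASSING, WARNING, or CRITICAL
  • Network coordinates - inter- and intra-datacenter latencies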
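
As a rough illustration of what the service health counts mean, the sketch below tallies check statuses per service. It assumes entries shaped like the objects returned by Consul's `/v1/health/state/any` endpoint (fields `Node`, `ServiceName`, `Status`); the sample data and the `(node-level)` label are made up for the example, and this is not the Agent's actual implementation.

```python
from collections import Counter, defaultdict


def summarize_service_health(checks):
    """Count check statuses per service name.

    Checks with an empty ServiceName are node-level checks; they are
    grouped under a hypothetical "(node-level)" label.
    """
    summary = defaultdict(Counter)
    for check in checks:
        service = check.get("ServiceName") or "(node-level)"
        summary[service][check["Status"]] += 1
    return {service: dict(counts) for service, counts in summary.items()}


# Illustrative sample data, not real cluster output.
sample_checks = [
    {"Node": "node-a", "ServiceName": "web", "Status": "passing"},
    {"Node": "node-b", "ServiceName": "web", "Status": "critical"},
    {"Node": "node-a", "ServiceName": "", "Status": "passing"},
]

print(summarize_service_health(sample_checks))
```

Grouping by service first and by status second mirrors the way the bullet above is phrased: one service, several status counts.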

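The Consul Agent can provide further metrics through DogStatsD. These metrics relate to the internal health of Consul itself rather than to the services that depend on Consul. They include:

  • Serf events and member flaps
  • The Raft protocol
  • DNS performance

And many more.

In addition to metrics, the Datadog Agent sends a service check for each Consul health check and an event for each new leader election.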

Setup

Installation

The Datadog Agent's Consul check is included in the Datadog Agent package. No additional installation is needed on your Consul nodes.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection
  1. To start collecting your Consul metrics, edit the consul.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample consul.d/conf.yaml for all available configuration options.

    init_config:
    
    instances:
      ## @param url - string - required
      ## Where your Consul HTTP server lives,
      ## point the URL at the leader to get metrics about your Consul cluster.
      ## Use HTTPS instead of HTTP if your Consul setup is configured to do so.
      #
      - url: http://localhost:8500
    
  2. Restart the Agent.

OpenMetrics

Optionally, you can enable the use_prometheus_endpoint configuration option to get an additional set of metrics from the Consul Prometheus endpoint.

Note: Use either the DogStatsD method or the Prometheus method. Do not enable both for the same instance.

  1. Configure Consul to expose metrics on the Prometheus endpoint. Set prometheus_retention_time nested under the top-level telemetry key of the main Consul configuration file:

    {
      ...
      "telemetry": {
        "prometheus_retention_time": "360h"
      },
      ...
    }
    
  2. To start using the Prometheus endpoint, edit the consul.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory.

    instances:
        - url: <EXAMPLE>
          use_prometheus_endpoint: true
    
  3. Restart the Agent.
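
To get a feel for what a Prometheus endpoint exposes, the sketch below parses a small payload in the Prometheus text exposition format. Consul serves this format from its agent metrics HTTP endpoint (`/v1/agent/metrics?format=prometheus`) once prometheus_retention_time is set. The metric names in the sample are made up, and the parser is a simplification: it assumes no timestamps and no spaces inside label values.

```python
def parse_prometheus_text(text):
    """Parse 'series value' lines into a dict, skipping comments and blanks."""
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical payload for illustration only.
SAMPLE_PAYLOAD = """\
# TYPE example_counter_total counter
example_counter_total 42
example_latency{quantile="0.5"} 1.5
"""

print(parse_prometheus_text(SAMPLE_PAYLOAD))
```

In practice the Agent does this scraping for you when use_prometheus_endpoint is true; the sketch is only useful for inspecting the endpoint by hand.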

DogStatsD

Instead of using the Prometheus endpoint, you can configure Consul to send the same set of additional metrics to the Agent through DogStatsD.

  1. Configure Consul to send DogStatsD metrics by adding dogstatsd_addr nested under the top-level telemetry key in the main Consul configuration file:

    {
      ...
      "telemetry": {
        "dogstatsd_addr": "127.0.0.1:8125"
      },
      ...
    }
    
  2. Update the Datadog Agent main configuration file, datadog.yaml, by adding the following configuration so that metrics are tagged correctly:

    # dogstatsd_mapper_cache_size: 1000  # default to 1000
    dogstatsd_mapper_profiles:
      - name: consul
        prefix: "consul."
        mappings:
          - match: 'consul\.http\.([a-zA-Z]+)\.(.*)'
            match_type: "regex"
            name: "consul.http.request"
            tags:
              method: "$1"
              path: "$2"
          - match: 'consul\.raft\.replication\.appendEntries\.logs\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.appendEntries.logs"
            tags:
              peer_id: "$1"
          - match: 'consul\.raft\.replication\.appendEntries\.rpc\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.appendEntries.rpc"
            tags:
              peer_id: "$1"
          - match: 'consul\.raft\.replication\.heartbeat\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.heartbeat"
            tags:
              peer_id: "$1"
    
  3. Restart the Agent.
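
The effect of the mapper profile can be approximated in a few lines. The sketch below reuses the four regular expressions from the dogstatsd_mapper_profiles block above and shows how a raw metric name would be rewritten into a fixed name plus tags. It assumes the whole metric name must match the pattern (hence `re.fullmatch`); the real mapper inside the Agent may differ in details, and the sample metric names are illustrative.

```python
import re

# (pattern, mapped name, tag templates), copied from the mapper profile.
MAPPINGS = [
    (r"consul\.http\.([a-zA-Z]+)\.(.*)",
     "consul.http.request", {"method": "$1", "path": "$2"}),
    (r"consul\.raft\.replication\.appendEntries\.logs\.([0-9a-f-]+)",
     "consul.raft.replication.appendEntries.logs", {"peer_id": "$1"}),
    (r"consul\.raft\.replication\.appendEntries\.rpc\.([0-9a-f-]+)",
     "consul.raft.replication.appendEntries.rpc", {"peer_id": "$1"}),
    (r"consul\.raft\.replication\.heartbeat\.([0-9a-f-]+)",
     "consul.raft.replication.heartbeat", {"peer_id": "$1"}),
]


def map_metric(name):
    """Return (mapped_name, tags); unmatched names pass through unchanged."""
    for pattern, mapped_name, tag_templates in MAPPINGS:
        match = re.fullmatch(pattern, name)
        if match is None:
            continue
        tags = {}
        for key, template in tag_templates.items():
            # Substitute "$N" with the text of capture group N.
            tags[key] = re.sub(
                r"\$(\d+)", lambda m: match.group(int(m.group(1))), template
            )
        return mapped_name, tags
    return name, {}


print(map_metric("consul.http.GET.v1.kv._"))
print(map_metric("consul.raft.replication.heartbeat.ab12-cd34"))
```

Without the mapper, every HTTP path and every Raft peer ID would produce a separate metric name; folding them into tags keeps the metric namespace small.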

Log collection

Available for Agent versions 6.0 and later

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Edit this configuration block in your consul.d/conf.yaml file to collect Consul logs:

    logs:
      - type: file
        path: /var/log/consul_server.log
        source: consul
        service: myservice
    

    Change the path and service parameter values to match your environment. See the sample consul.d/conf.yaml for all available configuration options.

  3. Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates guide and apply the following parameters:

Metric collection

Parameter             Value
<INTEGRATION_NAME>    consul
<INIT_CONFIG>         blank or {}
<INSTANCE_CONFIG>     {"url": "https://%%host%%:8500"}
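
On Kubernetes, these three parameters are commonly supplied as pod annotations. The sketch below builds the annotation keys and JSON values for a container; the container name `consul` passed in the example is an assumption, and the helper function itself is hypothetical. Check the Autodiscovery guide for the annotation format that matches your Agent version.

```python
import json


def consul_autodiscovery_annotations(container_name):
    """Build Autodiscovery pod annotations for the Consul check."""
    prefix = f"ad.datadoghq.com/{container_name}"
    return {
        # Integration name
        f"{prefix}.check_names": json.dumps(["consul"]),
        # Init config: blank or {}
        f"{prefix}.init_configs": json.dumps([{}]),
        # Instance config; %%host%% is resolved by the Agent at runtime
        f"{prefix}.instances": json.dumps([{"url": "https://%%host%%:8500"}]),
    }


# "consul" is an assumed container name for illustration.
for key, value in consul_autodiscovery_annotations("consul").items():
    print(f"{key}: '{value}'")
```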

Log collection

Available for Agent versions 6.0 and later

Log collection is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

Parameter       Value
<LOG_CONFIG>    {"source": "consul", "service": "<SERVICE_NAME>"}

Validation

Run the Agent's status subcommand and look for consul under the Checks section.

Note: If debug logging is enabled on your Consul nodes, the Datadog Agent's regular polling appears in the Consul log:

2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/status/leader (59.344us) from=127.0.0.1:53768
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/status/peers (62.678us) from=127.0.0.1:53770
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/health/state/any (106.725us) from=127.0.0.1:53772
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/catalog/services (79.657us) from=127.0.0.1:53774
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/health/service/consul (153.917us) from=127.0.0.1:53776
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/coordinate/datacenters (71.778us) from=127.0.0.1:53778
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/coordinate/nodes (84.95us) from=127.0.0.1:53780
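
If you want to confirm programmatically which endpoints the Agent is polling, the debug lines above can be parsed with a regular expression. This is a minimal sketch that assumes the exact line layout shown above, with durations reported in microseconds (`us`); lines in any other layout are skipped.

```python
import re

LOG_PATTERN = re.compile(
    r"\[DEBUG\] http: Request (?P<method>[A-Z]+) (?P<path>\S+) "
    r"\((?P<duration>[\d.]+)us\) from=(?P<client>\S+)"
)


def parse_request_line(line):
    """Extract method, path, duration, and client from one debug line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "method": match.group("method"),
        "path": match.group("path"),
        "duration_us": float(match.group("duration")),
        "client": match.group("client"),
    }


sample_line = (
    "2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/status/leader "
    "(59.344us) from=127.0.0.1:53768"
)
print(parse_request_line(sample_line))
```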

Consul Agent to DogStatsD

Use netstat to verify that Consul is also sending its metrics:

$ sudo netstat -nup | grep "127.0.0.1:8125.*ESTABLISHED"
udp        0      0 127.0.0.1:53874         127.0.0.1:8125          ESTABLISHED 23176/consul

Data Collected

Metrics

consul.catalog.nodes_critical
(gauge)
[Integration] The number of nodes with service status critical from those registered
Shown as node
consul.catalog.nodes_passing
(gauge)
[Integration] The number of nodes with service status passing from those registered
Shown as node
consul.catalog.nodes_up
(gauge)
[Integration] The number of nodes
Shown as node
consul.catalog.nodes_warning
(gauge)
[Integration] The number of nodes with service status warning from those registered
Shown as node
consul.catalog.total_nodes
(gauge)
[Integration] The number of nodes registered in the consul cluster
Shown as node
consul.catalog.services_critical
(gauge)
[Integration] Total critical services on nodes
Shown as service
consul.catalog.services_passing
(gauge)
[Integration] Total passing services on nodes
Shown as service
consul.catalog.services_up
(gauge)
[Integration] Total services registered on nodes
Shown as service
consul.catalog.services_warning
(gauge)
[Integration] Total warning services on nodes
Shown as service
consul.catalog.services_count
(gauge)
[Integration] Metrics to count the number of services matching criteria like the service tag, node name, or status. To be queried using the sum by aggregator.
Shown as service
consul.net.node.latency.min
(gauge)
[Integration] Minimum latency from this node to all others
Shown as millisecond
consul.net.node.latency.p25
(gauge)
[Integration] P25 latency from this node to all others
Shown as millisecond
consul.net.node.latency.median
(gauge)
[Integration] Median latency from this node to all others
Shown as millisecond
consul.net.node.latency.p75
(gauge)
[Integration] P75 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p90
(gauge)
[Integration] P90 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p95
(gauge)
[Integration] P95 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p99
(gauge)
[Integration] P99 latency from this node to all others
Shown as millisecond
consul.net.node.latency.max
(gauge)
[Integration] Maximum latency from this node to all others
Shown as millisecond
consul.peers
(gauge)
[Integration] The number of peers in the peer set
consul.client.rpc
(count)
[DogStatsD] [Prometheus] This increments whenever a Consul agent in client mode makes an RPC request to a Consul server. This gives a measure of how much a given agent is loading the Consul servers. This is only generated by agents in client mode, not Consul servers.
Shown as request
consul.client.rpc.failed
(count)
[DogStatsD] [Prometheus] Increments whenever a Consul agent in client mode makes an RPC request to a Consul server and fails
Shown as request
consul.http.request
(gauge)
[DogStatsD] Tracks how long it takes to service the given HTTP request for the given verb and path. Using a DogStatsD mapper as described in the README, the paths are mapped to tags and do not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: http_method:GET, path:v1.kv._
Shown as millisecond
consul.http.request.count
(count)
[Prometheus] A count of how long it takes to service the given HTTP request for the given verb and path. Includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.http.request.quantile
(gauge)
[Prometheus] A quantile of how long it takes to service the given HTTP request for the given verb and path. Includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.http.request.sum
(count)
[Prometheus] The sum of how long it takes to service the given HTTP request for the given verb and path. Includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.memberlist.degraded.probe
(gauge)
[DogStatsD] [Prometheus] This metric counts the number of times the Consul agent has performed failure detection on another agent at a slower probe rate. The agent uses its own health metric as an indicator to perform this action. If its health score is low, it means that the node is healthy, and vice versa.
consul.memberlist.gossip.95percentile
(gauge)
[DogStatsD] The p95 for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.avg
(gauge)
[DogStatsD] The avg for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.gossip
consul.memberlist.gossip.max
(gauge)
[DogStatsD] The max for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.median
(gauge)
[DogStatsD] The median for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.quantile
(gauge)
[Prometheus] The quantile for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.sum
(count)
[DogStatsD] [Prometheus] The sum of the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.health.score
(gauge)
[DogStatsD] [Prometheus] This metric describes a node's perception of its own health based on how well it is meeting the soft real-time requirements of the protocol. This metric ranges from 0 to 8, where 0 indicates "totally healthy". For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf
consul.memberlist.msg.alive
(count)
[DogStatsD] [Prometheus] This metric counts the number of alive Consul agents that the agent has mapped out so far, based on the message information given by the network layer.
consul.memberlist.msg.dead
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has marked another agent to be a dead node.
Shown as message
consul.memberlist.msg.suspect
(count)
[DogStatsD] [Prometheus] The number of times a Consul agent suspects another as failed while probing during gossip protocol
consul.memberlist.probenode.95percentile
(gauge)
[DogStatsD] The p95 for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.avg
(gauge)
[DogStatsD] The avg for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.probenode
consul.memberlist.probenode.max
(gauge)
[DogStatsD] The max for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.median
(gauge)
[DogStatsD] The median for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.quantile
(gauge)
[Prometheus] The quantile for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.sum
(count)
[DogStatsD] [Prometheus] The sum for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.pushpullnode.95percentile
(gauge)
[DogStatsD] The p95 for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.avg
(gauge)
[DogStatsD] The avg for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.pushpullnode
consul.memberlist.pushpullnode.max
(gauge)
[DogStatsD] The max for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.median
(gauge)
[DogStatsD] The median for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.quantile
(gauge)
[Prometheus] The quantile for the number of Consul agents that have exchanged state with this agent.
consul.memberlist.pushpullnode.sum
(count)
[DogStatsD] [Prometheus] The sum for the number of Consul agents that have exchanged state with this agent.
consul.memberlist.tcp.accept
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has accepted an incoming TCP stream connection.
Shown as connection
consul.memberlist.tcp.connect
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has initiated a push/pull sync with another agent.
Shown as connection
consul.memberlist.tcp.sent
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes sent by a Consul agent through the TCP protocol
Shown as byte
consul.memberlist.udp.received
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes received by a Consul agent through the UDP protocol.
Shown as byte
consul.memberlist.udp.sent
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes sent by a Consul agent through the UDP protocol.
Shown as byte
consul.raft.apply
(count)
[DogStatsD] [Prometheus] The number of raft transactions occurring
Shown as transaction
consul.raft.commitTime.95percentile
(gauge)
[DogStatsD] The p95 time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.avg
(gauge)
[DogStatsD] The average time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.commitTime
consul.raft.commitTime.max
(gauge)
[DogStatsD] The max time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.median
(gauge)
[DogStatsD] The median time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.quantile
(gauge)
[Prometheus] The quantile time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.sum
(count)
[DogStatsD] [Prometheus] The sum of the time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.leader.dispatchLog.95percentile
(gauge)
[DogStatsD] The p95 time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.avg
(gauge)
[DogStatsD] The average time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.leader.dispatchLog
consul.raft.leader.dispatchLog.max
(gauge)
[DogStatsD] The max time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.median
(gauge)
[DogStatsD] The median time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.quantile
(gauge)
[Prometheus] The quantile time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.sum
(count)
[DogStatsD] [Prometheus] The sum of the time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.lastContact.95percentile
(gauge)
[DogStatsD] The p95 time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.avg
(gauge)
[DogStatsD] The average time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.leader.lastContact
consul.raft.leader.lastContact.max
(gauge)
[DogStatsD] The max time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.median
(gauge)
[DogStatsD] The median time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.quantile
(gauge)
[Prometheus] The quantile time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.sum
(count)
[DogStatsD] [Prometheus] The sum of the time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.replication.appendEntries.logs
(count)
[DogStatsD] [Prometheus] Measures the number of logs replicated to an agent, to bring it up to speed with the leader's logs.
Shown as entry
consul.raft.replication.appendEntries.rpc.count
(count)
[DogStatsD] [Prometheus] The count of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.appendEntries.rpc.quantile
(gauge)
[Prometheus] The quantile of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.appendEntries.rpc.sum
(count)
[DogStatsD] [Prometheus] The sum of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.heartbeat.count
(count)
[DogStatsD] [Prometheus] The count of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.replication.heartbeat.quantile
(gauge)
[Prometheus] The quantile of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.replication.heartbeat.sum
(count)
[DogStatsD] [Prometheus] The sum of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.state.candidate
(count)
[DogStatsD] [Prometheus] The number of initiated leader elections
Shown as event
consul.raft.state.leader
(count)
[DogStatsD] [Prometheus] The number of completed leader elections
Shown as event
consul.runtime.gc_pause_ns.95percentile
(gauge)
[DogStatsD] The p95 for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.avg
(gauge)
[DogStatsD] The avg for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.runtime.gc_pause_ns
consul.runtime.gc_pause_ns.max
(gauge)
[DogStatsD] The max for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.median
(gauge)
[DogStatsD] The median for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.quantile
(gauge)
[Prometheus] The quantile of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.sum
(count)
[DogStatsD] [Prometheus] The sum of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.serf.coordinate.adjustment_ms.95percentile
(gauge)
[DogStatsD] The p95 in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.avg
(gauge)
[DogStatsD] The avg in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.coordinate.adjustment_ms
consul.serf.coordinate.adjustment_ms.max
(gauge)
[DogStatsD] The max in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.median
(gauge)
[DogStatsD] The median in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.quantile
(gauge)
[Prometheus] The quantile in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.sum
(count)
[DogStatsD] [Prometheus] The sum in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.events
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent processes a serf event
Shown as event
consul.serf.member.failed
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent is marked dead. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports.
consul.serf.member.flap
(count)
[DogStatsD] [Prometheus] The number of times a Consul agent is marked dead and then quickly recovers
consul.serf.member.join
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent processes a join event
Shown as event
consul.serf.member.left
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent leaves the cluster.
consul.serf.member.update
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent updates.
consul.serf.msgs.received.95percentile
(gauge)
[DogStatsD] The p95 for the number of serf messages received
Shown as message
consul.serf.msgs.received.avg
(gauge)
[DogStatsD] The avg for the number of serf messages received
Shown as message
consul.serf.msgs.received.count
(count)
[DogStatsD] [Prometheus] The count of serf messages received
consul.serf.msgs.received.max
(gauge)
[DogStatsD] The max for the number of serf messages received
Shown as message
consul.serf.msgs.received.median
(gauge)
[DogStatsD] The median for the number of serf messages received
Shown as message
consul.serf.msgs.received.quantile
(gauge)
[Prometheus] The quantile for the number of serf messages received
Shown as message
consul.serf.msgs.received.sum
(count)
[DogStatsD] [Prometheus] The sum for the number of serf messages received
Shown as message
consul.serf.msgs.sent.95percentile
(gauge)
[DogStatsD] The p95 for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.avg
(gauge)
[DogStatsD] The avg for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.count
(count)
[DogStatsD] [Prometheus] The count of serf messages sent
consul.serf.msgs.sent.max
(gauge)
[DogStatsD] The max for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.median
(gauge)
[DogStatsD] The median for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.quantile
(gauge)
[Prometheus] The quantile for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.sum
(count)
[DogStatsD] [Prometheus] The sum of the number of serf messages sent
Shown as message
consul.serf.queue.event.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf event queue
consul.serf.queue.event.avg
(gauge)
[DogStatsD] The avg size of the serf event queue
consul.serf.queue.event.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf event queue
consul.serf.queue.event.max
(gauge)
[DogStatsD] The max size of the serf event queue
consul.serf.queue.event.median
(gauge)
[DogStatsD] The median size of the serf event queue
consul.serf.queue.event.quantile
(gauge)
[Prometheus] The quantile for the size of the serf event queue
consul.serf.queue.intent.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf intent queue
consul.serf.queue.intent.avg
(gauge)
[DogStatsD] The avg size of the serf intent queue
consul.serf.queue.intent.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf intent queue
consul.serf.queue.intent.max
(gauge)
[DogStatsD] The max size of the serf intent queue
consul.serf.queue.intent.median
(gauge)
[DogStatsD] The median size of the serf intent queue
consul.serf.queue.intent.quantile
(gauge)
[Prometheus] The quantile for the size of the serf intent queue
consul.serf.queue.query.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf query queue
consul.serf.queue.query.avg
(gauge)
[DogStatsD] The avg size of the serf query queue
consul.serf.queue.query.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf query queue
consul.serf.queue.query.max
(gauge)
[DogStatsD] The max size of the serf query queue
consul.serf.queue.query.median
(gauge)
[DogStatsD] The median size of the serf query queue
consul.serf.queue.query.quantile
(gauge)
[Prometheus] The quantile for the size of the serf query queue
consul.serf.snapshot.appendline.95percentile
(gauge)
[DogStatsD] The p95 of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.avg
(gauge)
[DogStatsD] The avg of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.snapshot.appendline
consul.serf.snapshot.appendline.max
(gauge)
[DogStatsD] The max of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.median
(gauge)
[DogStatsD] The median of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.quantile
(gauge)
[Prometheus] The quantile of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.compact.95percentile
(gauge)
[DogStatsD] The p95 of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.avg
(gauge)
[DogStatsD] The avg of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.snapshot.compact
consul.serf.snapshot.compact.max
(gauge)
[DogStatsD] The max of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.median
(gauge)
[DogStatsD] The median of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.quantile
(gauge)
[Prometheus] The quantile of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond

See Consul's Telemetry documentation for details on the metrics the Consul Agent sends to DogStatsD.

See Consul's Network Coordinates documentation for details on how the network latency metrics are calculated.
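
As a rough guide to that calculation, the sketch below estimates the round-trip time between two nodes from their network coordinates, following the distance formula described in Consul's Network Coordinates documentation: Euclidean distance between the vectors plus both heights, with the adjustment terms applied only when the result stays positive. The coordinate values are invented for the example, and this is not a description of how the Datadog check aggregates the results into percentiles.

```python
import math


def estimate_rtt_seconds(a, b):
    """Estimate RTT in seconds between two Consul network coordinates.

    Each coordinate is a dict with "Vec", "Height", and "Adjustment",
    matching the field names in Consul's coordinate API.
    """
    if len(a["Vec"]) != len(b["Vec"]):
        raise ValueError("coordinate dimensions do not match")
    euclidean = math.sqrt(
        sum((x - y) ** 2 for x, y in zip(a["Vec"], b["Vec"]))
    )
    rtt = euclidean + a["Height"] + b["Height"]
    # Apply the adjustment terms, guarding against a non-positive result.
    adjusted = rtt + a["Adjustment"] + b["Adjustment"]
    return adjusted if adjusted > 0.0 else rtt


# Invented coordinates for illustration.
node_a = {"Vec": [0.003, 0.004], "Height": 0.001, "Adjustment": 0.0}
node_b = {"Vec": [0.0, 0.0], "Height": 0.001, "Adjustment": 0.0}

print(f"{estimate_rtt_seconds(node_a, node_b) * 1000:.1f} ms")
```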

Events

consul.new_leader:
The Datadog Agent emits an event when the Consul cluster elects a new leader, tagging it with prev_consul_leader, curr_consul_leader, and consul_datacenter.

Service Checks

consul.check
Returns OK if the service is up, WARNING if there is an issue and CRITICAL when down.
Statuses: ok, warning, critical, unknown

consul.up
Returns OK if the consul server is up, CRITICAL otherwise.
Statuses: ok, critical

consul.can_connect
Returns OK if the Agent can make HTTP requests to consul, CRITICAL otherwise.
Statuses: ok, critical

consul.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint, otherwise returns OK.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles: