FoundationDB

Supported OS: Linux, macOS, Windows

Integration version: 2.0.0

Overview

This check monitors FoundationDB through the Datadog Agent. Aside from checking that the FoundationDB cluster is healthy, it also collects numerous metrics and, optionally, FoundationDB transaction logs.

Setup

Both the check and its metrics apply to the FoundationDB cluster as a whole, so install the check on only one host. That host does not need to run FoundationDB; it only needs access to the cluster.

Installation

The FoundationDB check is included in the Datadog Agent package, but requires the FoundationDB client to be installed.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection

  1. To start collecting your FoundationDB metrics, edit the foundationdb.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample foundationdb.d/conf.yaml for all available configuration options.

  2. The cluster to check is determined by searching for a cluster file in the default location. If the cluster file is located elsewhere, set the cluster_file property. Only one cluster can be monitored per check instance.

  3. If the cluster is configured to use TLS, set the additional TLS properties in the configuration. Their names match the TLS-related options passed to fdbcli when connecting to such a cluster.

  4. Restart the Agent.
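
As a rough illustration of steps 2 and 3, an instance in foundationdb.d/conf.yaml might look like the sketch below. All file paths and the peer-verification value are placeholders. The TLS key names are an assumption derived from the corresponding fdbcli options (--tls_certificate_file, --tls_key_file, --tls_verify_peers); confirm the exact keys against the sample foundationdb.d/conf.yaml shipped with your Agent version.

```yaml
init_config:

instances:
    # Only needed when the cluster file is not in the default location.
    # One cluster per check instance.
  - cluster_file: /path/to/fdb.cluster

    # Assumed key names, mirroring the fdbcli TLS options.
    # Omit these entirely if the cluster does not use TLS.
    tls_certificate_file: /path/to/cert.pem
    tls_key_file: /path/to/key.pem
    tls_verify_peers: "Check.Valid=1"
```

Restart the Agent after any change to this file so the new settings are picked up.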

Log collection

FoundationDB writes XML logs by default, but Datadog integrations expect JSON logs, so FoundationDB must be reconfigured to write its logs as JSON.

  1. Locate your foundationdb.conf file. Under the fdbserver section, add or change the key trace_format to have the value json. Also, make note of the logdir.

    [fdbserver]
    ...
    logdir = /var/log/foundationdb
    trace_format = json
    
  2. Restart the FoundationDB server so the changes take effect. Verify that logs in the logdir are written in JSON.

  3. Ensure that log collection is enabled in your datadog.yaml file:

    logs_enabled: true
    
  4. In the foundationdb.d/conf.yaml file, uncomment the logs section and set path to the logdir from your FoundationDB configuration file, followed by /*.json.

    logs:
      - type: file
        path: /var/log/foundationdb/*.json
        service: foundationdb
        source: foundationdb
    
  5. Make sure the Datadog Agent has the privileges required to list the directory and read its files.

  6. Restart the Datadog Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Metric collection

Parameter            Value
<INTEGRATION_NAME>   foundationdb
<INIT_CONFIG>        blank or {}
<INSTANCE_CONFIG>    {}

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes log collection.

Parameter      Value
<LOG_CONFIG>   {"source": "foundationdb", "service": "<SERVICE_NAME>"}
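
On Kubernetes, one way to apply the parameters above is through Autodiscovery pod annotations. The manifest below is a minimal sketch, not a complete deployment: the pod name is hypothetical, and <CONTAINER_IDENTIFIER>, <IMAGE>, and <SERVICE_NAME> are placeholders to replace with your own values. It assumes the annotation-based Autodiscovery format; see the Autodiscovery Integration Templates for the format that matches your Agent version.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foundationdb-monitor   # hypothetical name, for illustration only
  annotations:
    # <INTEGRATION_NAME>
    ad.datadoghq.com/<CONTAINER_IDENTIFIER>.check_names: '["foundationdb"]'
    # <INIT_CONFIG>
    ad.datadoghq.com/<CONTAINER_IDENTIFIER>.init_configs: '[{}]'
    # <INSTANCE_CONFIG>
    ad.datadoghq.com/<CONTAINER_IDENTIFIER>.instances: '[{}]'
    # <LOG_CONFIG>
    ad.datadoghq.com/<CONTAINER_IDENTIFIER>.logs: '[{"source": "foundationdb", "service": "<SERVICE_NAME>"}]'
spec:
  containers:
    - name: <CONTAINER_IDENTIFIER>
      image: <IMAGE>
```

Each annotation value is a JSON list encoded as a string, so the single quotes around it are required.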

Validation

Run the Agent’s status subcommand and look for foundationdb under the Checks section.

Data Collected

Metrics

foundationdb.data.least_operating_space_bytes_log_server
(gauge)

Shown as byte
foundationdb.data.moving_data.in_flight_bytes
(gauge)

Shown as byte
foundationdb.data.moving_data.in_queue_bytes
(gauge)

Shown as byte
foundationdb.data.moving_data.total_written_bytes
(gauge)

Shown as byte
foundationdb.data.system_kv_size_bytes
(gauge)

Shown as byte
foundationdb.data.total_disk_used_bytes
(gauge)

Shown as byte
foundationdb.data.total_kv_size_bytes
(gauge)

Shown as byte
foundationdb.datacenter_lag.seconds
(gauge)

Shown as second
foundationdb.degraded_processes
(gauge)

Shown as process
foundationdb.instances
(count)

Shown as instance
foundationdb.latency_probe.batch_priority_transaction_start_seconds
(gauge)
Batch priority transaction start seconds
Shown as second
foundationdb.latency_probe.commit_seconds
(gauge)

Shown as second
foundationdb.latency_probe.immediate_priority_transaction_start_seconds
(gauge)

Shown as second
foundationdb.latency_probe.read_seconds
(gauge)

Shown as second
foundationdb.latency_probe.transaction_start_seconds
(gauge)

Shown as second
foundationdb.machines
(gauge)

Shown as host
foundationdb.process.cpu.usage_cores
(gauge)

Shown as core
foundationdb.process.disk.free_bytes
(gauge)

Shown as byte
foundationdb.process.disk.reads.hz
(gauge)

Shown as read
foundationdb.process.disk.total_bytes
(gauge)

Shown as byte
foundationdb.process.disk.writes.hz
(gauge)

Shown as write
foundationdb.process.memory.available_bytes
(gauge)

Shown as byte
foundationdb.process.memory.limit_bytes
(gauge)

Shown as byte
foundationdb.process.memory.unused_allocated_memory
(gauge)

Shown as byte
foundationdb.process.memory.used_bytes
(gauge)

Shown as byte
foundationdb.process.network.connection_errors.hz
(gauge)

Shown as error
foundationdb.process.network.connections_closed.hz
(gauge)

Shown as connection
foundationdb.process.network.connections_established.hz
(gauge)

Shown as connection
foundationdb.process.network.current_connections
(gauge)

Shown as connection
foundationdb.process.network.megabits_received.hz
(gauge)
foundationdb.process.network.megabits_sent.hz
(gauge)
foundationdb.process.network.tls_policy_failures.hz
(gauge)

Shown as error
foundationdb.process.role.bytes_queried.counter
(count)

Shown as query
foundationdb.process.role.bytes_queried.hz
(gauge)

Shown as query
foundationdb.process.role.commit_latency_statistics.count
(count)

Shown as millisecond
foundationdb.process.role.commit_latency_statistics.max
(gauge)

Shown as millisecond
foundationdb.process.role.commit_latency_statistics.min
(gauge)

Shown as millisecond
foundationdb.process.role.commit_latency_statistics.p25
(gauge)

Shown as millisecond
foundationdb.process.role.commit_latency_statistics.p90
(gauge)

Shown as millisecond
foundationdb.process.role.commit_latency_statistics.p99
(gauge)

Shown as millisecond
foundationdb.process.role.data_lag.seconds
(gauge)

Shown as second
foundationdb.process.role.durability_lag.seconds
(gauge)

Shown as second
foundationdb.process.role.durable_bytes.counter
(count)

Shown as byte
foundationdb.process.role.durable_bytes.hz
(gauge)

Shown as byte
foundationdb.process.role.finished_queries.counter
(count)

Shown as query
foundationdb.process.role.finished_queries.hz
(gauge)

Shown as query
foundationdb.process.role.grv_latency_statistics.default.count
(count)

Shown as millisecond
foundationdb.process.role.grv_latency_statistics.default.max
(gauge)

Shown as millisecond
foundationdb.process.role.grv_latency_statistics.default.min
(gauge)

Shown as millisecond
foundationdb.process.role.grv_latency_statistics.default.p25
(gauge)

Shown as millisecond
foundationdb.process.role.grv_latency_statistics.default.p90
(gauge)

Shown as millisecond
foundationdb.process.role.grv_latency_statistics.default.p99
(gauge)

Shown as millisecond
foundationdb.process.role.input_bytes.counter
(count)

Shown as byte
foundationdb.process.role.input_bytes.hz
(gauge)

Shown as byte
foundationdb.process.role.keys_queried.counter
(count)

Shown as key
foundationdb.process.role.keys_queried.hz
(gauge)

Shown as key
foundationdb.process.role.kvstore_available_bytes
(gauge)

Shown as byte
foundationdb.process.role.kvstore_free_bytes
(gauge)

Shown as byte
foundationdb.process.role.kvstore_inline_keys
(gauge)

Shown as key
foundationdb.process.role.kvstore_total_bytes
(gauge)

Shown as byte
foundationdb.process.role.kvstore_total_nodes
(gauge)

Shown as byte
foundationdb.process.role.kvstore_total_size
(gauge)

Shown as byte
foundationdb.process.role.kvstore_used_bytes
(gauge)

Shown as byte
foundationdb.process.role.local_rate
(gauge)

Shown as unit
foundationdb.process.role.low_priority_queries.counter
(count)

Shown as query
foundationdb.process.role.low_priority_queries.hz
(gauge)

Shown as query
foundationdb.process.role.mutation_bytes.counter
(count)

Shown as byte
foundationdb.process.role.mutation_bytes.hz
(gauge)

Shown as byte
foundationdb.process.role.mutations.counter
(count)

Shown as item
foundationdb.process.role.mutations.hz
(gauge)

Shown as item
foundationdb.process.role.query_queue_max
(gauge)

Shown as query
foundationdb.process.role.queue_length
(gauge)

Shown as item
foundationdb.process.role.read_latency_statistics.count
(count)

Shown as millisecond
foundationdb.process.role.read_latency_statistics.max
(gauge)

Shown as millisecond
foundationdb.process.role.read_latency_statistics.min
(gauge)

Shown as millisecond
foundationdb.process.role.read_latency_statistics.p25
(gauge)

Shown as millisecond
foundationdb.process.role.read_latency_statistics.p90
(gauge)

Shown as millisecond
foundationdb.process.role.read_latency_statistics.p99
(gauge)

Shown as millisecond
foundationdb.process.role.stored_bytes
(gauge)

Shown as byte
foundationdb.process.role.total_queries.counter
(count)

Shown as query
foundationdb.process.role.total_queries.hz
(gauge)

Shown as query
foundationdb.processes
(gauge)

Shown as process
foundationdb.processes_per_role.cluster_controller
(gauge)

Shown as process
foundationdb.processes_per_role.coordinator
(gauge)

Shown as process
foundationdb.processes_per_role.data_distributor
(gauge)

Shown as process
foundationdb.processes_per_role.log
(gauge)

Shown as process
foundationdb.processes_per_role.master
(gauge)

Shown as process
foundationdb.processes_per_role.proxy
(gauge)

Shown as process
foundationdb.processes_per_role.ratekeeper
(gauge)

Shown as process
foundationdb.processes_per_role.resolver
(gauge)

Shown as process
foundationdb.processes_per_role.storage
(gauge)

Shown as process
foundationdb.workload.operations.location_requests.counter
(count)

Shown as operation
foundationdb.workload.operations.location_requests.hz
(gauge)

Shown as operation
foundationdb.workload.operations.low_priority_reads.counter
(count)

Shown as operation
foundationdb.workload.operations.low_priority_reads.hz
(gauge)

Shown as operation
foundationdb.workload.operations.memory_errors.counter
(count)

Shown as operation
foundationdb.workload.operations.memory_errors.hz
(gauge)

Shown as operation
foundationdb.workload.operations.read_requests.counter
(count)

Shown as operation
foundationdb.workload.operations.read_requests.hz
(gauge)

Shown as operation
foundationdb.workload.operations.reads.counter
(count)

Shown as operation
foundationdb.workload.operations.reads.hz
(gauge)

Shown as operation
foundationdb.workload.operations.writes.counter
(count)

Shown as operation
foundationdb.workload.operations.writes.hz
(gauge)

Shown as operation
foundationdb.workload.transactions.committed.counter
(count)

Shown as transaction
foundationdb.workload.transactions.committed.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.conflicted.counter
(count)

Shown as transaction
foundationdb.workload.transactions.conflicted.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.rejected_for_queued_too_long.counter
(count)

Shown as transaction
foundationdb.workload.transactions.rejected_for_queued_too_long.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.started.counter
(count)

Shown as transaction
foundationdb.workload.transactions.started.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.started_batch_priority.counter
(count)

Shown as transaction
foundationdb.workload.transactions.started_batch_priority.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.started_default_priority.counter
(count)

Shown as transaction
foundationdb.workload.transactions.started_default_priority.hz
(gauge)

Shown as transaction
foundationdb.workload.transactions.started_immediate_priority.counter
(count)

Shown as transaction
foundationdb.workload.transactions.started_immediate_priority.hz
(gauge)

Shown as transaction

Events

The FoundationDB check does not include any events.

Troubleshooting

Need help? Contact Datadog support.