VoltDB

Supported OS: Linux, macOS, Windows

Integration version: 3.2.1

Overview

This check monitors VoltDB through the Datadog Agent.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Note: This check should only be configured on one Agent per cluster. If you are monitoring a cluster spread across several hosts, install an Agent on each host. However, do not enable the VoltDB integration on more than one host, as this results in duplicate metrics.

Installation

The VoltDB check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. Add a datadog-agent user by editing your VoltDB deployment.xml file. Note: No specific roles are required, so assign the built-in user role.

    <users>
        <!-- ... -->
        <user name="datadog-agent" password="<PASSWORD>" roles="user" />
    </users>
    
  2. Edit the voltdb.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your VoltDB performance data. See the sample voltdb.d/conf.yaml for all available configuration options.

    init_config:
    
    instances:
      - url: http://localhost:8080
        username: datadog-agent
        password: "<PASSWORD>"
    
  3. Restart the Agent.
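
To confirm, independently of the Agent, that the datadog-agent user can query VoltDB over HTTP, a request URL can be built by hand. The following is a minimal Python sketch, not the check's actual implementation. It assumes the VoltDB JSON interface is served at /api/1.0/ on the client HTTP port and accepts Procedure, Parameters, User, and Password query parameters; verify this against the documentation for your VoltDB version. The helper name build_statistics_url is hypothetical.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs


def build_statistics_url(base_url, component, username, password):
    """Build a JSON API URL that calls @Statistics for one component.

    Assumption: the JSON interface lives at /api/1.0/ and uses these
    query parameter names. Check your VoltDB documentation.
    """
    query = urlencode({
        "Procedure": "@Statistics",
        # Parameters are passed as a JSON array: [component, delta flag]
        "Parameters": json.dumps([component, 0]),
        "User": username,
        "Password": password,
    })
    return f"{base_url.rstrip('/')}/api/1.0/?{query}"


url = build_statistics_url(
    "http://localhost:8080", "MEMORY", "datadog-agent", "<PASSWORD>"
)
print(url)

# To send the request (requires a running VoltDB instance):
# from urllib.request import urlopen
# with urlopen(url, timeout=5) as resp:
#     print(json.load(resp))
```

Sending the password in a query string is only suitable for a quick local test. An authentication error in the response suggests that the user entry in deployment.xml and the credentials in conf.yaml do not match.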

TLS support

If TLS/SSL is enabled on the client HTTP port:

  1. Export your certificate CA file in PEM format:

    keytool -exportcert -file /path/to/voltdb-ca.pem -keystore <KEYSTORE> -storepass <PASSWORD> -alias voltdb -rfc
    
  2. Export your certificate in PEM format:

    openssl pkcs12 -nodes -in <KEYSTORE> -out /path/to/voltdb.pem -password pass:<PASSWORD>
    

    The resulting file should contain the unencrypted private key and the certificate:

    -----BEGIN PRIVATE KEY-----
    <Private key contents...>
    -----END PRIVATE KEY-----
    -----BEGIN CERTIFICATE-----
    <Certificate contents...>
    -----END CERTIFICATE-----
    
  3. In your instance configuration, point url to the TLS-enabled client endpoint, and set the tls_cert and tls_ca_cert options. For example:

    instances:
    - # ...
      url: https://localhost:8443
      tls_cert: /path/to/voltdb.pem
      tls_ca_cert: /path/to/voltdb-ca.pem
    
  4. Restart the Agent.
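
If the Agent reports TLS errors after the restart, one thing to rule out is that the file given as tls_cert lacks either the unencrypted private key or the certificate. The Python sketch below is a heuristic that only inspects PEM block labels; it does not parse or verify the key material, and the function names are hypothetical.

```python
import re


def pem_block_types(pem_text):
    """Return the labels of all PEM blocks in the text, in order."""
    return re.findall(r"-----BEGIN ([A-Z0-9 ]+)-----", pem_text)


def has_key_and_cert(pem_text):
    """True if the text has an unencrypted private key and a certificate.

    Heuristic: a block labelled 'ENCRYPTED PRIVATE KEY' is rejected, but
    legacy keys encrypted through a Proc-Type header are not detected.
    """
    labels = pem_block_types(pem_text)
    has_key = any(
        label.endswith("PRIVATE KEY") and "ENCRYPTED" not in label
        for label in labels
    )
    return has_key and "CERTIFICATE" in labels


# Placeholder contents shaped like the expected voltdb.pem file.
sample = """-----BEGIN PRIVATE KEY-----
<Private key contents...>
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
<Certificate contents...>
-----END CERTIFICATE-----
"""
print(has_key_and_cert(sample))
```

To check a real file, read it with open("/path/to/voltdb.pem").read() and pass the text to has_key_and_cert.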

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Add this configuration block to your voltdb.d/conf.yaml file to start collecting your VoltDB logs:

    logs:
      - type: file
        path: /var/log/voltdb.log
        source: voltdb
    

    Change the path value based on your environment. See the sample voltdb.d/conf.yaml file for all available configuration options.

  3. Restart the Agent.

To enable logs for Kubernetes environments, see Kubernetes Log Collection.

Validation

Run the Agent’s status subcommand and look for voltdb under the Checks section.
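
This validation step can also be scripted. The Python sketch below assumes that the status output lists each running check on its own line, beginning with the check name; the sample text is illustrative, not real Agent output, and the function names are hypothetical. Depending on the platform, the status subcommand may need elevated privileges.

```python
import subprocess


def check_listed(status_text, check_name="voltdb"):
    """True if any line of the status output starts with the check name."""
    return any(
        line.strip().startswith(check_name)
        for line in status_text.splitlines()
    )


def agent_status(command=("datadog-agent", "status")):
    """Run the Agent status subcommand and return its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Illustrative sample only; real output contains many more lines.
sample_status = """Running Checks
  voltdb (3.2.1)
    Total Runs: 12
"""
print(check_listed(sample_status))

# On a host with the Agent installed:
# print(check_listed(agent_status()))
```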

Data Collected

Metrics

voltdb.commandlog.fsync_interval
(gauge)
Average interval between the last 10 fsync system calls.
Shown as millisecond
voltdb.commandlog.in_use_segment_count
(gauge)
The number of segment files currently in use for command logging.
voltdb.commandlog.outstanding_bytes
(gauge)
Size of pending command log data: data for transactions that have been initiated but the log has yet to be written to the disk. For synchronous logging this value is always zero.
Shown as byte
voltdb.commandlog.outstanding_transactions
(gauge)
The number of transactions with pending command log data: transactions that have been initiated but whose log has yet to be written to disk. For synchronous logging this value is always zero.
voltdb.commandlog.segment_count
(gauge)
The number of segment files allocated (including currently unused segments).
voltdb.cpu.percent_used
(gauge)
Percentage of total CPU available used by the database server process.
Shown as percent
voltdb.export.latency.avg
(gauge)
The average time between when records are inserted and when they are acknowledged by the target.
Shown as millisecond
voltdb.export.latency.max
(gauge)
The maximum time between when a record was inserted and when it was acknowledged by the target.
Shown as millisecond
voltdb.export.queue_gap
(gauge)
The number of records missing from the queue for the current stream and partition.
voltdb.export.records_pending
(gauge)
The number of records out of TUPLE_COUNT still waiting to be written to or acknowledged by the target.
voltdb.export.records_queued
(count)
The number of records queued to the export target.
voltdb.export.time_since_last_acked
(gauge)
Time since the last tuple was acknowledged as received by the target.
Shown as second
voltdb.export.time_since_last_queued
(gauge)
Time since the most recent tuple was added to the export queue for this partition.
Shown as second
voltdb.gc.newgen_avg_gc_time
(gauge)
Average time taken by young generation collections.
Shown as millisecond
voltdb.gc.newgen_gc_count
(count)
Total number of times young generation garbage collection was performed.
voltdb.gc.oldgen_avg_gc_time
(gauge)
Average time taken by old generation collections.
Shown as millisecond
voltdb.gc.oldgen_gc_count
(count)
Total number of times old generation garbage collection was performed.
voltdb.idletime.avg_wait
(gauge)
The average amount of time the execution site had to wait for a new task (in microseconds).
Shown as microsecond
voltdb.idletime.max_wait
(gauge)
The maximum amount of time the execution site had to wait for a new task (in nanoseconds).
Shown as microsecond
voltdb.idletime.min_wait
(gauge)
The minimum amount of time the execution site had to wait for a new task (in nanoseconds).
Shown as microsecond
voltdb.idletime.stddev
(gauge)
The standard deviation of the waiting time (in microseconds).
Shown as microsecond
voltdb.idletime.wait.count
(count)
The number of times the execution site had to wait for a new task (that is, the queue was empty).
voltdb.idletime.wait.pct
(gauge)
The percentage of time the execution site was waiting for a new task (that is, the site was idle).
voltdb.idletime.wait.total
(gauge)
The cumulative number of times the execution site had to wait for a new task (that is, the queue was empty).
voltdb.import.failures.count
(count)
The number of import transactions that failed.
voltdb.import.failures.total
(gauge)
The cumulative number of import transactions that failed.
voltdb.import.outstanding_requests
(gauge)
The number of records read from the import stream and waiting to be inserted into the database.
Shown as record
voltdb.import.retries.count
(count)
The number of attempts to replay failed transactions.
voltdb.import.retries.total
(gauge)
The cumulative number of attempts to replay failed transactions.
voltdb.import.successes.count
(count)
The number of import transactions that succeeded.
voltdb.import.successes.total
(gauge)
The cumulative number of import transactions that succeeded.
voltdb.index.entry_count
(gauge)
Number of index entries currently in the partition.
voltdb.index.memory_estimate
(gauge)
Estimated amount of memory consumed by the current index entries.
Shown as kibibyte
voltdb.io.bytes_read
(count)
Total number of bytes of data sent from the client to the host.
Shown as byte
voltdb.io.bytes_written
(count)
Total number of bytes of data sent from the host to the client.
Shown as byte
voltdb.io.messages_read
(count)
Total number of individual messages sent from the client to the host.
Shown as byte
voltdb.io.messages_written
(count)
Total number of individual messages sent from the host to the client.
Shown as byte
voltdb.latency.count
(gauge)
Number of transactions during the interval.
voltdb.latency.interval
(gauge)
Length of the measurement interval: five seconds (5000).
Shown as millisecond
voltdb.latency.max
(gauge)
Maximum latency during the interval.
Shown as microsecond
voltdb.latency.p50
(gauge)
50th percentile latency.
Shown as microsecond
voltdb.latency.p95
(gauge)
95th percentile latency.
Shown as microsecond
voltdb.latency.p99
(gauge)
99th percentile latency.
Shown as microsecond
voltdb.latency.p999
(gauge)
99.9th percentile latency.
Shown as microsecond
voltdb.latency.p9999
(gauge)
99.99th percentile latency.
Shown as microsecond
voltdb.latency.p99999
(gauge)
99.999th percentile latency.
Shown as microsecond
voltdb.latency.transactions_per_sec
(gauge)
Number of transactions per second during the interval.
Shown as transaction
voltdb.memory.index
(gauge)
Amount of memory currently in use for storing database indexes.
Shown as kibibyte
voltdb.memory.java.max_heap
(gauge)
Maximum heap size of the Java runtime environment.
Shown as kibibyte
voltdb.memory.java.unused
(gauge)
Amount of memory allocated by Java but unused (free space in the Java heap).
Shown as kibibyte
voltdb.memory.java.used
(gauge)
Amount of memory allocated by Java and currently in use by VoltDB.
Shown as kibibyte
voltdb.memory.physical
(gauge)
Total size of physical memory on the server.
Shown as kibibyte
voltdb.memory.pooled
(gauge)
Total size of memory allocated for tasks other than database records, indexes, and strings.
Shown as kibibyte
voltdb.memory.rss
(gauge)
Total amount of memory allocated to the VoltDB processes on the server.
Shown as kibibyte
voltdb.memory.string
(gauge)
Amount of memory currently in use for storing string, binary and geospatial data that is not stored inline in the database record.
Shown as kibibyte
voltdb.memory.tuple_allocated
(gauge)
Amount of memory allocated for the storage of database records (including free space).
Shown as kibibyte
voltdb.memory.tuple_count
(gauge)
Total number of database records currently in memory.
Shown as kibibyte
voltdb.memory.tuple_data
(gauge)
Amount of memory currently in use for storing database records.
Shown as kibibyte
voltdb.procedure.aborts
(count)
Total number of times the procedure was aborted.
Shown as byte
voltdb.procedure.avg_execution_time
(gauge)
Average amount of time it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedure.avg_parameter_set_size
(gauge)
Average size of the parameters passed as input to the procedure.
Shown as byte
voltdb.procedure.avg_result_size
(gauge)
Average size of the results returned by the procedure.
Shown as byte
voltdb.procedure.failures
(count)
Total number of times the procedure failed unexpectedly.
Shown as byte
voltdb.procedure.invocations
(count)
The number of invocations of this procedure at this site.
voltdb.procedure.max_execution_time
(gauge)
Maximum amount of time it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedure.max_parameter_set_size
(gauge)
Maximum size of the parameters passed as input to the procedure.
Shown as byte
voltdb.procedure.max_result_size
(gauge)
Maximum size of the results returned by the procedure.
Shown as byte
voltdb.procedure.min_execution_time
(gauge)
Minimum amount of time it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedure.min_parameter_set_size
(gauge)
Minimum size of the parameters passed as input to the procedure.
Shown as byte
voltdb.procedure.min_result_size
(gauge)
Minimum size of the results returned by the procedure.
Shown as byte
voltdb.procedure.successes
(count)
Total number of times the procedure succeeded.
Shown as byte
voltdb.procedure.timed_invocations
(count)
Total number of invocations used to measure the minimum, maximum, and average execution time.
voltdb.procedureoutput.avg_result_size
(gauge)
The average result set size in bytes.
Shown as byte
voltdb.procedureoutput.invocations.count
(count)
The number of invocations of this procedure.
voltdb.procedureoutput.invocations.total
(gauge)
The cumulative number of invocations of this procedure.
voltdb.procedureoutput.max_result_size
(gauge)
The maximum result set size in bytes.
Shown as byte
voltdb.procedureoutput.min_result_size
(gauge)
The minimum result set size in bytes.
Shown as byte
voltdb.procedureoutput.total_result_size
(gauge)
The total output returned by all invocations of this stored procedure measured in megabytes.
Shown as mebibyte
voltdb.procedureoutput.weighted_perc
(gauge)
A weighted average expressed as a percentage of the result set size returned by invocations of this stored procedure compared to all stored procedure invocations.
voltdb.procedureprofile.aborts.count
(count)
The number of times the procedure was aborted.
voltdb.procedureprofile.aborts.total
(gauge)
The cumulative number of times the procedure was aborted.
voltdb.procedureprofile.avg_time
(gauge)
The average length of time (in nanoseconds) it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedureprofile.failures.count
(count)
The number of times the procedure failed unexpectedly (as opposed to user aborts or expected errors, such as constraint violations).
voltdb.procedureprofile.failures.total
(gauge)
The cumulative number of times the procedure failed unexpectedly (as opposed to user aborts or expected errors, such as constraint violations).
voltdb.procedureprofile.invocations.count
(count)
The number of invocations of this procedure.
voltdb.procedureprofile.invocations.total
(gauge)
The cumulative number of invocations of this procedure.
voltdb.procedureprofile.max_time
(gauge)
The maximum length of time (in nanoseconds) it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedureprofile.min_time
(gauge)
The minimum length of time (in nanoseconds) it took to execute the stored procedure.
Shown as nanosecond
voltdb.procedureprofile.weighted_perc
(gauge)
A weighted average expressed as a percentage of the execution time for this stored procedure compared to all stored procedure invocations.
voltdb.queue.avg_wait
(gauge)
The average length of time (in microseconds) tasks were waiting in the queue in the last five seconds.
Shown as microsecond
voltdb.queue.current_depth
(gauge)
The number of tasks currently in the queue.
Shown as task
voltdb.queue.max_wait
(gauge)
The maximum length of time (in microseconds) tasks were waiting in the queue in the last five seconds.
Shown as microsecond
voltdb.queue.poll_count_per_sec
(gauge)
The number of tasks that left the queue.
Shown as task
voltdb.snapshot_status.duration
(gauge)
Amount of time it took to complete the snapshot.
Shown as second
voltdb.snapshot_status.size
(gauge)
Total size of the snapshot file.
Shown as byte
voltdb.snapshot_status.throughput
(gauge)
Average number of bytes per second written to the file during the snapshot process.
Shown as byte
voltdb.table.percent_full
(gauge)
Percentage of the row limit currently in use by table rows in this partition. If no row limit is set this is zero.
Shown as percent
voltdb.table.string_data_memory
(gauge)
Total memory used for storing non-inline variable length data associated with this table in this partition.
Shown as kibibyte
voltdb.table.tuple_allocated_memory
(gauge)
Total size of memory allocated for storing inline data associated with this table in this partition. Can exceed currently used memory.
Shown as kibibyte
voltdb.table.tuple_count
(gauge)
Number of rows currently stored for this table in the current partition.
voltdb.table.tuple_data_memory
(gauge)
Total memory used for storing inline data associated with this table and this partition.
Shown as kibibyte
voltdb.table.tuple_limit
(gauge)
The row limit for this table. Row limits are optional and are defined in the schema as a maximum number of rows that any partition can contain. If no row limit is set, this value is null.
Shown as row
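
Most memory metrics above are reported in kibibytes, so comparing them against byte-based metrics or deriving a utilization ratio takes a small conversion. The Python sketch below derives Java heap utilization from voltdb.memory.java.used and voltdb.memory.java.max_heap; the function names and input values are made up for illustration.

```python
def kib_to_bytes(kib):
    """Convert kibibytes to bytes (1 KiB = 1024 bytes)."""
    return kib * 1024


def heap_used_percent(used_kib, max_heap_kib):
    """Java heap utilization as a percentage of the maximum heap size."""
    if max_heap_kib <= 0:
        raise ValueError("max_heap_kib must be positive")
    return 100.0 * used_kib / max_heap_kib


# Hypothetical readings: 512 MiB used out of a 2 GiB maximum heap.
used_kib = 524288       # voltdb.memory.java.used
max_heap_kib = 2097152  # voltdb.memory.java.max_heap
print(heap_used_percent(used_kib, max_heap_kib))
print(kib_to_bytes(used_kib))
```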

Events

This check does not include any events.

Service Checks

voltdb.can_connect
Returns CRITICAL if the Agent is unable to reach the configured VoltDB client URL, OK otherwise.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.
