Boundary

Supported OS: Linux, Windows, Mac OS

Integration version: 2.2.2

Overview

This check monitors Boundary through the Datadog Agent. The minimum supported version of Boundary is 0.8.0.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Boundary check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Listener

To enable metrics collection, the config.hcl file must define a listener whose purpose is set to "ops". The following example shows an ops listener alongside the controller block and the api listener:

controller {
  name = "boundary-controller"
  database {
    url = "postgresql://<username>:<password>@10.0.0.1:5432/<database_name>"
  }
}

listener "tcp" {
  purpose = "api"
  tls_disable = true
}

listener "tcp" {
  purpose = "ops"
  tls_disable = true
}
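The ops listener in the example above relies on its default address. As a sketch, the same stanza with the address written out could look like the following. The value 127.0.0.1:9203 is an assumption based on Boundary's default ops address, not something stated in this document; change it if the Agent runs on a different host or your deployment uses another port.

```hcl
# Ops listener with an explicit address (assumed default: 127.0.0.1:9203).
# The Agent reads Boundary's metrics and health endpoints from this listener.
listener "tcp" {
  address     = "127.0.0.1:9203"
  purpose     = "ops"
  tls_disable = true
}
```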

The boundary.controller.health service check reports WARNING while the controller is shutting down. For this status to be reported, the controller needs a shutdown grace period. To enable one, set graceful_shutdown_wait_duration in the controller block:

controller {
  name = "boundary-controller"
  database {
    url = "env://BOUNDARY_PG_URL"
  }
  graceful_shutdown_wait_duration = "10s"
}

Datadog Agent

  1. To start collecting your Boundary performance data, edit the boundary.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample boundary.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
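As an illustration of step 1, a minimal boundary.d/conf.yaml might look like the sketch below. The option names openmetrics_endpoint and health_endpoint, and the localhost:9203 URLs, are assumptions based on the ops listener's default port and are not given in this document; confirm them against the sample boundary.d/conf.yaml before use.

```yaml
init_config:

instances:
    # Assumed: the ops listener is reachable from the Agent on localhost:9203.
    # openmetrics_endpoint is scraped for the metrics listed under Data Collected.
  - openmetrics_endpoint: http://localhost:9203/metrics
    # health_endpoint backs the boundary.controller.health service check.
    health_endpoint: http://localhost:9203/health
```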

Validation

Run the Agent’s status subcommand (for example, sudo datadog-agent status on Linux) and look for boundary under the Checks section.

Data Collected

Metrics

boundary.cluster.client.grpc.request_duration_seconds.bucket
(count)
Histogram of latencies for gRPC requests between the cluster and any of its clients.
Shown as second
boundary.cluster.client.grpc.request_duration_seconds.count
(count)
Histogram of latencies for gRPC requests between the cluster and any of its clients.
Shown as second
boundary.cluster.client.grpc.request_duration_seconds.sum
(count)
Histogram of latencies for gRPC requests between the cluster and any of its clients.
Shown as second
boundary.controller.api.http.request_duration_seconds.bucket
(count)
Histogram of latencies for HTTP requests.
Shown as second
boundary.controller.api.http.request_duration_seconds.count
(count)
Histogram of latencies for HTTP requests.
Shown as second
boundary.controller.api.http.request_duration_seconds.sum
(count)
Histogram of latencies for HTTP requests.
Shown as second
boundary.controller.api.http.request_size_bytes.bucket
(count)
Histogram of request sizes for HTTP requests.
Shown as byte
boundary.controller.api.http.request_size_bytes.count
(count)
Histogram of request sizes for HTTP requests.
Shown as byte
boundary.controller.api.http.request_size_bytes.sum
(count)
Histogram of request sizes for HTTP requests.
Shown as byte
boundary.controller.api.http.response_size_bytes.bucket
(count)
Histogram of response sizes for HTTP responses.
Shown as byte
boundary.controller.api.http.response_size_bytes.count
(count)
Histogram of response sizes for HTTP responses.
Shown as byte
boundary.controller.api.http.response_size_bytes.sum
(count)
Histogram of response sizes for HTTP responses.
Shown as byte
boundary.controller.cluster.grpc.request_duration_seconds.bucket
(count)
Histogram of latencies for gRPC requests.
Shown as second
boundary.controller.cluster.grpc.request_duration_seconds.count
(count)
Histogram of latencies for gRPC requests.
Shown as second
boundary.controller.cluster.grpc.request_duration_seconds.sum
(count)
Histogram of latencies for gRPC requests.
Shown as second
boundary.worker.proxy.http.write_header_duration_seconds.bucket
(count)
Histogram of the time elapsed from when the TLS connection is established to when the first HTTP header is written back from the server.
Shown as second
boundary.worker.proxy.http.write_header_duration_seconds.count
(count)
Histogram of the time elapsed from when the TLS connection is established to when the first HTTP header is written back from the server.
Shown as second
boundary.worker.proxy.http.write_header_duration_seconds.sum
(count)
Histogram of the time elapsed from when the TLS connection is established to when the first HTTP header is written back from the server.
Shown as second
boundary.worker.proxy.websocket.active_connections
(gauge)
Count of open websocket proxy connections (to Boundary workers).
Shown as connection
boundary.worker.proxy.websocket.received_bytes.count
(count)
Count of received bytes for Worker proxy websocket connections.
Shown as byte
boundary.worker.proxy.websocket.sent_bytes.count
(count)
Count of sent bytes for Worker proxy websocket connections.
Shown as byte

Events

The Boundary integration does not include any events.

Service Checks

boundary.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical

boundary.controller.health
Returns CRITICAL if the Agent is unable to connect to the controller’s health endpoint, WARNING if the controller received a shutdown signal, otherwise returns OK.
Statuses: ok, warning, critical

Log collection

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. To start collecting your Boundary logs, add this configuration block to your boundary.d/conf.yaml file:

    logs:
       - type: file
         source: boundary
         path: /var/log/boundary/events.ndjson
    

    Change the path parameter value based on your environment. See the sample boundary.d/conf.yaml file for all available configuration options.
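The log path above assumes that Boundary writes its events to a file. A sketch of an events block in config.hcl that would produce /var/log/boundary/events.ndjson is shown below. The sink name, description, and event type selection are illustrative assumptions, and the block structure should be checked against the Boundary events configuration documentation for your version.

```hcl
# Hypothetical file sink that writes all event types as newline-delimited JSON.
events {
  audit_enabled        = true
  observations_enabled = true
  sysevents_enabled    = true

  sink {
    name        = "datadog-file-sink"          # illustrative name
    description = "Events collected by the Datadog Agent"
    event_types = ["*"]
    format      = "cloudevents-json"

    file {
      path      = "/var/log/boundary"
      file_name = "events.ndjson"
    }
  }
}
```

The user running the Datadog Agent needs read access to the resulting file.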

Troubleshooting

Need help? Contact Datadog support.