---
title: Redpanda
description: Monitor the overall health and performance of Redpanda clusters.
breadcrumbs: Docs > Integrations > Redpanda
---

# Redpanda
Integration version 2.2.0
{% callout %}
# Important note for users on the following Datadog sites: us2.ddog-gov.com

{% alert level="info" %}
To find out if this integration is available in your organization, see your [Datadog Integrations](https://app.datadoghq.com/integrations) page or ask your organization administrator.

To initiate an exception request to enable this integration for your organization, email [support@ddog-gov.com](mailto:support@ddog-gov.com).
{% /alert %}

{% /callout %}

## Overview{% #overview %}

Redpanda is a Kafka API-compatible streaming platform for mission-critical workloads.

Connect Datadog with [Redpanda](https://redpanda.com) to view key metrics and enable additional metric groups as your needs require.

## Setup{% #setup %}

### Installation{% #installation %}

1. [Download and launch the Datadog Agent](https://docs.datadoghq.com/containers/kubernetes/log.md).
1. Manually install the Redpanda integration. See [Use Community Integrations](https://docs.datadoghq.com/agent/guide/community-integrations-installation-with-docker-agent.md) for details specific to your environment.

{% tab title="Host" %}
#### Host{% #host %}

To install this check for an Agent running on a host, run `datadog-agent integration install -t datadog-redpanda==2.2.0`.
{% /tab %}

{% tab title="Containerized" %}
#### Containerized{% #containerized %}

For containerized environments, the best way to use this integration with the Docker Agent is to build the Agent with the Redpanda integration installed.

To build an updated version of the Agent:

1. Use the following Dockerfile:

   ```dockerfile
   FROM gcr.io/datadoghq/agent:latest

   ARG INTEGRATION_VERSION=2.2.0

   RUN agent integration install -r -t datadog-redpanda==${INTEGRATION_VERSION}
   ```

1. Build the image and push it to your private Docker registry.

1. Upgrade the Datadog Agent container image. If you are using a Helm chart, modify the `agents.image` section in the `values.yaml` file to replace the default Agent image:

   ```yaml
   agents:
     enabled: true
     image:
       tag: <NEW_TAG>
       repository: <YOUR_PRIVATE_REPOSITORY>/<AGENT_NAME>
   ```

1. Use the new `values.yaml` file to upgrade the Agent:

   ```shell
   helm upgrade -f values.yaml <RELEASE_NAME> datadog/datadog
   ```

{% /tab %}

### Configuration{% #configuration %}

{% tab title="Host" %}
#### Host{% #host %}

##### Metric collection{% #metric-collection %}

To start collecting your Redpanda performance data:

1. Edit the `redpanda.d/conf.yaml` file in the `conf.d/` folder at the root of your [Agent's configuration directory](https://docs.datadoghq.com/agent/guide/agent-configuration-files.md#agent-configuration-directory). See the sample [redpanda.d/conf.yaml.example](https://github.com/DataDog/integrations-extras/blob/master/redpanda/datadog_checks/redpanda/data/conf.yaml.example) file for all available configuration options.

1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands.md#start-stop-and-restart-the-agent).
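
As an illustration of step 1, a minimal `redpanda.d/conf.yaml` might look like the sketch below. The endpoint URL assumes that Redpanda serves Prometheus-format metrics at `/public_metrics` on port 9644 of the same host, and the commented `metric_groups` entry is an assumed option name and value. Confirm both against the sample file before relying on them.

```yaml
# Illustrative sketch only; conf.yaml.example is the authoritative reference.
init_config:

instances:
  # Assumption: metrics are served at this URL. Adjust host and port to your broker.
  - openmetrics_endpoint: http://localhost:9644/public_metrics
    # Assumption: additional metric groups are enabled with an option like this one.
    # metric_groups:
    #   - redpanda.cloud
```

After saving the file, restart the Agent so the new instance is picked up.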

##### Log collection{% #log-collection %}

By default, collecting logs is disabled in the Datadog Agent. Log collection is available for Agent v6.0+.

1. To enable logs, add the following in your `datadog.yaml` file:

   ```yaml
   logs_enabled: true
   ```

1. Make sure the `dd-agent` user is a member of the `systemd-journal` group. If it is not, run the following command as root:

   ```shell
   usermod -a -G systemd-journal dd-agent
   ```

1. Add the following in your `redpanda.d/conf.yaml` file to start collecting your Redpanda logs:

   ```yaml
   logs:
     - type: journald
       source: redpanda
   ```

{% /tab %}

{% tab title="Containerized" %}
#### Containerized{% #containerized %}

##### Metric collection{% #metric-collection %}

For containerized environments, Autodiscovery is configured by default after the Redpanda check is integrated into the Datadog Agent image.

Metrics are then collected automatically and sent to Datadog. For more information, see [Autodiscovery Integration Templates](https://docs.datadoghq.com/agent/kubernetes/integrations.md).
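
If the default template does not match your deployment, for example when the metrics port differs, Autodiscovery annotations on the Redpanda pod are one way to supply an explicit configuration. The sketch below uses the `checks` annotation format available in recent Agent versions. The pod name, the container name `redpanda`, the image placeholder, and the endpoint on port 9644 are all assumptions for illustration, not values taken from this integration's documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: redpanda  # hypothetical pod name
  annotations:
    # The segment after the slash must match the container name under spec.containers.
    ad.datadoghq.com/redpanda.checks: |
      {
        "redpanda": {
          "init_config": {},
          "instances": [
            {"openmetrics_endpoint": "http://%%host%%:9644/public_metrics"}
          ]
        }
      }
spec:
  containers:
    - name: redpanda            # hypothetical container name
      image: <REDPANDA_IMAGE>   # placeholder
```

The `%%host%%` template variable is resolved by the Agent to the container's IP address at runtime.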

##### Log collection{% #log-collection %}

By default, log collection is disabled in the Datadog Agent. Log collection is available for Agent v6.0+.

To enable logs, see [Kubernetes Log Collection](https://docs.datadoghq.com/containers/kubernetes/log.md).

| Parameter      | Value                                                   |
| -------------- | ------------------------------------------------------- |
| `<LOG_CONFIG>` | `{"source": "redpanda", "service": "redpanda_cluster"}` |
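
On Kubernetes, the `<LOG_CONFIG>` value above is typically supplied through a pod annotation. The fragment below is a sketch that assumes the Redpanda container is named `redpanda`; substitute the actual container name from your pod spec.

```yaml
metadata:
  annotations:
    # "redpanda" after the slash is a hypothetical container name.
    ad.datadoghq.com/redpanda.logs: '[{"source": "redpanda", "service": "redpanda_cluster"}]'
```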

{% /tab %}

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands.md#agent-status-and-information) and look for `redpanda` under the Checks section.

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric | Description |
| ------ | ----------- |
| **redpanda.application.build**(gauge)                                                  | Build information for Redpanda, including the revision and version details                                                                                                                                                                             |
| **redpanda.application.fips\_mode**(gauge)                                             | Indicates whether Redpanda is running in FIPS mode. Possible values: 0 = disabled, 1 = permissive, 2 = enabled                                                                                                                                         |
| **redpanda.application.uptime**(gauge)                                                 | Redpanda uptime in seconds. *Shown as second* |
| **redpanda.authorization.result**(count)                                               | Cumulative count of authorization results, categorized by result type.                                                                                                                                                                                 |
| **redpanda.cloud.client\_backoff**(count)                                              | Total number of requests that backed off. *Shown as operation* |
| **redpanda.cloud.client\_pool\_utilization**(gauge)                                    | Current utilization of the object storage client pool, as a percentage                                                                                                                                                                                 |
| **redpanda.cloud.client\_download\_backoff**(count)                                    | Total number of download requests that backed off. *Shown as operation* |
| **redpanda.cloud.client\_downloads**(count)                                            | Total number of requests that downloaded an object from cloud storage                                                                                                                                                                                  |
| **redpanda.cloud.client\_lease\_duration**(gauge)                                      | Lease duration histogram.                                                                                                                                                                                                                              |
| **redpanda.cloud.client\_not\_found**(count)                                           | Total number of requests for which the object was not found                                                                                                                                                                                            |
| **redpanda.cloud.client\_num\_borrows**(count)                                         | Count of instances where a shard borrowed an object storage client from another shard.                                                                                                                                                                 |
| **redpanda.cloud.client\_upload\_backoff**(count)                                      | Total number of upload requests that backed off                                                                                                                                                                                                        |
| **redpanda.cloud.client\_uploads**(count)                                              | Total number of requests that uploaded an object to cloud storage                                                                                                                                                                                      |
| **redpanda.cloud.storage.active\_segments**(gauge)                                     | Number of remote log segments that are currently hydrated and available for read operations                                                                                                                                                            |
| **redpanda.cloud.storage.cache\_op\_hit**(count)                                       | Number of get requests for objects that are already in cache.                                                                                                                                                                                          |
| **redpanda.cloud.storage.op\_in\_progress\_files**(gauge)                              | Number of files currently being written to the cache.                                                                                                                                                                                                  |
| **redpanda.cloud.storage.cache\_op\_miss**(count)                                      | Number of get requests that are not satisfied from the cache.                                                                                                                                                                                          |
| **redpanda.cloud.storage.op\_put**(count)                                              | Number of objects written into cache. *Shown as operation* |
| **redpanda.cloud.storage.cache\_space\_files**(gauge)                                  | Current number of objects stored in the cache.                                                                                                                                                                                                         |
| **redpanda.cloud.storage.cache\_space\_hwm\_files**(gauge)                             | High watermark for the number of objects stored in the cache.                                                                                                                                                                                          |
| **redpanda.cloud.storage.cache\_space\_hwm\_size\_bytes**(gauge)                       | High watermark for the total size (in bytes) of cached objects.                                                                                                                                                                                        |
| **redpanda.cloud.storage.cache\_space\_size\_bytes**(gauge)                            | Sum of size of cached objects. *Shown as byte* |
| **redpanda.cloud.storage.cache\_space\_tracker\_size**(gauge)                          | Current count of entries in the cache access tracker.                                                                                                                                                                                                  |
| **redpanda.cloud.storage.cache\_space\_tracker\_syncs**(count)                         | Number of times the cache access tracker was synchronized with disk data. *Shown as operation* |
| **redpanda.cloud.storage\_cache\_trim\_carryover\_trims**(count)                       | Count of times the cache trim operation was invoked using a carryover strategy.                                                                                                                                                                        |
| **redpanda.cloud.storage\_cache\_trim\_exhaustive\_trims**(count)                      | Count of instances where a fast cache trim was insufficient and an exhaustive trim was required.                                                                                                                                                       |
| **redpanda.cloud.storage\_cache\_trim\_failed\_trims**(count)                          | Count of cache trim operations that failed to free the expected amount of space, possibly indicating a bug or misconfiguration.                                                                                                                        |
| **redpanda.cloud.storage\_cache\_trim\_fast\_trims**(count)                            | Count of successful fast cache trim operations.                                                                                                                                                                                                        |
| **redpanda.cloud.storage\_cache\_trim\_in\_mem\_trims**(count)                         | Count of cache trim operations performed using the in-memory access tracker.                                                                                                                                                                           |
| **redpanda.cloud.storage\_cloud\_log\_size**(gauge)                                    | Total size (in bytes) of user-visible log data stored in Tiered Storage. This value increases with every segment offload and decreases when segments are deleted due to retention or compaction.                                                       |
| **redpanda.cloud.storage.deleted\_segments**(count)                                    | Number of segments that have been deleted from S3 for the topic. This may grow due to retention or non-compacted segments being replaced with their compacted equivalent. |
| **redpanda.cloud.storage.errors**(count)                                               | Cumulative count of errors encountered during object storage operations, segmented by direction. *Shown as error* |
| **redpanda.cloud.storage.housekeeping.drains**(gauge)                                  | Number of times the object storage upload housekeeping queue was fully drained.                                                                                                                                                                        |
| **redpanda.cloud.storage.housekeeping.jobs\_completed**(count)                         | Number of executed housekeeping jobs                                                                                                                                                                                                                   |
| **redpanda.cloud.storage.housekeeping.jobs\_failed**(count)                            | Number of failed housekeeping jobs. *Shown as error* |
| **redpanda.cloud.storage.housekeeping.jobs\_skipped**(count)                           | Number of object storage housekeeping jobs that were skipped during execution.                                                                                                                                                                         |
| **redpanda.cloud.storage.housekeeping.pauses**(gauge)                                  | Number of times object storage upload housekeeping was paused.                                                                                                                                                                                         |
| **redpanda.cloud.storage\_housekeeping\_requests\_throttled\_average\_rate**(gauge)    | Average rate (per shard) of requests that were throttled during object storage operations.                                                                                                                                                             |
| **redpanda.cloud.storage.housekeeping.resumes**(gauge)                                 | Number of times when object storage upload housekeeping resumed after a pause.                                                                                                                                                                         |
| **redpanda.cloud.storage.housekeeping.rounds**(count)                                  | Number of upload housekeeping rounds                                                                                                                                                                                                                   |
| **redpanda.cloud.storage.jobs.cloud\_segment\_reuploads**(gauge)                       | Number of segment reuploads from cloud storage sources (cloud storage cache or direct download from cloud storage)                                                                                                                                     |
| **redpanda.cloud.storage.jobs.local\_segment\_reuploads**(gauge)                       | Number of segment reuploads from local data directory                                                                                                                                                                                                  |
| **redpanda.cloud.storage.jobs.manifest\_reuploads**(gauge)                             | Number of manifest reuploads performed by all housekeeping jobs                                                                                                                                                                                        |
| **redpanda.cloud.storage.jobs.metadata\_syncs**(gauge)                                 | Number of archival configuration updates performed by all housekeeping jobs                                                                                                                                                                            |
| **redpanda.cloud.storage.jobs.segment\_deletions**(gauge)                              | Number of segments deleted by all housekeeping jobs                                                                                                                                                                                                    |
| **redpanda.cloud.storage\_limits\_downloads\_throttled\_sum**(count)                   | Total cumulative time (in milliseconds) during which downloads were throttled.                                                                                                                                                                         |
| **redpanda.cloud.storage\_partition\_manifest\_uploads**(count)                        | Total number of successful partition manifest uploads to object storage.                                                                                                                                                                               |
| **redpanda.cloud.storage\_partition\_readers**(gauge)                                  | Number of active partition reader instances (fetch/timequery operations) reading from Tiered Storage.                                                                                                                                                  |
| **redpanda.cloud.storage\_partition\_readers\_delayed**(count)                         | Count of partition read operations delayed due to reaching the reader limit, suggesting potential saturation of Tiered Storage reads.                                                                                                                  |
| **redpanda.cloud.storage\_paused\_archivers**(gauge)                                   | Number of paused archivers.                                                                                                                                                                                                                            |
| **redpanda.cloud.storage.readers**(gauge)                                              | Number of read cursors for hydrated remote log segments |
| **redpanda.cloud.storage\_segment\_index\_uploads**(count)                             | Total number of successful segment index uploads to object storage.                                                                                                                                                                                    |
| **redpanda.cloud.storage\_segment\_materializations\_delayed**(count)                  | Count of segment materialization operations that were delayed because of reader limit constraints.                                                                                                                                                     |
| **redpanda.cloud.storage\_segment\_readers\_delayed**(count)                           | Count of segment reader operations delayed due to reaching the reader limit. This indicates a cluster is saturated with Tiered Storage reads.                                                                                                          |
| **redpanda.cloud.storage\_segment\_uploads**(count)                                    | Total number of successful data segment uploads to object storage.                                                                                                                                                                                     |
| **redpanda.cloud.storage.segments**(gauge)                                             | Total number of accounted segments in the cloud for the topic |
| **redpanda.cloud.storage.segments\_pending\_deletion**(gauge)                          | Total number of segments pending deletion from the cloud for the topic |
| **redpanda.cloud.storage\_spillover\_manifest\_uploads**(count)                        | Total number of successful spillover manifest uploads to object storage.                                                                                                                                                                               |
| **redpanda.cloud.storage\_spillover\_manifests\_materialized\_bytes**(gauge)           | Total bytes of memory used by spilled manifests that are currently cached in memory.                                                                                                                                                                   |
| **redpanda.cloud.storage\_spillover\_manifests\_materialized\_count**(gauge)           | Count of spilled manifests currently held in memory cache.                                                                                                                                                                                             |
| **redpanda.cloud.storage.uploaded\_bytes**(count)                                      | Total number of uploaded bytes for the topic. *Shown as byte* |
| **redpanda.cluster.brokers**(gauge)                                                    | Number of configured brokers in the cluster                                                                                                                                                                                                            |
| **redpanda.cluster.controller\_log\_limit\_requests\_dropped**(count)                  | REMOVED: use `redpanda.controller.log_limit_requests_dropped` instead. *Shown as request* |
| **redpanda.controller.log\_limit\_requests\_available**(gauge)                         | Controller log rate limiting. Available requests per second (rps) for the group. *Shown as request* |
| **redpanda.controller.log\_limit\_requests\_dropped**(count)                           | Controller log rate limiting. Number of requests dropped for exceeding the limit in the group. *Shown as request* |
| **redpanda.cluster.features\_enterprise\_license\_expiry\_sec**(gauge)                 | Number of seconds remaining until the Enterprise Edition license expires.                                                                                                                                                                              |
| **redpanda.cluster.latest\_cluster\_metadata\_manifest\_age**(gauge)                   | The amount of time in seconds since the last time Redpanda uploaded metadata files to Tiered Storage for your cluster. A value of 0 indicates metadata has not yet been uploaded.                                                                      |
| **redpanda.cluster.members\_backend\_queued\_node\_operations**(gauge)                 | Number of queued node operations.                                                                                                                                                                                                                      |
| **redpanda.cluster.non\_homogenous\_fips\_mode**(gauge)                                | Count of brokers whose FIPS mode configuration differs from the rest of the cluster.                                                                                                                                                                   |
| **redpanda.cluster.partitions**(gauge)                                                 | Total number of logical partitions managed by the cluster. This includes partitions for the controller topic but excludes replicas.                                                                                                                    |
| **redpanda.partitions.moving\_from\_node**(gauge)                                      | Number of partition replicas that are in the process of being removed from a broker.                                                                                                                                                                   |
| **redpanda.partitions.moving\_to\_node**(gauge)                                        | Number of partition replicas in the cluster currently being added or moved to a broker.                                                                                                                                                                |
| **redpanda.partitions.node\_cancelling\_movements**(gauge)                             | During a partition movement cancellation operation, the number of partition replicas that were being moved that now need to be canceled.                                                                                                               |
| **redpanda.cluster.partition\_num\_with\_broken\_rack\_constraint**(gauge)             | Number of partitions that don't satisfy the rack awareness constraint                                                                                                                                                                                  |
| **redpanda.cluster.partition\_schema\_id\_validation\_records\_failed**(count)         | Count of records that failed schema ID validation during ingestion.                                                                                                                                                                                    |
| **redpanda.cluster.replicas**(gauge)                                                   | REMOVED: Use either redpanda.cluster.brokers for the number of brokers or redpanda.kafka.replicas for the number of replicas per topic                                                                                                                 |
| **redpanda.cluster.topics**(gauge)                                                     | Number of topics in the cluster                                                                                                                                                                                                                        |
| **redpanda.cluster.unavailable\_partitions**(gauge)                                    | Number of partitions that are unavailable due to a lack of quorum among their replica set.                                                                                                                                                             |
| **redpanda.reactor.cpu\_busy\_seconds**(gauge)                                         | Total CPU busy time in seconds. *Shown as second* |
| **redpanda.debug\_bundle.failed\_generation\_count**(count)                            | Running count of debug bundle generation failures, reported per shard.                                                                                                                                                                                 |
| **redpanda.debug\_bundle.last\_failed\_bundle\_timestamp\_seconds**(gauge)             | Unix epoch timestamp of the last failed debug bundle generation, per shard.                                                                                                                                                                            |
| **redpanda.debug\_bundle.last\_successful\_bundle\_timestamp\_seconds**(gauge)         | Unix epoch timestamp of the last successfully generated debug bundle, per shard.                                                                                                                                                                       |
| **redpanda.debug\_bundle.successful\_generation\_count**(count)                        | Running count of successfully generated debug bundles, reported per shard.                                                                                                                                                                             |
| **redpanda.iceberg.rest\_client\_active\_gets**(gauge)                                 | Number of active GET requests.                                                                                                                                                                                                                         |
| **redpanda.iceberg.rest\_client\_active\_puts**(gauge)                                 | Number of active PUT requests.                                                                                                                                                                                                                         |
| **redpanda.iceberg.rest\_client\_active\_requests**(gauge)                             | Number of active HTTP requests (includes PUT and GET).                                                                                                                                                                                                 |
| **redpanda.iceberg.rest\_client\_num\_commit\_table\_update\_requests**(count)         | Number of requests sent to the commit_table_update endpoint.                                                                                                                                                                                           |
| **redpanda.iceberg.rest\_client\_num\_commit\_table\_update\_requests\_failed**(count) | Number of requests sent to the commit_table_update endpoint that failed.                                                                                                                                                                               |
| **redpanda.iceberg.rest\_client\_num\_create\_namespace\_requests**(count)             | Number of requests sent to the create_namespace endpoint.                                                                                                                                                                                              |
| **redpanda.iceberg.rest\_client\_num\_create\_namespace\_requests\_failed**(count)     | Number of requests sent to the create_namespace endpoint that failed.                                                                                                                                                                                  |
| **redpanda.iceberg.rest\_client\_num\_create\_table\_requests**(count)                 | Number of requests sent to the create_table endpoint.                                                                                                                                                                                                  |
| **redpanda.iceberg.rest\_client\_num\_create\_table\_requests\_failed**(count)         | Number of requests sent to the create_table endpoint that failed.                                                                                                                                                                                      |
| **redpanda.iceberg.rest\_client\_num\_drop\_table\_requests**(count)                   | Number of requests sent to the drop_table endpoint.                                                                                                                                                                                                    |
| **redpanda.iceberg.rest\_client\_num\_drop\_table\_requests\_failed**(count)           | Number of requests sent to the drop_table endpoint that failed.                                                                                                                                                                                        |
| **redpanda.iceberg.rest\_client\_num\_get\_config\_requests**(count)                   | Number of requests sent to the config endpoint.                                                                                                                                                                                                        |
| **redpanda.iceberg.rest\_client\_num\_get\_config\_requests\_failed**(count)           | Number of requests sent to the config endpoint that failed.                                                                                                                                                                                            |
| **redpanda.iceberg.rest\_client\_num\_load\_table\_requests**(count)                   | Number of requests sent to the load_table endpoint.                                                                                                                                                                                                    |
| **redpanda.iceberg.rest\_client\_num\_load\_table\_requests\_failed**(count)           | Number of requests sent to the load_table endpoint that failed.                                                                                                                                                                                        |
| **redpanda.iceberg.rest\_client\_num\_oauth\_token\_requests**(count)                  | Number of requests sent to the oauth_token endpoint.                                                                                                                                                                                                   |
| **redpanda.iceberg.rest\_client\_num\_oauth\_token\_requests\_failed**(count)          | Number of requests sent to the oauth_token endpoint that failed.                                                                                                                                                                                       |
| **redpanda.iceberg.rest\_client\_num\_request\_timeouts**(count)                       | Total number of catalog requests that could no longer be retried because they timed out. This may occur if the catalog is not responding.                                                                                                              |
| **redpanda.iceberg.rest\_client\_num\_transport\_errors**(count)                       | Total number of transport errors (TCP and TLS).                                                                                                                                                                                                        |
| **redpanda.iceberg.rest\_client\_total\_gets**(count)                                  | Number of completed GET requests.                                                                                                                                                                                                                      |
| **redpanda.iceberg.rest\_client\_total\_inbound\_bytes**(count)                        | Total number of bytes received from the Iceberg REST catalog.                                                                                                                                                                                          |
| **redpanda.iceberg.rest\_client\_total\_outbound\_bytes**(count)                       | Total number of bytes sent to the Iceberg REST catalog.                                                                                                                                                                                                |
| **redpanda.iceberg.rest\_client\_total\_puts**(count)                                  | Number of completed PUT requests.                                                                                                                                                                                                                      |
| **redpanda.iceberg.rest\_client\_total\_requests**(count)                              | Number of completed HTTP requests (includes PUT and GET).                                                                                                                                                                                              |
| **redpanda.iceberg.translation\_decompressed\_bytes\_processed**(count)                | Number of bytes consumed post-decompression for processing that may or may not succeed in being processed. For example, if Redpanda fails to communicate with the coordinator, preventing processing of a batch, this metric still increases.          |
| **redpanda.iceberg.translation\_dlq\_files\_created**(count)                           | Number of created Parquet files for the dead letter queue (DLQ) table.                                                                                                                                                                                 |
| **redpanda.iceberg.translation\_files\_created**(count)                                | Number of created Parquet files (not counting the DLQ table).                                                                                                                                                                                          |
| **redpanda.iceberg.translation\_invalid\_records**(count)                              | Number of invalid records handled by translation.                                                                                                                                                                                                      |
| **redpanda.iceberg.translation\_parquet\_bytes\_added**(count)                         | Number of bytes in created Parquet files (not counting the DLQ table).                                                                                                                                                                                 |
| **redpanda.iceberg.translation\_parquet\_rows\_added**(count)                          | Number of rows in created Parquet files (not counting the DLQ table).                                                                                                                                                                                  |
| **redpanda.iceberg.translation\_raw\_bytes\_processed**(count)                         | Number of raw, potentially compressed bytes consumed for processing that may or may not succeed in being processed. For example, if Redpanda fails to communicate with the coordinator, preventing processing of a batch, this metric still increases. |
| **redpanda.iceberg.translation\_translations\_finished**(count)                        | Number of finished translator executions.                                                                                                                                                                                                              |
| **redpanda.io\_queue.total\_read\_ops**(count)                                         | Cumulative count of read operations processed by the I/O queue.*Shown as operation*                                                                                                                                                                    |
| **redpanda.io\_queue.total\_write\_ops**(count)                                        | Cumulative count of write operations processed by the I/O queue.*Shown as operation*                                                                                                                                                                   |
| **redpanda.kafka.group\_offset**(gauge)                                                | Consumer group committed offset                                                                                                                                                                                                                        |
| **redpanda.kafka.group\_count**(gauge)                                                 | Number of consumers in a group                                                                                                                                                                                                                         |
| **redpanda.kafka.group\_lag\_max**(gauge)                                              | Maximum lag for any partition in the group                                                                                                                                                                                                             |
| **redpanda.kafka.group\_lag\_sum**(gauge)                                              | Sum of lag for all partitions in the group                                                                                                                                                                                                             |
| **redpanda.kafka.group\_topic\_count**(gauge)                                          | Number of topics in a group                                                                                                                                                                                                                            |
| **redpanda.kafka.handler\_latency\_seconds**(gauge)                                    | Histogram capturing the latency for processing Kafka requests at the broker level.                                                                                                                                                                     |
| **redpanda.kafka.partition\_committed\_offset**(gauge)                                 | Latest committed offset for the partition (i.e. the offset of the last message safely persisted on most replicas).                                                                                                                                     |
| **redpanda.kafka.partitions**(gauge)                                                   | Configured number of partitions for the topic                                                                                                                                                                                                          |
| **redpanda.kafka.quotas\_client\_quota\_throttle\_time**(gauge)                        | Histogram of client quota throttling delays (in seconds) per quota rule and type.                                                                                                                                                                      |
| **redpanda.kafka.quotas\_client\_quota\_throughput**(gauge)                            | Histogram of client quota throughput per quota rule and type.                                                                                                                                                                                          |
| **redpanda.kafka.records\_fetched**(count)                                             | Total number of records fetched from a topic                                                                                                                                                                                                           |
| **redpanda.kafka.records\_produced**(count)                                            | Total number of records produced to a topic                                                                                                                                                                                                            |
| **redpanda.kafka.replicas**(gauge)                                                     | Configured number of replicas for the topic                                                                                                                                                                                                            |
| **redpanda.kafka.request\_bytes**(count)                                               | Total number of bytes produced per topic*Shown as byte*                                                                                                                                                                                                |
| **redpanda.kafka.request\_latency\_seconds**(gauge)                                    | Internal latency of kafka produce requests*Shown as second*                                                                                                                                                                                            |
| **redpanda.kafka.rpc\_sasl\_session\_expiration**(count)                               | Total number of SASL session expirations observed.                                                                                                                                                                                                     |
| **redpanda.kafka.rpc\_sasl\_session\_reauth\_attempts**(count)                         | Total number of SASL reauthentication attempts made by clients.                                                                                                                                                                                        |
| **redpanda.kafka.rpc\_sasl\_session\_revoked**(count)                                  | Total number of SASL sessions that have been revoked.                                                                                                                                                                                                  |
| **redpanda.kafka.under\_replicated\_replicas**(gauge)                                  | Number of under replicated replicas (i.e. replicas that are live, but not at the latest offest)                                                                                                                                                        |
| **redpanda.memory.allocated\_memory**(gauge)                                           | Allocated memory size in bytes*Shown as byte*                                                                                                                                                                                                          |
| **redpanda.memory.available\_memory**(gauge)                                           | Total shard memory potentially available in bytes (free_memory plus reclaimable)*Shown as byte*                                                                                                                                                        |
| **redpanda.memory.available\_memory\_low\_water\_mark**(gauge)                         | The low-water mark for available_memory from process start*Shown as byte*                                                                                                                                                                              |
| **redpanda.memory.free\_memory**(gauge)                                                | Free memory size in bytes*Shown as byte*                                                                                                                                                                                                               |
| **redpanda.node\_status.rpcs\_received**(gauge)                                        | Number of node status RPCs received by this node*Shown as request*                                                                                                                                                                                     |
| **redpanda.node\_status.rpcs\_sent**(gauge)                                            | Number of node status RPCs sent by this node*Shown as request*                                                                                                                                                                                         |
| **redpanda.node\_status.rpcs\_timed\_out**(gauge)                                      | Number of timed out node status RPCs from this node*Shown as request*                                                                                                                                                                                  |
| **redpanda.raft.leadership\_changes**(count)                                           | Number of leadership changes across all partitions of a given topic                                                                                                                                                                                    |
| **redpanda.raft.learners\_gap\_bytes**(gauge)                                          | Total number of bytes that must be delivered to learner replicas to bring them up to date.                                                                                                                                                             |
| **redpanda.raft.recovery\_offsets\_pending**(gauge)                                    | Sum of offsets across partitions on a broker that still need to be recovered.                                                                                                                                                                          |
| **redpanda.raft.recovery\_bandwidth**(gauge)                                           | Available network bandwidth (in bytes per second) for partition movement operations.*Shown as byte*                                                                                                                                                    |
| **redpanda.raft.recovery\_consumed\_bandwidth**(gauge)                                 | Network bandwidth (in bytes per second) currently being consumed for partition movement.                                                                                                                                                               |
| **redpanda.raft.recovery\_partitions\_active**(gauge)                                  | Number of partition replicas currently undergoing recovery on a broker.                                                                                                                                                                                |
| **redpanda.raft.recovery\_partitions\_to\_recover**(gauge)                             | Total count of partition replicas that are pending recovery on a broker.                                                                                                                                                                               |
| **redpanda.pandaproxy.inflight\_requests\_memory\_usage\_ratio**(gauge)                | Memory usage ratio of in-flight requests in the rest_proxy.                                                                                                                                                                                            |
| **redpanda.pandaproxy.inflight\_requests\_usage\_ratio**(gauge)                        | Usage ratio of in-flight requests in the rest_proxy.                                                                                                                                                                                                   |
| **redpanda.pandaproxy.queued\_requests\_memory\_blocked**(gauge)                       | Number of requests queued in rest_proxy, due to memory limitations.                                                                                                                                                                                    |
| **redpanda.pandaproxy.request\_errors**(count)                                         | Total number of rest_proxy server errors*Shown as error*                                                                                                                                                                                               |
| **redpanda.pandaproxy.request\_latency**(gauge)                                        | Internal latency of request for rest_proxy*Shown as millisecond*                                                                                                                                                                                       |
| **redpanda.rpc.active\_connections**(gauge)                                            | Current number of active RPC client connections on a shard.*Shown as connection*                                                                                                                                                                       |
| **redpanda.rpc.received\_bytes**(count)                                                | Number of bytes received from the clients in valid requests.                                                                                                                                                                                           |
| **redpanda.rpc.request\_errors**(count)                                                | Number of rpc errors*Shown as error*                                                                                                                                                                                                                   |
| **redpanda.rpc.request\_latency\_seconds**(gauge)                                      | Histogram capturing the latency (in seconds) for RPC requests.*Shown as second*                                                                                                                                                                        |
| **redpanda.rpc.sent\_bytes**(count)                                                    | Number of bytes sent to clients.                                                                                                                                                                                                                       |
| **redpanda.scheduler.runtime\_seconds**(count)                                         | Accumulated runtime of task queue associated with this scheduling group*Shown as second*                                                                                                                                                               |
| **redpanda.schema\_registry.cache\_schema\_count**(gauge)                              | Total number of schemas currently stored in the Schema Registry cache.                                                                                                                                                                                 |
| **redpanda.schema\_registry.cache\_schema\_memory\_bytes**(gauge)                      | Memory usage (in bytes) by schemas stored in the Schema Registry cache.                                                                                                                                                                                |
| **redpanda.schema\_registry.cache\_subject\_count**(gauge)                             | Count of subjects stored in the Schema Registry cache.                                                                                                                                                                                                 |
| **redpanda.schema\_registry.cache\_subject\_version\_count**(gauge)                    | Count of versions available for each subject in the Schema Registry cache.                                                                                                                                                                             |
| **redpanda.schema\_registry.inflight\_requests\_memory\_usage\_ratio**(gauge)          | Ratio of memory used by in-flight requests in the Schema Registry, reported per shard.                                                                                                                                                                 |
| **redpanda.schema\_registry.inflight\_requests\_usage\_ratio**(gauge)                  | Usage ratio for in-flight Schema Registry requests, reported per shard.                                                                                                                                                                                |
| **redpanda.schema\_registry.queued\_requests\_memory\_blocked**(gauge)                 | Count of Schema Registry requests queued due to memory constraints, reported per shard.                                                                                                                                                                |
| **redpanda.schema\_registry.errors**(count)                                            | Total number of schema_registry server errors*Shown as error*                                                                                                                                                                                          |
| **redpanda.schema\_registry\_latency\_seconds**(gauge)                                 | Histogram capturing the latency (in seconds) for Schema Registry requests. [supported in v2.1.0 and below]*Shown as second*                                                                                                                            |
| **redpanda.schema\_registry.latency\_seconds**(gauge)                                  | Histogram capturing the latency (in seconds) for Schema Registry requests. [supported in v2.2.0+]*Shown as second*                                                                                                                                     |
| **redpanda.security.audit\_errors**(count)                                             | Cumulative count of errors encountered when creating or publishing audit event log entries.                                                                                                                                                            |
| **redpanda.security.audit\_last\_event\_timestamp\_seconds**(count)                    | Unix epoch timestamp of the last successful audit log event publication.                                                                                                                                                                               |
| **redpanda.storage.cache\_disk\_free\_bytes**(gauge)                                   | Amount of free disk space (in bytes) available on the cache storage.                                                                                                                                                                                   |
| **redpanda.storage.cache\_disk\_free\_space\_alert**(gauge)                            | Alert indicator for cache storage free space, where: 0 = OK, 1 = Low space, 2 = Degraded                                                                                                                                                               |
| **redpanda.storage.cache\_disk\_total\_bytes**(gauge)                                  | Amount of total disk space (in bytes) available on the cache storage.                                                                                                                                                                                  |
| **redpanda.storage.disk\_free\_bytes**(gauge)                                          | Amount of free disk space (in bytes) available on attached storage.*Shown as byte*                                                                                                                                                                     |
| **redpanda.storage.disk\_free\_space\_alert**(gauge)                                   | Alert indicator for overall disk storage free space, where: 0 = OK, 1 = Low space, 2 = Degraded                                                                                                                                                        |
| **redpanda.storage.disk\_total\_bytes**(gauge)                                         | Total size of attached storage, in bytes.*Shown as byte*                                                                                                                                                                                               |
| **redpanda.tls.certificate\_expires\_at\_timestamp\_seconds**(gauge)                   | Unix epoch timestamp for the expiration of the shortest-lived installed TLS certificate.                                                                                                                                                               |
| **redpanda.tls.certificate\_serial**(gauge)                                            | The least significant 4 bytes of the serial number for the certificate that will expire next.                                                                                                                                                          |
| **redpanda.tls.certificate\_valid**(gauge)                                             | Indicator of whether a resource has at least one valid TLS certificate installed. Returns 1 if a valid certificate is present and 0 if not.                                                                                                            |
| **redpanda.tls.loaded\_at\_timestamp\_seconds**(gauge)                                 | Unix epoch timestamp marking the last time a TLS certificate was loaded for a resource.                                                                                                                                                                |
| **redpanda.tls.truststore\_expires\_at\_timestamp\_seconds**(gauge)                    | Unix epoch timestamp representing the expiration time of the shortest-lived certificate authority (CA) in the installed truststore.                                                                                                                    |
| **redpanda.transform.execution\_errors**(count)                                        | Counter for the number of errors encountered during the invocation of data transforms.                                                                                                                                                                 |
| **redpanda.transform.execution\_latency\_sec**(gauge)                                  | Histogram tracking the execution latency (in seconds) for processing a single record via data transforms.                                                                                                                                              |
| **redpanda.transform.failures**(count)                                                 | Counter for each failure encountered by a data transform processor.                                                                                                                                                                                    |
| **redpanda.transform.processor\_lag**(count)                                           | Number of records pending processing in the input topic for a data transform.                                                                                                                                                                          |
| **redpanda.transform.read\_bytes**(count)                                              | Cumulative count of bytes read as input to data transforms.                                                                                                                                                                                            |
| **redpanda.transform.state**(gauge)                                                    | Current count of transform processors in a specific state (running, inactive, or errored).                                                                                                                                                             |
| **redpanda.transform.write\_bytes**(count)                                             | Cumulative count of bytes output from data transforms.                                                                                                                                                                                                 |
| **redpanda.wasm.binary\_executable\_memory\_usage**(gauge)                             | Amount of memory (in bytes) used by executable WebAssembly binaries.                                                                                                                                                                                   |
| **redpanda.wasm.engine\_cpu\_seconds**(count)                                          | Total CPU time (in seconds) consumed by WebAssembly functions.                                                                                                                                                                                         |
| **redpanda.wasm.engine\_max\_memory**(gauge)                                           | Maximum allowed memory (in bytes) allocated for a WebAssembly function.                                                                                                                                                                                |
| **redpanda.wasm.engine\_memory\_usage**(gauge)                                         | Current memory usage (in bytes) by a WebAssembly function.                                                                                                                                                                                             |

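Some of the gauges above report encoded values rather than directly readable ones. The following is a minimal Python sketch, not part of the integration, of how such values might be decoded once you have retrieved them. The helper names `describe_disk_alert` and `days_until_expiry` are hypothetical. The first maps the 0/1/2 values of `redpanda.storage.disk_free_space_alert` (and its cache counterpart) to the labels in the table; the second converts a Unix epoch gauge such as `redpanda.tls.certificate_expires_at_timestamp_seconds` into days remaining.

```python
from datetime import datetime, timezone
from typing import Optional

# Labels taken from the metric descriptions in the table above.
DISK_ALERT_LABELS = {0: "OK", 1: "Low space", 2: "Degraded"}


def describe_disk_alert(value: float) -> str:
    """Map a *_free_space_alert gauge value to its documented label."""
    return DISK_ALERT_LABELS.get(int(value), "Unknown")


def days_until_expiry(expires_at_seconds: float,
                      now: Optional[datetime] = None) -> float:
    """Days from `now` (default: current UTC time) to a Unix epoch timestamp."""
    now = now or datetime.now(timezone.utc)
    expires_at = datetime.fromtimestamp(expires_at_seconds, tz=timezone.utc)
    return (expires_at - now).total_seconds() / 86400
```

A negative result from `days_until_expiry` means the timestamp is already in the past.
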
### Events{% #events %}

The Redpanda integration does not include any events.

### Service Checks{% #service-checks %}

**redpanda.openmetrics.health**

Returns `CRITICAL` if the check cannot access the metrics endpoint. Returns `OK` otherwise.

*Statuses: ok, critical*

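The rule above can be illustrated with a minimal Python sketch. This is an illustration only and not the Agent's implementation: the names `fetch_metrics` and `health_status`, and the injectable `fetch` parameter, are assumptions introduced for the example.

```python
import urllib.request
from typing import Callable

OK, CRITICAL = "ok", "critical"


def fetch_metrics(url: str, timeout: float = 5.0) -> bytes:
    """Read the body of a metrics endpoint using the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def health_status(url: str, fetch: Callable[[str], bytes] = fetch_metrics) -> str:
    """Return CRITICAL if the endpoint cannot be accessed, OK otherwise."""
    try:
        fetch(url)
    except OSError:  # URLError, HTTPError, and socket timeouts subclass OSError
        return CRITICAL
    return OK
```

Calling `health_status` with the metrics URL configured for the check returns one of the two statuses listed above. The `fetch` parameter exists only so the logic can be exercised without a running cluster.
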
## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).