---
title: Resilience4j
description: >-
  Resilience4j offers features like Circuit Breaker, Rate Limiter, Bulkhead, and
  Retry
---

# Resilience4j
Integration version: 1.1.1
## Overview{% #overview %}

[Resilience4j](https://github.com/resilience4j/resilience4j) is a lightweight fault tolerance library inspired by Netflix Hystrix, but designed for functional programming. This check monitors [Resilience4j](https://resilience4j.readme.io/docs/micrometer#prometheus) through the Datadog Agent.

## Setup{% #setup %}

### Installation{% #installation %}

To install the Resilience4j check on your host:

1. Install the [developer toolkit](https://docs.datadoghq.com/developers/integrations/python.md) on any machine.

1. Run `ddev release build resilience4j` to build the package.

1. [Download the Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest).

1. Upload the build artifact to any host with an Agent, and run `datadog-agent integration install -w path/to/resilience4j/dist/<ARTIFACT_NAME>.whl`.

### Configuration{% #configuration %}

1. Edit the `resilience4j/conf.yaml` file in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your Resilience4j performance data. See the [sample resilience4j/conf.yaml](https://github.com/DataDog/integrations-extras/blob/master/resilience4j/datadog_checks/resilience4j/data/conf.yaml.example) for all available configuration options.

1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands.md#start-stop-and-restart-the-agent).
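Before pointing the Agent at your application, it can help to confirm that the metrics endpoint actually exposes Resilience4j data. The following is a minimal sketch, not part of the integration itself. It assumes your application publishes Micrometer metrics in the Prometheus text format and that the sample names start with `resilience4j_`; the `ENDPOINT` URL and both function names are hypothetical and should be adapted to your setup.

```python
from urllib.request import urlopen

# Hypothetical endpoint (assumption): substitute the URL your application exposes.
ENDPOINT = "http://localhost:8080/actuator/prometheus"


def resilience4j_metric_names(exposition_text):
    """Return the sorted, de-duplicated sample names that start with 'resilience4j_'."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like '<name>{labels} <value>' or '<name> <value>'.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        # Assumption: Resilience4j samples share the 'resilience4j_' prefix.
        if name.startswith("resilience4j_"):
            names.add(name)
    return sorted(names)


def fetch_exposition(url=ENDPOINT, timeout=5):
    """Download the raw metrics text from the endpoint."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `resilience4j_metric_names(fetch_exposition())` against a running application lists the Resilience4j sample names it currently exposes. An empty list suggests the URL or the application's metrics setup needs attention before the Agent is configured.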

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands.md#agent-status-and-information) and look for `resilience4j` under the Checks section.

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric | Description |
| --- | --- |
| **resilience4j.bulkhead.available.concurrent.calls** (gauge) | The number of available permissions |
| **resilience4j.bulkhead.core.thread.pool.size** (gauge) | The core size of the bulkhead thread pool |
| **resilience4j.bulkhead.max.allowed.concurrent.calls** (gauge) | The maximum number of available permissions |
| **resilience4j.bulkhead.max.thread.pool.size** (gauge) | The maximum allowed size of the thread pool |
| **resilience4j.bulkhead.queue.capacity** (gauge) | The maximum allowed size of the queue |
| **resilience4j.bulkhead.queue.depth** (gauge) | The number of tasks in the queue |
| **resilience4j.bulkhead.thread.pool.size** (gauge) | The current size of the thread pool |
| **resilience4j.circuitbreaker.buffered.calls** (gauge) | The number of buffered failed calls stored in the ring buffer |
| **resilience4j.circuitbreaker.calls** (count) | Total number of calls. *Shown as unit* |
| **resilience4j.circuitbreaker.calls.seconds.bucket** (count) | Sum of number of successful calls. *Shown as second* |
| **resilience4j.circuitbreaker.calls.seconds.count** (count) | Count of number of successful calls. *Shown as second* |
| **resilience4j.circuitbreaker.calls.seconds.max** (gauge) | Max of number of successful calls. *Shown as second* |
| **resilience4j.circuitbreaker.calls.seconds.sum** (count) | Sum of number of successful calls. *Shown as second* |
| **resilience4j.circuitbreaker.failure.rate** (gauge) | The failure rate of the circuit breaker |
| **resilience4j.circuitbreaker.max.buffered.calls** (gauge) | The maximum number of buffered calls which can be stored in the ring buffer |
| **resilience4j.circuitbreaker.not.permitted.calls** (count) | Total number of not permitted calls. *Shown as unit* |
| **resilience4j.circuitbreaker.not.permitted.calls.count** (count) | Count of number of not permitted calls. *Shown as unit* |
| **resilience4j.circuitbreaker.slow.call.rate** (gauge) | The slow call rate of the circuit breaker |
| **resilience4j.circuitbreaker.state** (gauge) | The states of the circuit breaker |
| **resilience4j.ratelimiter.available.permissions** (gauge) | The number of available permissions |
| **resilience4j.ratelimiter.waiting.threads** (gauge) | The number of threads waiting for permission |
| **resilience4j.retry.calls** (gauge) | The number of successful calls without a retry attempt |
| **resilience4j.retry.calls.count** (count) | The number of successful calls without a retry attempt |
| **resilience4j.timelimiter.calls** (count) | Total number of calls which were successful. *Shown as unit* |

### Service Checks{% #service-checks %}

**resilience4j.openmetrics.health**

Returns `CRITICAL` if the Agent is unable to connect to the Resilience4j OpenMetrics endpoint, otherwise returns `OK`.

*Statuses: ok, critical*
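The two statuses can be approximated outside the Agent when debugging connectivity. The sketch below is an illustration only and is not the Agent's implementation: the function name `openmetrics_health` and its `opener` parameter are hypothetical, and it treats any connection or HTTP error as "unable to connect".

```python
from urllib.request import urlopen


def openmetrics_health(url, opener=urlopen, timeout=5):
    """Return 'OK' if the endpoint can be reached, 'CRITICAL' otherwise.

    `opener` defaults to urllib's urlopen and can be swapped for a stub.
    """
    try:
        with opener(url, timeout=timeout):
            return "OK"
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections
        # all derive from OSError.
        return "CRITICAL"
```

Running `openmetrics_health(...)` with your endpoint URL from the Agent host shows whether that host can reach the endpoint at all, which helps separate network problems from configuration problems.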

### Events{% #events %}

Resilience4j does not include any events.

## Troubleshooting{% #troubleshooting %}

Need help? Contact the [maintainer](https://github.com/DataDog/integrations-extras/blob/master/resilience4j/manifest.json) of this integration.
