---
title: BentoML
description: >-
  BentoML is an open-source framework for ML model deployment. This integration
  collects BentoML service metrics.
breadcrumbs: Docs > Integrations > BentoML
---

# BentoML
**Integration version:** 1.4.0

## Overview{% #overview %}

This check monitors [BentoML](https://docs.bentoml.com/en/latest/index.html) through the Datadog Agent.

BentoML is an open-source platform for building, shipping, and running machine learning models in production. This integration enables you to track the health and performance of your BentoML model serving infrastructure directly from Datadog.

By using this integration, you gain visibility into key BentoML metrics such as request throughput, response latency, error rates, and resource utilization. Monitoring these metrics helps you ensure reliable model deployments, quickly detect issues, and optimize the performance of your ML services in production environments.

**Minimum Agent version:** 7.70.1

## Setup{% #setup %}

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates](https://docs.datadoghq.com/containers/kubernetes/integrations/) for guidance on applying these instructions.

### Installation{% #installation %}

Starting with Agent version `7.71.0`, the BentoML check is included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package. No additional installation is needed in your environment.

### Configuration{% #configuration %}

#### Metrics{% #metrics %}

The BentoML integration collects data from both the [health API endpoints](https://docs.bentoml.com/en/latest/build-with-bentoml/observability/monitoring-and-data-collection.html#monitoring) and the [Prometheus metrics endpoint](https://docs.bentoml.com/en/latest/build-with-bentoml/observability/metrics.html). By default, BentoML exposes these endpoints, so in most cases, no additional configuration is required on the BentoML side. For more information about these endpoints and how to enable or secure them, refer to the [BentoML observability documentation](https://docs.bentoml.com/en/latest/build-with-bentoml/observability/monitoring-and-data-collection.html#monitoring).

To configure the Datadog Agent to collect BentoML metrics:

1. Edit the `bentoml.d/conf.yaml` file, located in the `conf.d/` directory at the root of your Agent's configuration folder. This file controls how the Agent collects metrics from your BentoML deployment. For a full list of configuration options, see the [sample bentoml.d/conf.yaml](https://github.com/DataDog/integrations-core/blob/master/bentoml/datadog_checks/bentoml/data/conf.yaml.example). Below is a minimal example configuration:

```yaml
init_config:
instances:
  - openmetrics_endpoint: http://localhost:3000/metrics
    tags:
      - bentoml_service:foo # tag to scope metrics to this service
```
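
For containerized environments, the same instance configuration can be supplied through Autodiscovery pod annotations instead of a `conf.yaml` file. Below is a minimal sketch for Kubernetes; the container name (`bentoml`) and the port (`3000`) are assumptions, so adjust them to match your deployment:

```yaml
# Sketch of a pod annotation the Agent can discover the bentoml check from.
# %%host%% is an Autodiscovery template variable resolved to the container IP.
metadata:
  annotations:
    ad.datadoghq.com/bentoml.checks: |
      {
        "bentoml": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:3000/metrics",
              "tags": ["bentoml_service:foo"]
            }
          ]
        }
      }
```

The annotation key must reference the name of the container running BentoML in the pod spec.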

2. [Restart the Agent](https://docs.datadoghq.com/agent/configuration/agent-commands/#start-stop-and-restart-the-agent).

#### Logs{% #logs %}

BentoML logs can be collected by the Datadog Agent using several methods:

- **Agent log collection (recommended)**: Configure the Datadog Agent to tail BentoML log files. See the [BentoML documentation](https://docs.bentoml.com/en/latest/build-with-bentoml/observability/monitoring-and-data-collection.html#view-request-and-schema-logs) for more details.

**For host-based Agents:**

1. Enable log collection in your `datadog.yaml` file (disabled by default):

   ```yaml
   logs_enabled: true
   ```

1. Configure the Agent to tail BentoML logs by editing `bentoml.d/conf.yaml` (or the corresponding file in `conf.d/`):

   ```yaml
   logs:
     - type: file
       path: monitoring/text_summarization/data/*.log # example path; point this at your BentoML log output directory
       source: bentoml
       service: <SERVICE>
   ```

Replace `<SERVICE>` with a name that matches your service.

**For containerized environments**:

- Ensure the BentoML log files are mounted inside the Datadog Agent container so they can be accessed and tailed. See [Kubernetes log collection](https://docs.datadoghq.com/containers/kubernetes/log/) for more information.
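
If the Agent already collects container stdout/stderr, the `source` and `service` attributes can instead be set with a log Autodiscovery annotation. A sketch, assuming the BentoML container is named `bentoml`:

```yaml
# Sketch of a pod annotation that tags this container's logs.
# The container name ("bentoml") and <SERVICE> placeholder are assumptions.
metadata:
  annotations:
    ad.datadoghq.com/bentoml.logs: '[{"source": "bentoml", "service": "<SERVICE>"}]'
```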

**Other log shipping options**:

- **Fluent Bit**: Forward logs to Datadog using [Fluent Bit](https://docs.fluentbit.io/manual/data-pipeline/outputs/datadog).
- **OTLP**: Send logs to the Datadog Agent using the [OpenTelemetry Protocol (OTLP)](https://docs.datadoghq.com/opentelemetry/setup/otlp_ingest_in_the_agent/?tab=docker#enabling-otlp-ingestion-on-the-datadog-agent).

Choose the log collection method that best fits your environment and operational needs. Ensure the logs are tagged correctly with `source:bentoml`.

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/configuration/agent-commands/#agent-status-and-information) and look for `bentoml` under the Checks section.

## Data Collected{% #data-collected %}

### Metrics{% #metrics-1 %}

| Metric | Description |
| --- | --- |
| **bentoml.endpoint\_livez**(gauge) | A liveness probe endpoint that checks whether the Service is still alive or needs to restart. Reports 1 if the endpoint is healthy, 0 otherwise. |
| **bentoml.endpoint\_readyz**(gauge) | A readiness probe endpoint that indicates whether the Service is ready to accept traffic. Reports 1 if the endpoint is healthy, 0 otherwise. |
| **bentoml.service.adaptive\_batch\_size.bucket**(count) | The number of observations since the last data collection that fall within a specific `upper_bound` tag, from the adaptive batch size histogram. |
| **bentoml.service.adaptive\_batch\_size.count**(count) | The number of observations since the last data collection, from the adaptive batch size histogram. |
| **bentoml.service.adaptive\_batch\_size.sum**(count) | The total batch size across all observations since the last data collection, from the adaptive batch size histogram. |
| **bentoml.service.request.count**(count) | The number of new requests a Service has processed since the last submission. *Shown as request* |
| **bentoml.service.request.duration.bucket**(count) | The number of observations since the last data collection that fall within a specific `upper_bound` tag, from the request duration histogram. |
| **bentoml.service.request.duration.count**(count) | The number of requests processed since the last data collection, from the request duration histogram. |
| **bentoml.service.request.duration.sum**(count) | The total request processing time in seconds across all observations since the last data collection, from the request duration histogram. *Shown as second* |
| **bentoml.service.request.in\_progress**(gauge) | The number of requests currently being processed by a Service. *Shown as request* |
| **bentoml.service.time\_since\_last\_request**(gauge) | The time in seconds since a Service last processed a request. *Shown as second* |

### Events{% #events %}

The BentoML integration does not include any events.

### Service Checks{% #service-checks %}

See [service_checks.json](https://github.com/DataDog/integrations-core/blob/master/bentoml/assets/service_checks.json) for a list of service checks provided by this integration.

## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).
