---
title: Vespa
description: Health and performance monitoring for the big data serving engine Vespa
---

# Vespa

Integration version: 1.1.0

## Overview{% #overview %}

Gather metrics from your [Vespa](https://vespa.ai/) system in real time to:

- Visualize and monitor Vespa state and performance
- Alert on health and availability

## Setup{% #setup %}

The Vespa check is not included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package, so you need to install it separately.

### Installation{% #installation %}

For Agent v7.21+ / v6.21+, follow the instructions below to install the Vespa check on your host. See [Use Community Integrations](https://docs.datadoghq.com/agent/guide/use-community-integrations.md) to install with the Docker Agent or earlier versions of the Agent.

1. Run the following command to install the Agent integration:

   ```shell
   datadog-agent integration install -t datadog-vespa==<INTEGRATION_VERSION>
   ```

1. Configure your integration the same way as core [integrations](https://docs.datadoghq.com/getting_started/integrations.md).

### Configuration{% #configuration %}

To configure the Vespa check:

1. Create a `vespa.d/` folder in the `conf.d/` folder at the root of your [Agent's configuration directory](https://docs.datadoghq.com/agent/guide/agent-configuration-files.md#agent-configuration-directory).
1. Create a `conf.yaml` file in the `vespa.d/` folder you just created.
1. See the [sample vespa.d/conf.yaml](https://github.com/DataDog/integrations-extras/blob/master/vespa/datadog_checks/vespa/data/conf.yaml.example) file and copy its content into the `conf.yaml` file.
1. Edit the `conf.yaml` file to configure the `consumer`, which determines the set of metrics forwarded by the check:
   - `consumer`: The consumer to collect metrics for, either `default` or a [custom consumer](https://docs.vespa.ai/documentation/reference/services-admin.html#metrics) from your Vespa application's `services.xml`.
1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands.md#start-stop-and-restart-the-agent).
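
As a minimal sketch of steps 1 to 4, the following snippet creates the folder and writes a `conf.yaml` that uses the `default` consumer. The `CONF_DIR` variable and its `./conf.d` fallback are assumptions made for illustration; point it at your Agent's real `conf.d/` directory. The `init_config`/`instances` layout follows the usual Agent check convention, so treat the linked sample file as the authoritative reference.

```shell
# Assumption: CONF_DIR is the Agent's conf.d directory. It falls back to ./conf.d
# so the sketch can be tried without touching a real Agent installation.
CONF_DIR="${CONF_DIR:-./conf.d}"

# Step 1: create the vespa.d/ folder.
mkdir -p "$CONF_DIR/vespa.d"

# Steps 2-4: write conf.yaml with the consumer to collect metrics for.
cat > "$CONF_DIR/vespa.d/conf.yaml" <<'EOF'
init_config:

instances:
  - consumer: default
EOF
```

To use a custom consumer, replace `default` with the consumer ID defined in your application's services.xml, then restart the Agent.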

### Validation{% #validation %}

Run the [Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands.md#agent-status-and-information) and look for `vespa` under the Checks section.
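
On a busy host the status output can be long. The helper below is one possible way to narrow it to the relevant lines. `vespa_check_status` is a hypothetical function name, not part of the Agent; it only filters text piped into it, and running `datadog-agent status` may require elevated privileges depending on your platform.

```shell
# Hypothetical helper: print lines mentioning "vespa" (case-insensitive) plus
# five lines of trailing context from Agent status text read on stdin.
# Returns a non-zero exit code if nothing matches.
vespa_check_status() {
  grep -i -A 5 'vespa'
}

# Usage (on a host with the Agent installed):
#   sudo datadog-agent status | vespa_check_status
```

A non-zero exit code means no `vespa` entry was found, which usually points back to the installation or configuration steps.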

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric | Description |
| --- | --- |
| **vespa.http.status.1xx.rate** (gauge) | Number of responses with a 1xx status. *Shown as response* |
| **vespa.http.status.2xx.rate** (gauge) | Number of responses with a 2xx status. *Shown as response* |
| **vespa.http.status.3xx.rate** (gauge) | Number of responses with a 3xx status. *Shown as response* |
| **vespa.http.status.4xx.rate** (gauge) | Number of responses with a 4xx status. *Shown as response* |
| **vespa.http.status.5xx.rate** (gauge) | Number of responses with a 5xx status. *Shown as response* |
| **vespa.jdisc.gc.ms.average** (gauge) | Time spent in GC. *Shown as millisecond* |
| **vespa.mem.heap.free.average** (gauge) | Free heap size. *Shown as byte* |
| **vespa.queries.rate** (gauge) | Number of search queries. *Shown as query* |
| **vespa.feed.operations.rate** (gauge) | Number of feed operations. *Shown as operation* |
| **vespa.query\_latency.average** (gauge) | Total query processing time. *Shown as millisecond* |
| **vespa.query\_latency.95percentile** (gauge) | 95 percentile total query processing time. *Shown as millisecond* |
| **vespa.query\_latency.99percentile** (gauge) | 99 percentile total query processing time. *Shown as millisecond* |
| **vespa.hits\_per\_query.average** (gauge) | Hits in the returned result, per query. *Shown as hit* |
| **vespa.totalhits\_per\_query.average** (gauge) | Estimated total number of hits per query. *Shown as hit* |
| **vespa.degraded\_queries.rate** (gauge) | Queries with degraded results due to timeout. *Shown as query* |
| **vespa.failed\_queries.rate** (gauge) | Failed queries. *Shown as query* |
| **vespa.serverActiveThreads.average** (gauge) | Threads that are active processing requests. *Shown as thread* |
| **vespa.content.proton.search\_protocol.docsum.requested\_documents.rate** (gauge) | Requested document summaries. *Shown as document* |
| **vespa.content.proton.search\_protocol.docsum.latency.average** (gauge) | Docsum request latency on content node. *Shown as second* |
| **vespa.content.proton.search\_protocol.query.latency.average** (gauge) | Query request latency on content node. *Shown as second* |
| **vespa.content.proton.documentdb.documents.total.last** (gauge) | Total documents in this document db (ready + not-ready). *Shown as document* |
| **vespa.content.proton.documentdb.documents.ready.last** (gauge) | Ready documents in this document db. *Shown as document* |
| **vespa.content.proton.documentdb.documents.active.last** (gauge) | Active/searchable documents in this document db. *Shown as document* |
| **vespa.content.proton.documentdb.disk\_usage.last** (gauge) | Total disk usage for this document db. *Shown as byte* |
| **vespa.content.proton.documentdb.memory\_usage.allocated\_bytes.last** (gauge) | Total memory usage for this document db. *Shown as byte* |
| **vespa.content.proton.resource\_usage.disk.average** (gauge) | Relative amount of disk space used by this process. *Shown as fraction* |
| **vespa.content.proton.resource\_usage.memory.average** (gauge) | Relative amount of memory used by this process. *Shown as fraction* |
| **vespa.content.proton.resource\_usage.feeding\_blocked.last** (gauge) | Whether feeding is blocked due to resource limitations (value is 0 or 1). |
| **vespa.content.proton.documentdb.matching.docs\_matched.rate** (gauge) | Number of documents matched. *Shown as document* |
| **vespa.content.proton.documentdb.matching.docs\_reranked.rate** (gauge) | Number of documents re-ranked (second phase). *Shown as document* |
| **vespa.content.proton.documentdb.matching.rank\_profile.query\_latency.average** (gauge) | Total latency when matching and ranking a query. *Shown as second* |
| **vespa.content.proton.documentdb.matching.rank\_profile.query\_setup\_time.average** (gauge) | Average time spent setting up and tearing down queries. *Shown as second* |
| **vespa.content.proton.documentdb.matching.rank\_profile.rerank\_time.average** (gauge) | Time spent on 2nd phase ranking. *Shown as second* |
| **vespa.content.proton.transactionlog.disk\_usage.last** (gauge) | Disk usage of the transaction log. *Shown as byte* |

### Events{% #events %}

The Vespa integration does not include any events.

### Service Checks{% #service-checks %}

**vespa.metrics\_health**

Returns `CRITICAL` if there is no response from the Vespa Node metrics API. Returns `WARNING` if the API responds but an error occurs while processing the response. Otherwise, returns `OK`.

*Statuses: ok, warning, critical*
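
When this check reports `CRITICAL`, it can help to query the node metrics API by hand from the same host. The sketch below assumes the API is served on `localhost` port `19092` at `/metrics/v1/values`, which is commonly the default for Vespa's metrics proxy. The URL and the `VESPA_METRICS_URL` and `CONSUMER` variable names are assumptions, so adjust them to your deployment.

```shell
# Assumptions: host, port, and path of the node metrics API.
# CONSUMER should match the consumer set in vespa.d/conf.yaml.
CONSUMER="${CONSUMER:-default}"
VESPA_METRICS_URL="${VESPA_METRICS_URL:-http://localhost:19092/metrics/v1/values}"

echo "Querying ${VESPA_METRICS_URL}?consumer=${CONSUMER}"

# -f: fail on HTTP errors, -sS: quiet but still print errors,
# --max-time: give up after 5 seconds instead of hanging.
if curl -fsS --max-time 5 "${VESPA_METRICS_URL}?consumer=${CONSUMER}" >/dev/null; then
  echo "metrics API responded"
else
  echo "no usable response from metrics API"
fi
```

If the request succeeds here but the service check still fails, the problem is more likely in the Agent configuration than in Vespa itself.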

**vespa.process\_health**

For each Vespa process, returns `CRITICAL` if the process seems to be down. Returns `WARNING` if the process status is unknown. Otherwise, returns `OK`.

*Statuses: ok, warning, critical*

## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).
