---
title: Marathon
description: 'Track application metrics: required memory and disk, instance count, and more.'
---

# Marathon

**Integration version:** 5.4.0

## Overview{% #overview %}

The Agent's Marathon check lets you:

- Track the state and health of every application: see configured memory, disk, CPU, and instances, and monitor the number of healthy and unhealthy tasks
- Monitor the number of queued applications and the number of deployments

**Minimum Agent version:** 6.0.0

## Setup{% #setup %}

### Installation{% #installation %}

The Marathon check is included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package. No additional installation is needed on your server.

### Configuration{% #configuration %}

Follow the instructions below to configure this check for an Agent running on a host. For containerized environments, see the Containerized section.

{% tab title="Host" %}
#### Host{% #host %}

To configure this check for an Agent running on a host:

##### Metrics collection{% #metrics-collection %}

1. Edit the `marathon.d/conf.yaml` file, in the `conf.d/` folder at the root of your [Agent's configuration directory](https://docs.datadoghq.com/agent/guide/agent-configuration-files.md#agent-configuration-directory). See the [sample marathon.d/conf.yaml](https://github.com/DataDog/integrations-core/blob/master/marathon/datadog_checks/marathon/data/conf.yaml.example) for all available configuration options:

   ```yaml
   init_config:
   
   instances:
     # the API endpoint of your Marathon master; required
     - url: "https://<SERVER>:<PORT>"
       # if your Marathon master requires ACS auth
       #   acs_url: https://<SERVER>:<PORT>
   
       # the username for Marathon API or ACS token authentication
       username: "<USERNAME>"
   
       # the password for Marathon API or ACS token authentication
       password: "<PASSWORD>"
   ```

   The function of `username` and `password` depends on whether or not you configure `acs_url`. If you do, the Agent uses them to request an authentication token from ACS, which it then uses to authenticate to the Marathon API. Otherwise, the Agent uses `username` and `password` to directly authenticate to the Marathon API.

1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands.md#start-stop-and-restart-the-agent).
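
To illustrate how `acs_url` changes the role of the credentials, the following sketch places the two authentication modes side by side as two instances. It only reuses the options and placeholders from the sample above; whether you need one instance or several depends on your environment.

```yaml
init_config:

instances:
  # Mode 1: no acs_url, so the Agent sends the username and password
  # directly to the Marathon API.
  - url: "https://<SERVER>:<PORT>"
    username: "<USERNAME>"
    password: "<PASSWORD>"

  # Mode 2: acs_url is set, so the Agent first uses the username and
  # password to request a token from ACS, then uses that token to
  # authenticate to the Marathon API.
  - url: "https://<SERVER>:<PORT>"
    acs_url: "https://<SERVER>:<PORT>"
    username: "<USERNAME>"
    password: "<PASSWORD>"
```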

##### Log collection{% #log-collection %}

*Available for Agent versions >6.0*

1. Log collection is disabled by default in the Datadog Agent. Enable it in your `datadog.yaml` file:

   ```yaml
   logs_enabled: true
   ```

1. Because Marathon uses logback, you can specify a custom log format. Datadog supports two formats out of the box: Marathon's default format and the Datadog recommended format. Add a file appender to your configuration, as in the following example, and replace `$PATTERN$` with your selected format:

   - Marathon default: `[%date] %-5level %message \(%logger:%thread\)%n`
   - Datadog recommended: `%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n`

   ```xml
   <?xml version="1.0" encoding="UTF-8"?>

   <configuration>
       <shutdownHook class="ch.qos.logback.core.hook.DelayingShutdownHook"/>
       <appender name="stdout" class="ch.qos.logback.core.ConsoleAppender">
           <encoder>
               <pattern>[%date] %-5level %message \(%logger:%thread\)%n</pattern>
           </encoder>
       </appender>
       <appender name="async" class="ch.qos.logback.classic.AsyncAppender">
           <appender-ref ref="stdout" />
           <queueSize>1024</queueSize>
       </appender>
       <appender name="FILE" class="ch.qos.logback.core.FileAppender">
           <file>/var/log/marathon.log</file>
           <append>true</append>
           <!-- set immediateFlush to false for much higher logging throughput -->
           <immediateFlush>true</immediateFlush>
           <encoder>
               <pattern>$PATTERN$</pattern>
           </encoder>
       </appender>
       <root level="INFO">
           <appender-ref ref="async"/>
           <appender-ref ref="FILE"/>
       </root>
   </configuration>
   ```

1. Add this configuration block to your `marathon.d/conf.yaml` file to start collecting your Marathon logs:

   ```yaml
   logs:
     - type: file
       path: /var/log/marathon.log
       source: marathon
       service: "<SERVICE_NAME>"
   ```

1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands.md#start-stop-and-restart-the-agent).
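
As a rough sketch of how the pieces fit together, a `marathon.d/conf.yaml` that enables both metric and log collection could look like the following. It combines the two fragments shown earlier without adding new options; `logs` sits at the top level of the file, next to `instances`, and the placeholders still need your own values.

```yaml
init_config:

instances:
  # Metrics: the API endpoint of your Marathon master
  - url: "https://<SERVER>:<PORT>"
    username: "<USERNAME>"
    password: "<PASSWORD>"

# Logs: tail the file written by the logback FILE appender
logs:
  - type: file
    path: /var/log/marathon.log
    source: marathon
    service: "<SERVICE_NAME>"
```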

{% /tab %}

{% tab title="Containerized" %}
#### Containerized{% #containerized %}

For containerized environments, see the [Autodiscovery Integration Templates](https://docs.datadoghq.com/agent/kubernetes/integrations.md) for guidance on applying the parameters below.

##### Metric collection{% #metric-collection %}

| Parameter            | Value                                  |
| -------------------- | -------------------------------------- |
| `<INTEGRATION_NAME>` | `marathon`                             |
| `<INIT_CONFIG>`      | blank or `{}`                          |
| `<INSTANCE_CONFIG>`  | `{"url": "https://%%host%%:%%port%%"}` |
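
As an illustration, on Kubernetes these parameters can be supplied as pod annotations. The sketch below assumes annotation-based Autodiscovery and a container named `marathon`; the pod name, container name, and `<MARATHON_IMAGE>` are placeholders chosen for this example, not values required by the check.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: marathon
  annotations:
    # The segment after "ad.datadoghq.com/" must match the container name below.
    ad.datadoghq.com/marathon.check_names: '["marathon"]'
    ad.datadoghq.com/marathon.init_configs: '[{}]'
    ad.datadoghq.com/marathon.instances: '[{"url": "https://%%host%%:%%port%%"}]'
spec:
  containers:
    - name: marathon
      image: "<MARATHON_IMAGE>"
```

The Agent resolves the `%%host%%` and `%%port%%` template variables at runtime from the discovered container.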

##### Log collection{% #log-collection %}

*Available for Agent versions >6.0*

Collecting logs is disabled by default in the Datadog Agent. To enable it, see [Kubernetes Log Collection](https://docs.datadoghq.com/agent/kubernetes/log.md).

| Parameter      | Value                                                 |
| -------------- | ----------------------------------------------------- |
| `<LOG_CONFIG>` | `{"source": "marathon", "service": "<SERVICE_NAME>"}` |
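
For example, on Kubernetes the log configuration can be attached to the pod as an annotation. This sketch assumes that log collection is already enabled on the Agent and that the container is named `marathon`; the pod name, container name, and `<MARATHON_IMAGE>` are placeholders for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: marathon
  annotations:
    # The segment after "ad.datadoghq.com/" must match the container name below.
    ad.datadoghq.com/marathon.logs: '[{"source": "marathon", "service": "<SERVICE_NAME>"}]'
spec:
  containers:
    - name: marathon
      image: "<MARATHON_IMAGE>"
```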

{% /tab %}

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands.md#agent-status-and-information) and look for `marathon` under the Checks section.

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric                                          | Description |
| ----------------------------------------------- | ----------- |
| **marathon.apps** (gauge)                       | Number of applications |
| **marathon.backoffFactor** (gauge)              | Backoff time multiplication factor for each consecutive failed task launch; tagged by app_id and version |
| **marathon.backoffSeconds** (gauge)             | Task backoff period; tagged by app_id and version *Shown as second* |
| **marathon.cpus** (gauge)                       | Configured CPUs for each instance of a given application |
| **marathon.deployments** (gauge)                | Number of running or pending deployments |
| **marathon.disk** (gauge)                       | Configured disk for each instance of a given application *Shown as mebibyte* |
| **marathon.instances** (gauge)                  | Number of instances of a given application; tagged by app_id and version |
| **marathon.mem** (gauge)                        | Configured memory for each instance of a given application; tagged by app_id and version *Shown as mebibyte* |
| **marathon.queue.count** (gauge)                | Number of instances left to launch *Shown as task* |
| **marathon.queue.delay** (gauge)                | Wait before the next launch attempt *Shown as second* |
| **marathon.queue.offers.processed** (gauge)     | The number of processed offers for this launch attempt *Shown as task* |
| **marathon.queue.offers.reject.last** (gauge)   | Summary of unused offers for all last offers *Shown as task* |
| **marathon.queue.offers.reject.launch** (gauge) | Summary of unused offers for the launch attempt *Shown as task* |
| **marathon.queue.offers.unused** (gauge)        | The number of unused offers for this launch attempt *Shown as task* |
| **marathon.queue.size** (gauge)                 | Number of app offer queues *Shown as task* |
| **marathon.taskRateLimit** (gauge)              | The task rate limit for a given application; tagged by app_id and version |
| **marathon.tasksHealthy** (gauge)               | Number of healthy tasks for a given application; tagged by app_id and version *Shown as task* |
| **marathon.tasksRunning** (gauge)               | Number of tasks running for a given application; tagged by app_id and version *Shown as task* |
| **marathon.tasksStaged** (gauge)                | Number of tasks staged for a given application; tagged by app_id and version *Shown as task* |
| **marathon.tasksUnhealthy** (gauge)             | Number of unhealthy tasks for a given application; tagged by app_id and version *Shown as task* |

### Events{% #events %}

The Marathon check does not include any events.

### Service Checks{% #service-checks %}

**marathon.can\_connect**

Returns `CRITICAL` if the Agent cannot connect to the API endpoint or if no instances of any application are running. Returns `WARN` if no applications are detected. The check message includes additional information about the response status at the time of collection.

*Statuses: ok, critical*

## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).
