---
title: Control-M
description: Monitor Control-M servers and jobs
---

# Control-M
Integration version: 1.0.0
{% callout %}
# Important note for users on the following Datadog sites: us2.ddog-gov.com

{% alert level="info" %}
To find out if this integration is available in your organization, see your [Datadog Integrations](https://app.datadoghq.com/integrations) page or ask your organization administrator.

To initiate an exception request to enable this integration for your organization, email [support@ddog-gov.com](mailto:support@ddog-gov.com).
{% /alert %}

{% /callout %}
*Screenshots: Control-M overview, track all jobs, track job completions, examine job failures.*

## Overview{% #overview %}

This check monitors Control-M through the Datadog Agent.

Control-M is a workload automation platform that orchestrates batch jobs, file transfers, and application workflows across on-premises and cloud environments. This integration connects to the Control-M Automation API to collect server health, job execution metrics, and completion events, giving you visibility into your scheduling infrastructure from within Datadog.

The integration provides:

- **Server health monitoring**: Track which Control-M servers are up or disconnected.
- **Job rollup metrics**: Total, active, waiting, and per-status breakdowns across all servers.
- **Per-job completion tracking**: Run counts and durations for terminal jobs (ended OK, ended not OK, cancelled), with deduplication across check cycles.
- **Events**: Optional Datadog events for job failures, cancellations, slow runs, and (opt-in) successes.

## Setup{% #setup %}

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates](https://docs.datadoghq.com/containers/kubernetes/integrations.md) for guidance on applying these instructions.

### Installation{% #installation %}

The Control-M check is included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package. No additional installation is needed on your server.

### Configuration{% #configuration %}

Edit the `control_m.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent's configuration directory.

#### Minimum configuration (static token){% #minimum-configuration-static-token %}

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
```

#### Session-login authentication{% #session-login-authentication %}

If your environment uses username and password authentication instead of a static token:

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    control_m_username: <USERNAME>
    control_m_password: <PASSWORD>
```

When both `headers` (with an `Authorization` key) and credentials are configured, the check tries the static token first. If the API responds with a 401, it falls back to session login automatically.
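
The fallback order described above can be sketched as follows. This is a minimal illustration, not the check's actual code: `request_with_fallback`, `send`, and `session_login` are hypothetical names, and the `Bearer` header used for the session token is an assumption.

```python
from typing import Callable, Dict, Optional

Headers = Dict[str, str]


def request_with_fallback(
    send: Callable[[Headers], int],
    static_headers: Optional[Headers] = None,
    session_login: Optional[Callable[[], str]] = None,
) -> int:
    """Try the static token first and fall back to session login on HTTP 401."""
    has_static_token = bool(static_headers) and "Authorization" in static_headers
    if has_static_token:
        status = send(static_headers)
        # Any status other than 401 is final. A 401 is also final when no
        # credentials are configured to fall back on.
        if status != 401 or session_login is None:
            return status
    if session_login is None:
        raise ValueError("configure a static token, credentials, or both")
    # Hypothetical helper: exchanges the username and password for a session token.
    token = session_login()
    # Assumption: the session token is sent as a Bearer token.
    return send({"Authorization": f"Bearer {token}"})
```

With only credentials configured, the sketch skips the static attempt and logs in directly.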

#### Histogram: `control_m.job.run.duration_ms`{% #histogram-control_mjobrunduration_ms %}

The `control_m.job.run.duration_ms` metric is submitted as a [histogram](https://docs.datadoghq.com/metrics/types.md?tab=histogram). The Datadog Agent expands it into multiple aggregated metrics based on the [`histogram_aggregates`](https://github.com/DataDog/datadog-agent/blob/3697d3b93bde62e2c3bf039e170ce69e49ef5294/pkg/config/config_template.yaml#L225-L242) and [`histogram_percentiles`](https://github.com/DataDog/datadog-agent/blob/3697d3b93bde62e2c3bf039e170ce69e49ef5294/pkg/config/config_template.yaml#L225-L242) settings in the main `datadog.yaml` file:

| Generated metric                             | Type  | Default |
| -------------------------------------------- | ----- | ------- |
| `control_m.job.run.duration_ms.avg`          | gauge | Enabled |
| `control_m.job.run.duration_ms.count`        | rate  | Enabled |
| `control_m.job.run.duration_ms.max`          | gauge | Enabled |
| `control_m.job.run.duration_ms.median`       | gauge | Enabled |
| `control_m.job.run.duration_ms.95percentile` | gauge | Enabled |

To customize which aggregations are produced, edit the `histogram_aggregates` and `histogram_percentiles` options in your `datadog.yaml` file:

```yaml
histogram_aggregates:
  - max
  - median
  - avg
  - count

histogram_percentiles:
  - "0.95"
```

These settings are Agent-level and apply to all histograms from all integrations.
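
To make the aggregates concrete, the sketch below computes the five default values from the job durations collected in one flush interval. It is an approximation for illustration only: `summarize` is a hypothetical helper, the percentile uses the nearest-rank method, and the Agent's exact percentile and rate calculations may differ.

```python
import math
import statistics


def summarize(durations_ms: list) -> dict:
    """Approximate the default histogram aggregates for one flush interval."""
    if not durations_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(durations_ms)
    # Nearest-rank 95th percentile: the smallest value with at least
    # 95% of the samples at or below it.
    rank = math.ceil(95 * len(ordered) / 100)
    return {
        "avg": sum(ordered) / len(ordered),
        # Raw sample count; the table above lists `.count` as a rate.
        "count": len(ordered),
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "95percentile": ordered[rank - 1],
    }
```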

#### Events{% #events %}

When `emit_job_events` is enabled, the check emits Datadog events for terminal job completions:

| Event type                 | Alert type | Trigger                                               |
| -------------------------- | ---------- | ----------------------------------------------------- |
| `control_m.job.completion` | `error`    | Job ended not OK.                                     |
| `control_m.job.completion` | `warning`  | Job cancelled.                                        |
| `control_m.job.completion` | `success`  | Job ended OK (only when `emit_success_events: true`). |
| `control_m.job.slow_run`   | `warning`  | Job duration exceeds `slow_run_threshold_ms`.         |
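
The table can be read as a small decision function. The sketch below assumes `emit_job_events` is already enabled. The status strings (`ended_ok`, `ended_not_ok`, `cancelled`) and the function name are hypothetical, and it assumes a slow-run event can accompany any completion event.

```python
from typing import Optional


def completion_events(
    status: str,
    duration_ms: int,
    emit_success_events: bool = False,
    slow_run_threshold_ms: Optional[int] = None,
) -> list:
    """Return (event_type, alert_type) pairs for one terminal job run."""
    events = []
    if status == "ended_not_ok":
        events.append(("control_m.job.completion", "error"))
    elif status == "cancelled":
        events.append(("control_m.job.completion", "warning"))
    elif status == "ended_ok" and emit_success_events:
        events.append(("control_m.job.completion", "success"))
    # No threshold configured (the default) means no slow-run events.
    if slow_run_threshold_ms is not None and duration_ms > slow_run_threshold_ms:
        events.append(("control_m.job.slow_run", "warning"))
    return events
```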

Events include high-cardinality details in the body: job ID, run number, folder, type, start time, and duration.

Events are deduplicated: a given job and run combination fires an event only on the first check cycle in which it appears.
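
A minimal sketch of this deduplication rule, assuming each job entry carries a job ID and a run number. The field names `job_id` and `run_number` and the helper `new_completions` are hypothetical, not the check's internals.

```python
def new_completions(jobs: list, seen: set) -> list:
    """Return only the job runs not reported in an earlier check cycle."""
    fresh = []
    for job in jobs:
        # A run is identified by the job and run number combination.
        key = (job["job_id"], job["run_number"])
        if key not in seen:
            seen.add(key)
            fresh.append(job)
    return fresh
```

Passing the same `seen` set to every cycle means a rerun of the same job (a new run number) fires again, while a run that merely stays visible in the status feed does not.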

#### Optional settings{% #optional-settings %}

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>

    # Events
    emit_job_events: true            # Emit Datadog events for job completions (default: false)
    emit_success_events: false       # Include success events, not just failures/cancellations (default: false)
    slow_run_threshold_ms: 3600000   # Flag jobs slower than this as slow_run events (default: none)

    # Job filtering
    job_status_limit: 10000          # Max jobs per API call (default: 10000, server max)
    job_name_filter: '*'             # Wildcard filter for job names (default: *)

    # Session token tuning
    token_lifetime_seconds: 1800     # Assumed token lifetime (default: 1800)
    token_refresh_buffer_seconds: 300  # Refresh this many seconds before expiry (default: 300)

    # Deduplication TTLs
    finalized_ttl_seconds: 86400     # How long to remember completed jobs (default: 24h)
    active_ttl_seconds: 21600        # How long to track active jobs (default: 6h)
```
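
The two session token settings combine into a single refresh point. The sketch below shows the arithmetic under the assumption that a token is refreshed once its age reaches `token_lifetime_seconds - token_refresh_buffer_seconds`. The function name and the timestamp parameters are hypothetical.

```python
def needs_refresh(
    issued_at: float,
    now: float,
    token_lifetime_seconds: int = 1800,
    token_refresh_buffer_seconds: int = 300,
) -> bool:
    """Return True once the token is within the refresh buffer of its assumed expiry."""
    refresh_at = issued_at + token_lifetime_seconds - token_refresh_buffer_seconds
    return now >= refresh_at
```

With the defaults, a token issued at time 0 is refreshed from second 1500 onward, 5 minutes before its assumed 30-minute lifetime ends.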

See the [sample control_m.d/conf.yaml](https://github.com/DataDog/integrations-core/blob/master/control_m/datadog_checks/control_m/data/conf.yaml.example) for all available configuration options.

[Restart the Agent](https://docs.datadoghq.com/agent/configuration/agent-commands.md#start-stop-and-restart-the-agent) after making changes.

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/configuration/agent-commands.md#agent-status-and-information) and look for `control_m` under the Checks section.

```
$ datadog-agent status
  ...
  control_m (1.0.0)
  -----------------
    Instance ID: control_m:abc1234 [OK]
    Configuration Source: file:/etc/datadog-agent/conf.d/control_m.d/conf.yaml
    Total Runs: 42
    Metric Samples: Last Run: 15, Total: 630
    Events: Last Run: 0, Total: 3
    Service Checks: Last Run: 1, Total: 42
    Average Execution Time: 245ms
```

### Troubleshooting{% #troubleshooting %}

#### The `can_connect` metric reports 0{% #the-can_connect-metric-reports-0 %}

1. Verify that `control_m_api_endpoint` is reachable from the Agent host. The following command prints the HTTP status code of the response: `curl -s -o /dev/null -w '%{http_code}' https://your-controlm-host:8443/automation-api/config/servers -H 'Authorization: Bearer <TOKEN>'`
1. Check that the API token or credentials are valid.
1. If TLS verification is failing, set `tls_verify: false` temporarily to confirm, then fix the certificate chain and re-enable verification.

#### Metrics show fewer jobs than expected{% #metrics-show-fewer-jobs-than-expected %}

The API has a server-enforced maximum of 10,000 jobs per request. If `control_m.jobs.total` exceeds `control_m.jobs.returned`, the response was truncated and some jobs are missing from the metrics. Consider using `job_name_filter` to narrow the scope.
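
The comparison behind these two metrics can be sketched as follows. The response shape (a `total` field and a `statuses` list) is an assumption for illustration, and `truncation_report` is a hypothetical helper.

```python
def truncation_report(response: dict) -> dict:
    """Compare the reported total with the number of job entries actually returned."""
    total = response.get("total", 0)
    # Assumption: job entries are listed under a "statuses" key.
    returned = len(response.get("statuses", []))
    return {
        "total": total,
        "returned": returned,
        "missing": max(total - returned, 0),
        "truncated": total > returned,
    }
```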

#### Events are not appearing{% #events-are-not-appearing %}

Verify `emit_job_events: true` is set in the instance configuration. Success events require both `emit_job_events: true` and `emit_success_events: true`.

Events respect deduplication: a job reported in a previous check cycle does not fire again.

#### Duplicate metrics after Agent restart{% #duplicate-metrics-after-agent-restart %}

The check persists its deduplication state to the Agent's cache. If the cache was cleared (for example, after a clean reinstall), previously reported terminal jobs may be re-emitted once. Separately, if completed jobs remain visible in the Control-M status feed for longer than 24 hours (the default `finalized_ttl_seconds`), increase that setting so those jobs are not reported again after their deduplication entries expire.
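
The effect of the TTL can be sketched as follows, assuming the state maps each job run to the time it was first reported. The state layout and the `prune_finalized` helper are hypothetical, not the check's persisted format.

```python
def prune_finalized(seen: dict, now: float, finalized_ttl_seconds: int = 86400) -> dict:
    """Drop deduplication entries older than the TTL."""
    return {
        key: first_seen
        for key, first_seen in seen.items()
        if now - first_seen < finalized_ttl_seconds
    }
```

Once an entry is pruned, a job run that is still listed in the status feed looks new again. That is why a feed retaining completed jobs for longer than the TTL can produce repeats.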

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric (type) | Description |
| --- | --- |
| **control\_m.can\_connect** (gauge) | Control-M API connectivity status (1 when API is reachable, 0 otherwise). |
| **control\_m.can\_login** (gauge) | Control-M session login status (1 when authentication succeeds, 0 otherwise). Only emitted in session-login mode. |
| **control\_m.job.overrun\_ms** (gauge) | How far past its estimated end time an actively executing job is running. Only emitted for jobs with an `estimatedEndTime` that have exceeded it. *Shown as millisecond* |
| **control\_m.job.run.count** (count) | Count of terminal job runs observed in the status feed. *Shown as job* |
| **control\_m.job.run.duration\_ms** (gauge) | Submitted as a histogram. The Agent expands this into aggregated metrics (avg, count, max, median, 95percentile) controlled by `histogram_aggregates` and `histogram_percentiles` in `datadog.yaml`. *Shown as millisecond* |
| **control\_m.job.run.overrun\_ms** (gauge) | Submitted as a histogram at job completion. How far past its estimated end time the job ran. Only emitted for terminal jobs that exceeded their `estimatedEndTime`. *Shown as millisecond* |
| **control\_m.jobs.active** (gauge) | Current number of active (non-terminal) jobs. *Shown as job* |
| **control\_m.jobs.by\_status** (gauge) | Current number of jobs per normalized Control-M status. *Shown as job* |
| **control\_m.jobs.returned** (gauge) | Number of job entries returned in the current status response. *Shown as job* |
| **control\_m.jobs.total** (gauge) | Total number of jobs reported by the Control-M API (from the response `total` field). *Shown as job* |
| **control\_m.jobs.waiting.by\_server** (gauge) | Number of waiting jobs per Control-M server. *Shown as job* |
| **control\_m.jobs.waiting.total** (gauge) | Total number of waiting jobs across all servers. *Shown as job* |
| **control\_m.server.up** (gauge) | Whether the Control-M server is up (1) or down (0). |

## Uninstallation{% #uninstallation %}

To uninstall the Control-M integration, remove the `control_m.d/conf.yaml` file from your Agent's `conf.d/` directory and restart the Agent.

## Support{% #support %}

Need help? Contact [Datadog Support](https://app.datadoghq.com/help).
