Control-M

Supported OS: Linux, Windows, Mac OS

Integration version: 1.0.0

To find out if this integration is available in your organization, see your Datadog Integrations page or ask your organization administrator.

To initiate an exception request to enable this integration for your organization, email support@ddog-gov.com.

Overview

This check monitors Control-M through the Datadog Agent.

Control-M is a workload automation platform that orchestrates batch jobs, file transfers, and application workflows across on-premises and cloud environments. This integration connects to the Control-M Automation API to collect server health, job execution metrics, and completion events, giving you visibility into your scheduling infrastructure from within Datadog.

The integration provides:

  • Server health monitoring: Track which Control-M servers are up or disconnected.
  • Job rollup metrics: Total, active, waiting, and per-status breakdowns across all servers.
  • Per-job completion tracking: Run counts and durations for terminal jobs (ended OK, ended not OK, cancelled), with deduplication across check cycles.
  • Events: Optional Datadog events for job failures, cancellations, slow runs, and (opt-in) successes.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Control-M check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Edit the control_m.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory.

Minimum configuration (static token)

instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>

Session-login authentication

If your environment uses username and password authentication instead of a static token:

instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    control_m_username: <USERNAME>
    control_m_password: <PASSWORD>

When both headers (with an Authorization key) and credentials are configured, the check tries the static token first. If the API responds with a 401, it falls back to session login automatically.
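As an illustration of the combined setup described above, a single instance can carry both a static token and session-login credentials. This is a minimal sketch that reuses only the option names already shown on this page; all values are placeholders.

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    # Static token: tried first.
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
    # Session-login credentials: used only if the API answers the token with a 401.
    control_m_username: <USERNAME>
    control_m_password: <PASSWORD>
```

With this layout, an expired or revoked token does not by itself stop collection, provided the username and password are still valid.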

Histogram: control_m.job.run.duration_ms

The control_m.job.run.duration_ms metric is submitted as a histogram. The Datadog Agent expands it into several aggregated metrics, determined by the histogram_aggregates and histogram_percentiles settings in the main datadog.yaml file:

| Generated metric | Type | Default |
| --- | --- | --- |
| control_m.job.run.duration_ms.avg | gauge | Enabled |
| control_m.job.run.duration_ms.count | rate | Enabled |
| control_m.job.run.duration_ms.max | gauge | Enabled |
| control_m.job.run.duration_ms.median | gauge | Enabled |
| control_m.job.run.duration_ms.95percentile | gauge | Enabled |

To customize which aggregations are produced, edit the histogram_aggregates and histogram_percentiles options in your datadog.yaml file:

histogram_aggregates:
  - max
  - median
  - avg
  - count

histogram_percentiles:
  - "0.95"

These settings are Agent-level and apply to all histograms from all integrations.
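For example, to also collect a minimum and a 99th percentile, the same two options could be extended as sketched below. The choice of min and 0.99 is purely illustrative, not a recommendation.

```yaml
# datadog.yaml (Agent-level; affects every histogram metric on this Agent)
histogram_aggregates:
  - max
  - median
  - avg
  - count
  - min        # illustrative addition

histogram_percentiles:
  - "0.95"
  - "0.99"     # illustrative addition
```

With this configuration the Agent would additionally produce control_m.job.run.duration_ms.min and control_m.job.run.duration_ms.99percentile, along with the same extra aggregates for histograms from other integrations, which can increase the number of custom metrics reported.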

Events

When emit_job_events is enabled, the check emits Datadog events for terminal job completions:

| Event type | Alert type | Trigger |
| --- | --- | --- |
| control_m.job.completion | error | Job ended not OK. |
| control_m.job.completion | warning | Job cancelled. |
| control_m.job.completion | success | Job ended OK (only when emit_success_events: true). |
| control_m.job.slow_run | warning | Job duration exceeds slow_run_threshold_ms. |

Events include high-cardinality details in the body: job ID, run number, folder, type, start time, and duration.

Events are deduplicated: a given job and run combination fires an event only on the first check cycle in which it appears.

Optional settings

instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>

    # Events
    emit_job_events: true            # Emit Datadog events for job completions (default: false)
    emit_success_events: false       # Include success events, not just failures/cancellations (default: false)
    slow_run_threshold_ms: 3600000   # Flag jobs slower than this as slow_run events (default: none)

    # Job filtering
    job_status_limit: 10000          # Max jobs per API call (default: 10000, server max)
    job_name_filter: '*'             # Wildcard filter for job names (default: *)

    # Session token tuning
    token_lifetime_seconds: 1800     # Assumed token lifetime (default: 1800)
    token_refresh_buffer_seconds: 300  # Refresh this many seconds before expiry (default: 300)

    # Deduplication TTLs
    finalized_ttl_seconds: 86400     # How long to remember completed jobs (default: 24h)
    active_ttl_seconds: 21600        # How long to track active jobs (default: 6h)

See the sample control_m.d/conf.yaml for all available configuration options.

Restart the Agent after making changes.

Validation

Run the Agent’s status subcommand and look for control_m under the Checks section.

$ datadog-agent status
  ...
  control_m (1.0.0)
  -----------------
    Instance ID: control_m:abc1234 [OK]
    Configuration Source: file:/etc/datadog-agent/conf.d/control_m.d/conf.yaml
    Total Runs: 42
    Metric Samples: Last Run: 15, Total: 630
    Events: Last Run: 0, Total: 3
    Service Checks: Last Run: 1, Total: 42
    Average Execution Time: 245ms

Troubleshooting

The can_connect metric reports 0

  1. Verify that control_m_api_endpoint is reachable from the Agent host. The following command prints the HTTP status code of the response: curl -s -o /dev/null -w '%{http_code}' https://your-controlm-host:8443/automation-api/config/servers -H 'Authorization: Bearer <TOKEN>'
  2. Check that the API token or credentials are valid.
  3. If TLS verification is failing, set tls_verify: false temporarily to confirm, then fix the certificate chain.
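The temporary TLS check in step 3 could look like the sketch below. The commented-out tls_ca_cert line is an assumption: it is a common option in Agent HTTP-based checks, but this page does not confirm that the Control-M check supports it, so verify it against the sample control_m.d/conf.yaml before relying on it. The certificate path is hypothetical.

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
    # Diagnostic only: disables certificate validation. Remove once confirmed.
    tls_verify: false
    # Possible permanent fix (assumed option, hypothetical path):
    # tls_ca_cert: /etc/datadog-agent/certs/controlm-ca.pem
```

If control_m.can_connect returns to 1 with verification disabled, the certificate chain is the likely cause; restore verification after correcting it.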

Metrics show fewer jobs than expected

The API has a server-enforced maximum of 10,000 jobs per request. If control_m.jobs.total exceeds control_m.jobs.returned, the response was truncated and some jobs are missing from the metrics. Consider using job_name_filter to narrow the scope.
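A narrowed filter might look like the following sketch. The PROD_* pattern is a hypothetical naming convention chosen for illustration; substitute a wildcard that matches how jobs are named in your environment.

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
    # Hypothetical pattern: only jobs whose names start with PROD_.
    job_name_filter: 'PROD_*'
    # Already the default and the server maximum; shown for clarity.
    job_status_limit: 10000
```

After restarting the Agent, compare control_m.jobs.total with control_m.jobs.returned again; when the two are equal, the filtered result fits within a single response.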

Events are not appearing

Verify emit_job_events: true is set in the instance configuration. Success events require both emit_job_events: true and emit_success_events: true.
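For reference, a sketch of an instance with both failure and success events enabled, using only the options documented above:

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
    # Required for any job event (failures, cancellations, slow runs).
    emit_job_events: true
    # Has an effect only when emit_job_events is also true.
    emit_success_events: true
```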

Events respect deduplication: a job reported in a previous check cycle does not fire again.

Duplicate metrics after Agent restart

The check persists deduplication state to the Agent’s cache. If the cache is cleared (for example, after a clean reinstall), previously reported terminal jobs may be re-emitted once. Separately, if completed jobs remain visible in the Control-M status feed for longer than 24 hours (the default finalized_ttl_seconds), increase finalized_ttl_seconds so that those jobs are still remembered and are not reported again.
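Raising the TTL could be done as in this sketch. The value 172800 (48 hours) is an arbitrary illustration; choose a value longer than the time completed jobs stay in your status feed.

```yaml
instances:
  - control_m_api_endpoint: https://your-controlm-host:8443/automation-api
    headers:
      Authorization: Bearer <YOUR_API_TOKEN>
    # Remember completed jobs for 48 hours instead of the default 24 hours (86400).
    finalized_ttl_seconds: 172800
```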

Data Collected

Metrics

control_m.can_connect (gauge)
Control-M API connectivity status (1 when API is reachable, 0 otherwise).

control_m.can_login (gauge)
Control-M session login status (1 when authentication succeeds, 0 otherwise). Only emitted in session-login mode.

control_m.job.overrun_ms (gauge)
How far past its estimated end time an actively executing job is running. Only emitted for jobs with an estimatedEndTime that have exceeded it.
Shown as millisecond

control_m.job.run.count (count)
Count of terminal job runs observed in the status feed.
Shown as job

control_m.job.run.duration_ms (gauge)
Submitted as a histogram. The Agent expands this into aggregated metrics (avg, count, max, median, 95percentile) controlled by histogram_aggregates and histogram_percentiles in datadog.yaml.
Shown as millisecond

control_m.job.run.overrun_ms (gauge)
Submitted as a histogram at job completion. How far past its estimated end time the job ran. Only emitted for terminal jobs that exceeded their estimatedEndTime.
Shown as millisecond

control_m.jobs.active (gauge)
Current number of active (non-terminal) jobs.
Shown as job

control_m.jobs.by_status (gauge)
Current number of jobs per normalized Control-M status.
Shown as job

control_m.jobs.returned (gauge)
Number of job entries returned in the current status response.
Shown as job

control_m.jobs.total (gauge)
Total number of jobs reported by the Control-M API (from the response total field).
Shown as job

control_m.jobs.waiting.by_server (gauge)
Number of waiting jobs per Control-M server.
Shown as job

control_m.jobs.waiting.total (gauge)
Total number of waiting jobs across all servers.
Shown as job

control_m.server.up (gauge)
Whether the Control-M server is up (1) or down (0).

Uninstallation

To uninstall the Control-M integration, remove the control_m.d/conf.yaml file from your Agent’s conf.d/ directory and restart the Agent.

Support

Need help? Contact Datadog Support.