Airbyte

Supported OS: Linux, Windows, macOS

Overview

This check monitors Airbyte. Metrics are sent to Datadog through DogStatsD.

Setup

Installation

All of the steps below are required for the Airbyte integration to work properly. Before you begin, install Datadog Agent version 6.17 or later (Agent 6) or 7.17 or later (Agent 7), which includes the StatsD/DogStatsD mapping feature.

Configuration

  1. Configure your Airbyte deployment to send metrics to Datadog.
  2. Update the Datadog Agent main configuration file datadog.yaml by adding the following configuration:
dogstatsd_mapper_profiles:
  - name: airbyte_worker
    prefix: "worker."
    mappings:
      - match: "worker.temporal_workflow_*"
        name: "airbyte.worker.temporal_workflow.$1"
      - match: "worker.worker_*"
        name: "airbyte.worker.$1"
      - match: "worker.state_commit_*"
        name: "airbyte.worker.state_commit.$1"
      - match: "worker.job_*"
        name: "airbyte.worker.job.$1"
      - match: "worker.attempt_*"
        name: "airbyte.worker.attempt.$1"
      - match: "worker.activity_*"
        name: "airbyte.worker.activity.$1"
      - match: "worker.*"
        name: "airbyte.worker.$1"
  - name: airbyte_cron
    prefix: "cron."
    mappings:
      - match: "cron.cron_jobs_run"
        name: "airbyte.cron.jobs_run"
      - match: "cron.*"
        name: "airbyte.cron.$1"
  - name: airbyte_metrics_reporter
    prefix: "metrics-reporter."
    mappings:
      - match: "metrics-reporter.*"
        name: "airbyte.metrics_reporter.$1"
  - name: airbyte_orchestrator
    prefix: "orchestrator."
    mappings:
      - match: "orchestrator.*"
        name: "airbyte.orchestrator.$1"
  - name: airbyte_server
    prefix: "server."
    mappings:
      - match: "server.*"
        name: "airbyte.server.$1"
  - name: airbyte_general
    prefix: "airbyte."
    mappings:
      - match: "airbyte.worker.temporal_workflow_*"
        name: "airbyte.worker.temporal_workflow.$1"
      - match: "airbyte.worker.worker_*"
        name: "airbyte.worker.$1"
      - match: "airbyte.worker.state_commit_*"
        name: "airbyte.worker.state_commit.$1"
      - match: "airbyte.worker.job_*"
        name: "airbyte.worker.job.$1"
      - match: "airbyte.worker.attempt_*"
        name: "airbyte.worker.attempt.$1"
      - match: "airbyte.worker.activity_*"
        name: "airbyte.worker.activity.$1"
      - match: "airbyte.cron.cron_jobs_run"
        name: "airbyte.cron.jobs_run"
  3. Restart the Agent and Airbyte.
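
To make the effect of the mapper profiles in step 2 more concrete, the annotated excerpt below walks two metric names through the airbyte_worker profile. It is an illustration only, not a replacement for the full configuration. The incoming names (worker.attempt_created and worker.replication_bytes_synced) are assumptions inferred from the mapped names in the Metrics section, and the comments assume that `*` captures a single name segment, that `$1` expands to the captured text, and that the first matching rule in a profile is the one applied.

```yaml
# Excerpt of the airbyte_worker profile from step 2, with comments added.
# The incoming metric names shown in the comments are assumed examples.
dogstatsd_mapper_profiles:
  - name: airbyte_worker
    prefix: "worker."            # only metrics starting with "worker." are considered
    mappings:
      # worker.attempt_created -> airbyte.worker.attempt.created
      # ("*" captures "created", which replaces "$1")
      - match: "worker.attempt_*"
        name: "airbyte.worker.attempt.$1"
      # worker.replication_bytes_synced -> airbyte.worker.replication_bytes_synced
      # (no more specific rule matches, so the catch-all rule applies)
      - match: "worker.*"
        name: "airbyte.worker.$1"
```

Under these assumptions, the catch-all `worker.*` rule needs to stay last in the profile so that the more specific rules are tried before it.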

Data Collected

Metrics

airbyte.cron.jobs_run
(count)
Number of CRON runs by CRON type.
airbyte.cron.workflows_healed
(count)
Number of workflows the self-healing CRON healed.
airbyte.metrics_reporter.est_num_metrics_emitted_by_reporter
(count)
Estimated number of metrics emitted by the reporter in the last interval. The value is an estimate because the count is not precise.
airbyte.metrics_reporter.num_orphan_running_jobs
(gauge)
Number of jobs reported as running that are associated with an inactive or deprecated connection.
Shown as job
airbyte.metrics_reporter.num_pending_jobs
(gauge)
Number of pending jobs.
Shown as job
airbyte.metrics_reporter.num_running_jobs
(gauge)
Number of running jobs.
Shown as job
airbyte.metrics_reporter.num_total_scheduled_syncs_last_day
(gauge)
Total number of sync jobs run in the last day.
Shown as job
airbyte.metrics_reporter.num_unusually_long_syncs
(gauge)
Number of sync jobs that are unusually long compared to their historical performance.
Shown as job
airbyte.metrics_reporter.oldest_pending_job_age_secs
(gauge)
The age of the oldest pending job in seconds.
Shown as second
airbyte.metrics_reporter.oldest_running_job_age_secs
(gauge)
The age of the oldest running job in seconds.
Shown as second
airbyte.orchestrator.source_hearbeat_failure
(count)
The count of replication failures due to a source missing a heartbeat.
airbyte.server.breaking_change_detected
(count)
The count of breaking schema changes detected.
airbyte.server.schema_change_auto_propagated
(count)
The count of schema changes that have been propagated.
airbyte.worker.activity.check_connection
(count)
The count of check connection activities started.
Shown as connection
airbyte.worker.activity.dbt_transformation
(count)
The count of dbt transformation activities started.
airbyte.worker.activity.discover_catalog
(count)
The count of discover catalog activities started.
airbyte.worker.activity.failure
(count)
The count of activity failures. Tagged by activity.
airbyte.worker.activity.normalization
(count)
The count of normalization activities started.
airbyte.worker.activity.normalization_summary_check
(count)
The count of normalization summary check activities started.
airbyte.worker.activity.refresh_schema
(count)
The count of refresh schema activities started.
airbyte.worker.activity.replication
(count)
The count of replication activities started.
airbyte.worker.activity.spec
(count)
The count of spec activities started.
airbyte.worker.activity.submit_check_destination_connection
(count)
The count of submit check destination connection activities started.
Shown as connection
airbyte.worker.activity.submit_check_source_connection
(count)
The count of submit check source connection activities started.
Shown as connection
airbyte.worker.activity.webhook_operation
(count)
The count of webhook operation activities started.
airbyte.worker.attempt.completed
(count)
The count of new attempts completed. One is emitted per attempt.
Shown as attempt
airbyte.worker.attempt.created
(count)
The count of new attempts created. One is emitted per attempt.
Shown as attempt
airbyte.worker.attempt.created_by_release_stage
(count)
The count of new attempts created. Attempts are double counted as this is tagged by release stage.
Shown as attempt
airbyte.worker.attempt.failed_by_failure_origin
(count)
The count of failure origins a failed attempt has. Since a failure can have multiple origins, a single failure can be counted more than once. Tagged by failure origin and failure type.
Shown as attempt
airbyte.worker.attempt.failed_by_release_stage
(count)
The count of attempts failed. Attempts are double counted as this is tagged by release stage.
Shown as attempt
airbyte.worker.attempt.succeeded_by_release_stage
(count)
The count of attempts succeeded. Attempts are double counted as this is tagged by release stage.
Shown as attempt
airbyte.worker.destination_buffer_size
(gauge)
The size of the replication worker destination buffer queue.
Shown as record
airbyte.worker.destination_message_read
(count)
The count of messages read from the destination.
Shown as message
airbyte.worker.destination_message_sent
(count)
The count of messages sent to the destination.
Shown as message
airbyte.worker.job.cancelled_by_release_stage
(count)
The count of jobs cancelled. Jobs are double counted as this is tagged by release stage.
Shown as job
airbyte.worker.job.created_by_release_stage
(count)
The count of new jobs created. Jobs are double counted as this is tagged by release stage.
Shown as job
airbyte.worker.job.failed_by_release_stage
(count)
The count of jobs failed. Jobs are double counted as this is tagged by release stage.
Shown as job
airbyte.worker.job.succeeded_by_release_stage
(count)
The count of jobs succeeded. Jobs are double counted as this is tagged by release stage.
Shown as job
airbyte.worker.notifications_sent
(count)
Number of notifications sent.
airbyte.worker.replication_bytes_synced
(count)
Number of bytes synced during replication.
Shown as byte
airbyte.worker.replication_records_synced
(count)
Number of records synced during replication.
Shown as record
airbyte.worker.source_buffer_size
(gauge)
The size of the replication worker source buffer queue.
Shown as record
airbyte.worker.source_message_read
(count)
The count of messages read from the source.
Shown as message
airbyte.worker.state_commit.close_successful
(count)
Number of connections exiting with a successful final state flush.
airbyte.worker.state_commit.not_attempted
(count)
Number of attempts to commit states dropped due to an early termination.
Shown as attempt
airbyte.worker.temporal_workflow.attempt
(count)
The count of Temporal workflow attempts.
Shown as attempt
airbyte.worker.temporal_workflow.failure
(count)
The count of Temporal workflow failures.
airbyte.worker.temporal_workflow.success
(count)
The count of successful Temporal workflow syncs.
Shown as success

Service Checks

The Airbyte check does not include any service checks.

Events

The Airbyte check does not include any events.

Troubleshooting

Need help? Contact Datadog support.