Amazon Managed Workflows for Apache Airflow (MWAA)

Overview

Amazon Managed Workflows for Apache Airflow (MWAA) is a managed service for Apache Airflow that makes it easier to build and manage workflows in the cloud.

Enable this integration to see all your Amazon MWAA metrics in Datadog.

Setup

Installation

If you haven't already, set up the Amazon Web Services integration first.

Metric collection

  1. On the AWS integration page, ensure that MWAA is enabled under the Metric Collection tab.
  2. Install the Datadog integration for Amazon Managed Workflows for Apache Airflow (MWAA).
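
Once both steps are done, you may want to confirm that MWAA metrics are actually arriving. The following is a minimal sketch, not an official procedure: it builds a request for Datadog's v1 timeseries query endpoint and checks whether any series is returned. The function names, the default metric, the one-hour window, and the `datadoghq.com` site are illustrative assumptions; use the site that matches your account, and supply your own API and application keys.

```python
import json
import time
import urllib.parse
import urllib.request


def build_metric_query_request(api_key, app_key,
                               metric="aws.mwaa.scheduler_heartbeat",
                               window_seconds=3600,
                               site="datadoghq.com",
                               now=None):
    """Build (but do not send) a GET request for the v1 timeseries query endpoint.

    `site` is an assumption: replace it with your own Datadog site.
    """
    now = int(time.time()) if now is None else int(now)
    params = {
        "from": now - window_seconds,   # start of the window, epoch seconds
        "to": now,                      # end of the window, epoch seconds
        "query": f"avg:{metric}{{*}}",  # average across all sources
    }
    url = f"https://api.{site}/api/v1/query?" + urllib.parse.urlencode(params)
    headers = {"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key}
    return urllib.request.Request(url, headers=headers, method="GET")


def has_recent_points(request):
    """Send the request and report whether at least one series came back."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return len(body.get("series", [])) > 0
```

Calling `has_recent_points(build_metric_query_request(api_key, app_key))` performs a real network call, so run it only with valid keys. Metrics can take some time to appear after the integration is enabled, so an empty result shortly after setup does not necessarily indicate a misconfiguration.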

Log collection

  1. Configure Amazon MWAA to send logs to CloudWatch.
  2. Send the logs to Datadog.
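
Step 1 can be done in the AWS console, but it can also be scripted. The sketch below assumes you use boto3 and its `mwaa` client; the helper names and the choice to enable every log type at the same level are illustrative assumptions, not a recommendation from this integration. Step 2 (forwarding from CloudWatch to Datadog) is not covered here.

```python
# Log categories accepted by MWAA's LoggingConfiguration.
MWAA_LOG_TYPES = (
    "DagProcessingLogs",
    "SchedulerLogs",
    "TaskLogs",
    "WebserverLogs",
    "WorkerLogs",
)

VALID_LOG_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


def build_logging_configuration(log_level="INFO", log_types=MWAA_LOG_TYPES):
    """Return a LoggingConfiguration dict that enables CloudWatch logging
    for each requested log type at the given level."""
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"unsupported log level: {log_level!r}")
    return {
        log_type: {"Enabled": True, "LogLevel": log_level}
        for log_type in log_types
    }


def enable_cloudwatch_logs(environment_name, log_level="INFO"):
    """Apply the configuration to an existing environment.

    Requires boto3 and AWS credentials allowed to update the environment.
    `environment_name` is whatever you named your MWAA environment.
    """
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("mwaa")
    return client.update_environment(
        Name=environment_name,
        LoggingConfiguration=build_logging_configuration(log_level),
    )
```

Keeping the dictionary builder separate from the API call lets you inspect or review the configuration before applying it to a live environment.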

Data collected

Metrics

aws.mwaa.collect_dbdags
(gauge)
Average milliseconds taken for fetching all Serialized Dags from DB. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.collect_dbdags.maximum
(gauge)
Maximum milliseconds taken for fetching all Serialized Dags from DB. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.collect_dbdags.minimum
(gauge)
Minimum milliseconds taken for fetching all Serialized Dags from DB. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.critical_section_busy
(count)
Count of times a scheduler process tried to get a lock on the critical section (needed to send tasks to the executor) and found it locked by another process. Only available in Airflow v2.
Shown as unit
aws.mwaa.critical_section_duration
(gauge)
Average milliseconds spent in the critical section of scheduler loop - only a single scheduler can enter this loop at a time. Only available in Airflow v2.
Shown as millisecond
aws.mwaa.critical_section_duration.maximum
(gauge)
Maximum milliseconds spent in the critical section of scheduler loop - only a single scheduler can enter this loop at a time. Only available in Airflow v2.
Shown as millisecond
aws.mwaa.critical_section_duration.minimum
(gauge)
Minimum milliseconds spent in the critical section of scheduler loop - only a single scheduler can enter this loop at a time. Only available in Airflow v2.
Shown as millisecond
aws.mwaa.dag_bag_size
(count)
Number of DAGs found when the scheduler ran a scan based on its configuration. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.dag_callback_exceptions
(count)
Number of exceptions raised from DAG callbacks. When this happens, it means the DAG callback is not working. Only available in Airflow v2.
Shown as unit
aws.mwaa.dagdependency_check
(gauge)
Average milliseconds taken to check DAG dependencies. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagdependency_check.maximum
(gauge)
Maximum milliseconds taken to check DAG dependencies. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagdependency_check.minimum
(gauge)
Minimum milliseconds taken to check DAG dependencies. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagduration_failed
(gauge)
Milliseconds taken for a DagRun to reach failed state. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagduration_success
(gauge)
Milliseconds taken for a DagRun to reach success state. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagfile_processing_last_duration
(gauge)
Average milliseconds taken to load the given DAG file. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagfile_processing_last_duration.maximum
(gauge)
Maximum milliseconds taken to load the given DAG file. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagfile_processing_last_duration.minimum
(gauge)
Minimum milliseconds taken to load the given DAG file. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.dagfile_processing_last_run_seconds_ago
(gauge)
Seconds since the DAG file was last processed. Available in both Airflow v1 and v2.
Shown as second
aws.mwaa.dagfile_refresh_error
(count)
Number of failures loading any DAG files. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.dagschedule_delay
(gauge)
Milliseconds of delay between the scheduled DagRun start date and the actual DagRun start date. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.exception_failures
(count)
Number of failures caused by exception in the previous smart sensor poking loop. Only available in Airflow v2.
Shown as unit
aws.mwaa.failed_slaemail_attempts
(count)
Number of failed SLA miss email notification attempts. Only available in Airflow v2.
Shown as unit
aws.mwaa.first_task_scheduling_delay
(gauge)
Milliseconds elapsed between first task start_date and dagrun expected start. Only available in Airflow v2.
Shown as millisecond
aws.mwaa.import_errors
(count)
Number of errors from trying to parse DAG files. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.infra_failures
(count)
Number of infrastructure failures in the previous smart sensor poking loop. Only available in Airflow v2.
Shown as unit
aws.mwaa.job_end
(count)
Number of ended jobs, e.g. SchedulerJob, LocalTaskJob. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.job_heartbeat_failure
(count)
Number of failed heartbeats for a job, e.g. SchedulerJob, LocalTaskJob. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.job_start
(count)
Number of started jobs, e.g. SchedulerJob, LocalTaskJob. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.manager_stalls
(count)
Number of stalled DagFileProcessorManager. Only available in Airflow v2.
Shown as unit
aws.mwaa.open_slots
(count)
Number of open slots on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.operator_failures
(count)
Operator failures. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.operator_successes
(count)
Operator successes. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.orphaned_tasks_adopted
(count)
Number of Orphaned tasks adopted by the Scheduler. Only available in Airflow v2.
Shown as unit
aws.mwaa.orphaned_tasks_cleared
(count)
Number of Orphaned tasks cleared by the Scheduler. Only available in Airflow v2.
Shown as unit
aws.mwaa.poked_exceptions
(count)
Number of exceptions in the previous smart sensor poking loop. Only available in Airflow v2.
Shown as unit
aws.mwaa.poked_success
(count)
Number of newly succeeded tasks poked by the smart sensor in the previous poking loop. Only available in Airflow v2.
Shown as unit
aws.mwaa.poked_tasks
(count)
Number of tasks poked by the smart sensor in the previous poking loop. Only available in Airflow v2.
Shown as unit
aws.mwaa.pool_open_slots
(count)
Number of open slots in the pool. Only available in Airflow v2.
Shown as unit
aws.mwaa.pool_queued_slots
(count)
Number of queued slots in the pool. Only available in Airflow v2.
Shown as unit
aws.mwaa.pool_running_slots
(count)
Number of running slots in the pool. Only available in Airflow v2.
Shown as unit
aws.mwaa.pool_starving_tasks
(count)
Number of starving tasks in the pool. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.pool_used_slots
(count)
Number of used slots in the pool. Only available in Airflow v1.
Shown as unit
aws.mwaa.processor_timeouts
(count)
Number of file processors that have been killed due to taking too long. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.queued_tasks
(count)
Sum number of queued tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.queued_tasks.average
(gauge)
Average number of queued tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.queued_tasks.max
(gauge)
Max number of queued tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.queued_tasks.min
(gauge)
Min number of queued tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.running_tasks
(count)
Sum number of running tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.running_tasks.average
(gauge)
Average number of running tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.running_tasks.max
(gauge)
Max number of running tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.running_tasks.min
(gauge)
Min number of running tasks on executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.scheduler_heartbeat
(count)
Scheduler heartbeats. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_instance_created_using_operator
(count)
Number of task instances created for a given Operator. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_instance_duration
(gauge)
Milliseconds taken to finish a task. Available in both Airflow v1 and v2.
Shown as millisecond
aws.mwaa.task_instance_failures
(count)
Overall task instance failures. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_instance_finished
(count)
Number of completed tasks in a given DAG. Similar to job_end, but for tasks. Only available in Airflow v2.
Shown as unit
aws.mwaa.task_instance_previously_succeeded
(count)
Number of previously succeeded task instances. Only available in Airflow v2.
Shown as unit
aws.mwaa.task_instance_started
(count)
Number of started tasks in a given DAG. Similar to job_start, but for tasks. Only available in Airflow v2.
Shown as unit
aws.mwaa.task_instance_successes
(count)
Overall task instance successes. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_removed_from_dag
(count)
Number of tasks removed for a given dag (i.e. task no longer exists in DAG). Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_restored_to_dag
(count)
Number of tasks restored for a given dag (i.e. task instance which was previously in REMOVED state in the DB is added to DAG file). Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.task_timeout_error
(count)
Number of AirflowTaskTimeout errors raised when publishing Task to Celery Broker. Only available in Airflow v2.
Shown as unit
aws.mwaa.tasks_executable
(count)
Sum number of tasks that are ready for execution (set to queued) with respect to pool limits, DAG concurrency, executor state, and priority. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_executable.average
(gauge)
Average number of tasks that are ready for execution (set to queued) with respect to pool limits, DAG concurrency, executor state, and priority. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_executable.max
(gauge)
Max number of tasks that are ready for execution (set to queued) with respect to pool limits, DAG concurrency, executor state, and priority. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_executable.min
(gauge)
Min number of tasks that are ready for execution (set to queued) with respect to pool limits, DAG concurrency, executor state, and priority. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_killed_externally
(count)
Sum number of tasks killed externally. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_killed_externally.average
(count)
Average number of tasks killed externally. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_killed_externally.max
(count)
Max number of tasks killed externally. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_killed_externally.min
(count)
Min number of tasks killed externally. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_pending
(count)
Sum number of tasks pending. Available in Airflow v1.
Shown as unit
aws.mwaa.tasks_pending.average
(gauge)
Average number of tasks pending. Only available in Airflow v1.
Shown as unit
aws.mwaa.tasks_pending.max
(gauge)
Max number of tasks pending. Only available in Airflow v1.
Shown as unit
aws.mwaa.tasks_pending.min
(gauge)
Min number of tasks pending. Only available in Airflow v1.
Shown as unit
aws.mwaa.tasks_running
(count)
Sum number of tasks running in executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_running.average
(gauge)
Average number of tasks running in executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_running.max
(gauge)
Max number of tasks running in executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_running.min
(gauge)
Min number of tasks running in executor. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_starving
(count)
Sum number of tasks that cannot be scheduled because of no open slot in pool. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_starving.average
(gauge)
Average number of tasks that cannot be scheduled because of no open slot in pool. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_starving.max
(gauge)
Max number of tasks that cannot be scheduled because of no open slot in pool. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_starving.min
(gauge)
Min number of tasks that cannot be scheduled because of no open slot in pool. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.tasks_without_dag_run
(count)
Number of tasks without DagRuns or with DagRuns not in Running state. Available in both Airflow v1 and v2.
Shown as unit
aws.mwaa.total_parse_time
(gauge)
Average seconds taken to scan and import all DAG files once. Available in both Airflow v1 and v2.
Shown as second
aws.mwaa.total_parse_time.maximum
(gauge)
Maximum seconds taken to scan and import all DAG files once. Available in both Airflow v1 and v2.
Shown as second
aws.mwaa.total_parse_time.minimum
(gauge)
Minimum seconds taken to scan and import all DAG files once. Available in both Airflow v1 and v2.
Shown as second
aws.mwaa.zombies_killed
(count)
Zombie tasks killed. Available in both Airflow v1 and v2.
Shown as unit
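
The metrics above can be used in Datadog monitors. As a rough sketch, the helper below assembles a metric monitor query string from a metric name and a threshold. The function name, the chosen metrics, the evaluation windows, and the thresholds are all hypothetical examples; tune them to your own environment before relying on them.

```python
def monitor_query(metric, threshold, comparator=">", window="last_15m",
                  time_aggregator="avg", space_aggregator="avg", scope="*"):
    """Assemble a metric monitor query of the form
    time_aggr(window):space_aggr:metric{scope} comparator threshold
    """
    return (
        f"{time_aggregator}({window}):"
        f"{space_aggregator}:{metric}{{{scope}}} "
        f"{comparator} {threshold}"
    )


# Hypothetical starting points; the thresholds are placeholders, not recommendations.
EXAMPLE_MONITORS = {
    # Tasks are piling up on the executor.
    "queue backlog": monitor_query("aws.mwaa.queued_tasks.average", 50),
    # Any DAG file that fails to parse is worth knowing about.
    "dag import errors": monitor_query(
        "aws.mwaa.import_errors", 0,
        time_aggregator="sum", space_aggregator="sum"),
    # Parsing all DAG files is slow (this metric is reported in seconds).
    "slow dag parsing": monitor_query("aws.mwaa.total_parse_time", 30),
}

for name, query in EXAMPLE_MONITORS.items():
    print(f"{name}: {query}")
```

Note that the aggregated variants are not named uniformly in the list above: some metrics use `.maximum` and `.minimum`, while others use `.max` and `.min`, so copy the exact metric name rather than guessing the suffix.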

Events

The Amazon Managed Workflows for Apache Airflow (MWAA) integration does not include any events.

Service checks

The Amazon Managed Workflows for Apache Airflow (MWAA) integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.