Join the Preview!

Data Observability: Jobs Monitoring for AWS Glue is in Preview. To join the Preview, follow the steps on this page.

Overview

Data Observability: Jobs Monitoring gives visibility into the performance and reliability of your AWS Glue jobs.

Prerequisites

Before you begin, make sure you have:

  • An AWS account with Glue jobs you want to monitor.
  • The Datadog AWS integration configured for the account.
  • IAM permissions to modify the Datadog role’s policies.

Configure the AWS account

  1. Navigate to Datadog Data Observability > Settings.

  2. Click Configure next to AWS Glue.

    AWS Glue configuration option in the Data Observability Settings page
  3. Select an existing AWS account that is already connected to Datadog, or add a new one. For help adding a new account, see the AWS Integration documentation.

    AWS account selection dropdown in the configuration flow

Add required IAM permissions

The Data Observability crawler requires additional permissions to monitor Glue jobs. Attach the following policy to the Datadog IAM role configured for your AWS integration:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:GetCatalog",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetJobRun",
        "glue:GetJobRuns",
        "glue:GetJob",
        "glue:GetJobs",
        "glue:GetTable",
        "glue:GetTables",
        "glue:ListJobs",
        "s3:ListBucket",
        "kms:Decrypt",
        "lakeformation:GetDataAccess"
      ],
      "Resource": ["*"]
    },
    {
      "Sid": "AllowIcebergMetadataOnly",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::*/metadata/*"
      ]
    }
  ]
}

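Some of these permissions support monitoring Iceberg tables in Glue. For example, the AllowIcebergMetadataOnly statement grants read access only to S3 objects under a metadata/ path. For more details on dataset-related IAM permissions, see the AWS Glue Data Quality Monitoring documentation.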
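If you prefer to attach the policy from a script instead of the AWS console, the following is a minimal sketch using the IAM PutRolePolicy call through boto3. It adds the policy above to the role as an inline policy. The function name, the policy name DatadogGlueJobsMonitoring, and the role name in the usage comment are assumptions for illustration; use the role that your Datadog AWS integration is actually configured with.

```python
import json

# Policy document copied from the JSON above.
GLUE_MONITORING_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetCatalog",
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetJobRun",
                "glue:GetJobRuns",
                "glue:GetJob",
                "glue:GetJobs",
                "glue:GetTable",
                "glue:GetTables",
                "glue:ListJobs",
                "s3:ListBucket",
                "kms:Decrypt",
                "lakeformation:GetDataAccess",
            ],
            "Resource": ["*"],
        },
        {
            "Sid": "AllowIcebergMetadataOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": ["arn:aws:s3:::*/metadata/*"],
        },
    ],
}


def attach_policy(iam_client, role_name, policy_name="DatadogGlueJobsMonitoring"):
    """Attach the policy to the role as an inline policy (IAM PutRolePolicy).

    policy_name is a hypothetical default; any valid policy name works.
    """
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(GLUE_MONITORING_POLICY),
    )
    return policy_name


# Usage (requires boto3 and credentials allowed to call iam:PutRolePolicy).
# "DatadogIntegrationRole" is an assumed role name:
#   import boto3
#   attach_policy(boto3.client("iam"), "DatadogIntegrationRole")
```

The client is passed in as an argument so the function can be exercised with a stub before it is pointed at a real account.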

Configure the crawler

  1. Select the AWS regions where your Glue jobs are located.

  2. Enable the Job Monitoring toggle.

    Crawler configuration showing region selection and sync frequency options
  3. Click Save.

(Optional) Configure Glue jobs logs

  1. Follow the Datadog instructions for sending AWS logs from CloudWatch to Datadog.

  2. Manually configure triggers in AWS CloudWatch to capture AWS Glue logs. By default, Glue logs are stored in the following log groups:

    • /aws-glue/jobs/error
    • /aws-glue/jobs/output
    • /aws-glue/jobs/logs-v2

    Note: After logs are ingested into Datadog, the CloudWatch log group name maps to the host attribute in Datadog Logs.

  3. Create a Log Index that includes logs where the host attribute matches:

    • /aws-glue/jobs/error
    • /aws-glue/jobs/output
    • /aws-glue/jobs/logs-v2

This helps ensure the logs are searchable and available under the Glue tab in Data Observability: Jobs Monitoring.
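The log steps above can be sketched in Python. This example assumes the logs are forwarded by a Lambda function (for instance a Datadog Forwarder) whose ARN you supply, and that the function already allows CloudWatch Logs to invoke it. It uses the CloudWatch Logs PutSubscriptionFilter call through boto3. The function names, the filter name datadog-glue-logs, and the exact query string are illustrative assumptions; check the query against the Log Index filter field in your Datadog account before relying on it.

```python
# Default Glue log groups listed in the steps above.
GLUE_LOG_GROUPS = [
    "/aws-glue/jobs/error",
    "/aws-glue/jobs/output",
    "/aws-glue/jobs/logs-v2",
]


def subscribe_log_groups(logs_client, forwarder_arn, filter_name="datadog-glue-logs"):
    """Create one subscription filter per Glue log group.

    forwarder_arn is the ARN of the forwarding destination (assumed to be
    a Lambda function); filter_name is a hypothetical default.
    """
    for group in GLUE_LOG_GROUPS:
        logs_client.put_subscription_filter(
            logGroupName=group,
            filterName=filter_name,
            filterPattern="",  # an empty pattern matches every log event
            destinationArn=forwarder_arn,
        )
    return list(GLUE_LOG_GROUPS)


def index_filter_query():
    """Build a filter query matching the three log groups on the host attribute."""
    quoted = " OR ".join(f'"{group}"' for group in GLUE_LOG_GROUPS)
    return f"host:({quoted})"


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   subscribe_log_groups(boto3.client("logs"), "<forwarder-lambda-arn>")
#   print(index_filter_query())
```

The log group names are quoted in the query because they contain slashes.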

(Optional) Configure Glue metrics

Enable the Glue Integration tile for Glue metrics collection. Metrics should be available under the Glue job tab in Data Observability: Jobs Monitoring.

Next steps

The crawler runs every few minutes. After setup, open the Data Observability: Jobs Monitoring page in Datadog to see a list of your Glue job runs.
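If the list looks incomplete, it can help to compare it with what AWS itself reports. The sketch below lists recent runs for every Glue job in one region using the Glue GetJobs and GetJobRuns calls through boto3, two of the actions granted in the policy on this page. It runs with your own credentials rather than the Datadog role, so it only approximates what the crawler sees. The function name and the max_runs_per_job parameter are hypothetical.

```python
def recent_job_runs(glue_client, max_runs_per_job=5):
    """Return (job name, run ID, run state) tuples for recent Glue job runs."""
    runs = []
    paginator = glue_client.get_paginator("get_jobs")
    for page in paginator.paginate():
        for job in page["Jobs"]:
            response = glue_client.get_job_runs(
                JobName=job["Name"],
                MaxResults=max_runs_per_job,
            )
            for run in response["JobRuns"]:
                runs.append((job["Name"], run["Id"], run["JobRunState"]))
    return runs


# Usage (requires boto3 and AWS credentials; the region is an example value):
#   import boto3
#   for name, run_id, state in recent_job_runs(boto3.client("glue", region_name="us-east-1")):
#       print(name, run_id, state)
```

Run it once per region you selected in the crawler configuration, since a Glue client is bound to a single region.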
