AWS Glue

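Overview

AWS Glue is a fully managed ETL (extract, transform, and load) service. It makes it simple and cost-effective to categorize, clean, and enrich your data, and to move it reliably between data stores.

Enable this integration to see all your Glue metrics in Datadog.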

Setup

Installation

If you haven't already, set up the Amazon Web Services integration first.

Metric collection

  1. On the AWS integration page, make sure that Glue is enabled under the Metric Collection tab.
  2. Install the Datadog - AWS Glue integration.

Log collection

Enable logging

Configure AWS Glue to send logs either to an S3 bucket or to CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_glue is set as the _Target prefix_.
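
One possible way to turn on CloudWatch logging for a job run is sketched below. It assumes boto3 is installed and AWS credentials are configured, and it relies on the Glue special job parameters `--enable-continuous-cloudwatch-log` and `--continuous-log-logGroup`. The function names and the job name are hypothetical, chosen only for illustration.

```python
def glue_logging_arguments(log_group=None):
    """Return Glue job arguments that turn on continuous CloudWatch logging."""
    # "--enable-continuous-cloudwatch-log" is a Glue special job parameter.
    arguments = {"--enable-continuous-cloudwatch-log": "true"}
    if log_group:
        # Optional: send continuous logs to a custom log group.
        arguments["--continuous-log-logGroup"] = log_group
    return arguments


def start_job_with_logging(job_name, log_group=None):
    """Start one run of an existing Glue job with the logging arguments applied."""
    import boto3  # imported lazily so the helper above works without boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=glue_logging_arguments(log_group),
    )
    return response["JobRunId"]
```

For example, `start_job_with_logging("my-etl-job")` would start a run of a job named `my-etl-job` (a placeholder) with continuous logging enabled for that run only. To make the setting permanent, the same arguments can be placed in the job's default arguments instead.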

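Send logs to Datadog

  1. If you haven't already, set up the Datadog Forwarder Lambda function.

  2. Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your AWS Glue logs.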
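
The steps above use the AWS console. As an alternative sketch, the CloudWatch log group trigger can also be created with the CloudWatch Logs `put_subscription_filter` API through boto3. The filter name, function names, and the ARN and log group values in the usage note are assumptions made for illustration, and the sketch assumes the Forwarder Lambda already allows invocation by CloudWatch Logs.

```python
def subscription_filter_request(log_group, forwarder_arn, filter_name="datadog-forwarder"):
    """Build the request that subscribes the Forwarder Lambda to a log group."""
    return {
        "logGroupName": log_group,
        "filterName": filter_name,       # hypothetical name, any unique label works
        "filterPattern": "",             # an empty pattern matches every log event
        "destinationArn": forwarder_arn,
    }


def add_log_group_trigger(log_group, forwarder_arn):
    """Create the subscription filter in the current AWS account and region."""
    import boto3  # imported lazily so the builder above works without boto3

    logs = boto3.client("logs")
    logs.put_subscription_filter(**subscription_filter_request(log_group, forwarder_arn))
```

A call might look like `add_log_group_trigger("/aws-glue/jobs/output", "arn:aws:lambda:us-east-1:123456789012:function:datadog-forwarder")`, where both values are placeholders to replace with the log group your Glue jobs write to and the ARN of your own Forwarder function.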

Data collected

Metrics

aws.glue.driver.executor_allocation_manager.executors.number_all_executors
(gauge)
The number of actively running job executors.
aws.glue.driver.executor_allocation_manager.executors.number_max_needed_executors
(gauge)
The number of maximum (actively running and pending) job executors needed to satisfy the current load.
aws.glue.glue_alljvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) across all executors.
Shown as percent
aws.glue.glue_alljvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for all executors.
Shown as byte
aws.glue.glue_alls_3filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by all executors since the previous report.
aws.glue.glue_allsystem_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by all executors.
Shown as percent
aws.glue.glue_driver_aggregate_bytes_read
(count)
The number of bytes read from all data sources by all completed Spark tasks running in all executors.
Shown as byte
aws.glue.glue_driver_aggregate_elapsed_time
(count)
The ETL elapsed time in milliseconds (does not include the job bootstrap times).
Shown as millisecond
aws.glue.glue_driver_aggregate_num_completed_stages
(count)
The number of completed stages in the job.
aws.glue.glue_driver_aggregate_num_completed_tasks
(count)
The number of completed tasks in the job.
aws.glue.glue_driver_aggregate_num_failed_tasks
(count)
The number of failed tasks.
aws.glue.glue_driver_aggregate_num_killed_tasks
(count)
The number of tasks killed.
aws.glue.glue_driver_aggregate_records_read
(count)
The number of records read from all data sources by all completed Spark tasks running in all executors.
aws.glue.glue_driver_aggregate_shuffle_bytes_written
(count)
The number of bytes written by all executors to shuffle data between them since the previous report.
aws.glue.glue_driver_aggregate_shuffle_local_bytes_read
(count)
The number of bytes read by all executors to shuffle data between them since the previous report.
aws.glue.glue_driver_block_manager_disk_disk_space_used_mb
(gauge)
The average number of megabytes of disk space used across all executors.
aws.glue.glue_driver_jvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) for the driver.
Shown as percent
aws.glue.glue_driver_jvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for the driver.
Shown as byte
aws.glue.glue_driver_s3_filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by the driver since the previous report.
aws.glue.glue_driver_s3_filesystem_writebytes
(gauge)
The average number of bytes written to Amazon S3 by the driver since the previous report.
aws.glue.glue_driver_system_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by the driver.
Shown as percent
aws.glue.glue_executor_id_jvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) for the identified executor.
Shown as percent
aws.glue.glue_executor_id_jvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for the identified executor.
Shown as byte
aws.glue.glue_executor_id_system_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by the identified executor.
Shown as percent
aws.glue.glue_executor_ids_3_filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by the identified executor since the previous report.
aws.glue.glue_executor_ids_3_filesystem_writebytes
(gauge)
The average number of bytes written to Amazon S3 by the identified executor since the previous report.
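
Several of the metrics above report a fraction on a 0-1 scale, such as JVM heap usage and CPU system load. The following minimal sketch shows one way to convert such a value to a percentage and compare it against a limit. The function names and the 0.8 threshold are hypothetical and are not part of the integration.

```python
def fraction_to_percent(fraction):
    """Convert a 0-1 metric value (for example, heap usage) to a percentage."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("expected a value on the 0-1 scale")
    return fraction * 100.0


def exceeds_threshold(fraction, threshold=0.8):
    """Return True when a 0-1 metric value is above the given threshold."""
    # The default threshold of 0.8 is an illustrative choice, not a recommendation.
    return fraction > threshold
```

For instance, a heap usage value of 0.5 corresponds to 50 percent and would not exceed the illustrative 0.8 threshold.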

Events

The AWS Glue integration does not include any events.

Service checks

The AWS Glue integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.