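Overview

AWS Glue is a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize, clean, and enrich your data, and to move it reliably between various data stores.

Enable this integration to see all your Glue metrics in Datadog.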

Setup

Installation

If you haven't already, set up the Amazon Web Services integration first.

Metric collection

  1. On the AWS integration page, ensure that Glue is enabled under the Metric Collection tab.
  2. Install the Datadog - AWS Glue integration.

Log collection

Enable logging

Configure AWS Glue to send logs to either an S3 bucket or CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_glue is set as the Target prefix.

Send logs to Datadog

  1. If you haven't already, set up the Datadog Forwarder Lambda function.

  2. Once the Lambda function is installed, use the AWS console to manually add a trigger on the S3 bucket or CloudWatch log group that contains your AWS Glue logs.
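
The trigger in step 2 can also be added from a script instead of the console. The following is a minimal sketch using boto3, not an official Datadog procedure. The Forwarder ARN, the trigger ID `datadog-forwarder-glue-logs`, and all function names are hypothetical placeholders, and the `amazon_glue` prefix filter assumes the Target prefix described in the note above.

```python
from typing import Any, Dict


def s3_trigger_config(forwarder_arn: str, prefix: str = "amazon_glue") -> Dict[str, Any]:
    """Build an S3 notification configuration that invokes the Forwarder
    for every object created under the given key prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "datadog-forwarder-glue-logs",  # hypothetical trigger ID
                "LambdaFunctionArn": forwarder_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def cloudwatch_trigger_params(forwarder_arn: str, log_group: str) -> Dict[str, str]:
    """Build the arguments for a CloudWatch Logs subscription filter that
    streams every event in the log group to the Forwarder."""
    return {
        "logGroupName": log_group,
        "filterName": "datadog-forwarder-glue-logs",  # hypothetical filter name
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": forwarder_arn,
    }


def apply_s3_trigger(bucket: str, forwarder_arn: str) -> None:
    """Attach the S3 trigger. Requires AWS credentials and boto3."""
    import boto3  # imported lazily so the builders above work without boto3

    s3 = boto3.client("s3")
    # Caution: this call replaces the bucket's existing notification configuration.
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=s3_trigger_config(forwarder_arn),
    )


def apply_cloudwatch_trigger(log_group: str, forwarder_arn: str) -> None:
    """Attach the CloudWatch Logs trigger. Requires AWS credentials and boto3."""
    import boto3

    logs = boto3.client("logs")
    logs.put_subscription_filter(**cloudwatch_trigger_params(forwarder_arn, log_group))
```

Only one of the two `apply_*` functions is needed, depending on where Glue writes its logs. In both cases the source service also needs permission to invoke the Lambda function, which this sketch leaves out.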

Data Collected

Metrics

aws.glue.driver.executor_allocation_manager.executors.number_all_executors
(gauge)
The number of actively running job executors.
aws.glue.driver.executor_allocation_manager.executors.number_max_needed_executors
(gauge)
The number of maximum (actively running and pending) job executors needed to satisfy the current load.
aws.glue.glue_alljvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) for all executors.
Shown as percent
aws.glue.glue_alljvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for all executors.
Shown as byte
aws.glue.glue_alls_3filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by all executors since the previous report.
aws.glue.glue_allsystem_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by all executors.
Shown as percent
aws.glue.glue_driver_aggregate_bytes_read
(count)
The number of bytes read from all data sources by all completed Spark tasks running in all executors.
Shown as byte
aws.glue.glue_driver_aggregate_elapsed_time
(count)
The ETL elapsed time in milliseconds (does not include the job bootstrap times).
Shown as millisecond
aws.glue.glue_driver_aggregate_num_completed_stages
(count)
The number of completed stages in the job.
aws.glue.glue_driver_aggregate_num_completed_tasks
(count)
The number of completed tasks in the job.
aws.glue.glue_driver_aggregate_num_failed_tasks
(count)
The number of failed tasks.
aws.glue.glue_driver_aggregate_num_killed_tasks
(count)
The number of tasks killed.
aws.glue.glue_driver_aggregate_records_read
(count)
The number of records read from all data sources by all completed Spark tasks running in all executors.
aws.glue.glue_driver_aggregate_shuffle_bytes_written
(count)
The number of bytes written by all executors to shuffle data between them since the previous report.
aws.glue.glue_driver_aggregate_shuffle_local_bytes_read
(count)
The number of bytes read by all executors to shuffle data between them since the previous report.
aws.glue.glue_driver_block_manager_disk_disk_space_used_mb
(gauge)
The average number of megabytes of disk space used across all executors.
aws.glue.glue_driver_jvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) for the driver.
Shown as percent
aws.glue.glue_driver_jvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for the driver.
Shown as byte
aws.glue.glue_driver_s3_filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by the driver since the previous report.
aws.glue.glue_driver_s3_filesystem_writebytes
(gauge)
The average number of bytes written to Amazon S3 by the driver since the previous report.
aws.glue.glue_driver_system_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by the driver.
Shown as percent
aws.glue.glue_executor_id_jvm_heap_usage
(gauge)
The average fraction of memory used by the JVM heap (scale: 0-1) for the executor identified.
Shown as percent
aws.glue.glue_executor_id_jvm_heap_used
(gauge)
The number of memory bytes used by the JVM heap for the executor identified.
Shown as byte
aws.glue.glue_executor_id_system_cpu_system_load
(gauge)
The average fraction of CPU system load used (scale: 0-1) by the executor identified.
Shown as percent
aws.glue.glue_executor_ids_3_filesystem_readbytes
(gauge)
The average number of bytes read from Amazon S3 by the executor identified since the previous report.
aws.glue.glue_executor_ids_3_filesystem_writebytes
(gauge)
The average number of bytes written to Amazon S3 by the executor identified since the previous report.
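
As an illustration of how these metrics might be used, the sketch below builds a Datadog monitor query string for one of them. The helper name, the default 15-minute window, and the 0.9 threshold are assumptions chosen for the example. The threshold also assumes the heap usage value arrives on the 0-1 scale given in the description above.

```python
def glue_monitor_query(
    metric: str,
    threshold: float,
    window: str = "last_15m",
    scope: str = "*",
) -> str:
    """Build a metric monitor query that averages the metric over the
    window and compares the result against the threshold."""
    return f"avg({window}):avg:{metric}{{{scope}}} > {threshold}"


# Example: alert when the driver's JVM heap usage averages above 90%.
query = glue_monitor_query("aws.glue.glue_driver_jvm_heap_usage", 0.9)
print(query)
```

The resulting string can be pasted into the monitor editor or sent through the monitor API. Narrow `scope` to the tags your account actually reports for Glue jobs.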

Events

The AWS Glue integration does not include any events.

Service Checks

The AWS Glue integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.