- 필수 기능
- 시작하기
- Glossary
- 표준 속성
- Guides
- Agent
- 통합
- 개방형텔레메트리
- 개발자
- Administrator's Guide
- API
- Datadog Mobile App
- CoScreen
- Cloudcraft
- 앱 내
- 서비스 관리
- 인프라스트럭처
- 애플리케이션 성능
- APM
- Continuous Profiler
- 스팬 시각화
- 데이터 스트림 모니터링
- 데이터 작업 모니터링
- 디지털 경험
- 소프트웨어 제공
- 보안
- AI Observability
- 로그 관리
- 관리
Google Cloud Dataflow는 스트림(실시간) 및 배치(기록) 모드에서 동일한 안정성과 표현 능력으로 데이터를 변환 및 보강할 수 있는 완전관리형 서비스입니다.
Datadog Google Cloud 통합을 사용하여 Google Cloud Dataflow에서 메트릭을 수집합니다.
아직 설치하지 않았다면 먼저 Google 클라우드 플랫폼 통합을 설정합니다. 그 외 다른 설치가 필요하지 않습니다.
Google Cloud Dataflow 로그는 Google Cloud Logging으로 수집하여 클라우드 Pub/Sub 토픽을 통해 데이터 플로우 작업으로 전송됩니다. 아직 설정하지 않았다면 Datadog 데이터 플로우 템플릿으로 로깅을 설정하세요.
해당 작업이 완료되면 Google Cloud Logging에서 Google Cloud Dataflow 로그를 다음 Pub/Sub 주제로 내보냅니다.
gcp.dataflow.job.active_worker_instances (gauge) | The active number of worker instances. |
gcp.dataflow.job.aggregated_worker_utilization (gauge) | Aggregated worker utilization (for example, worker CPU utilization) across the worker pool. Shown as percent |
gcp.dataflow.job.backlog_bytes (gauge) | Amount of known, unprocessed input for a stage, in bytes. Shown as byte |
gcp.dataflow.job.backlog_elements (gauge) | Amount of known, unprocessed input for a stage, in elements. |
gcp.dataflow.job.bigquery.write_count (count) | BigQuery write requests from BigQueryIO.Write in Dataflow jobs. |
gcp.dataflow.job.billable_shuffle_data_processed (gauge) | The billable bytes of shuffle data processed by this Dataflow job. Shown as byte |
gcp.dataflow.job.bundle_user_processing_latencies.avg (gauge) | The average bundle user processing latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.bundle_user_processing_latencies.samplecount (gauge) | The sample count for bundle user processing latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.bundle_user_processing_latencies.sumsqdev (gauge) | The sum of squared deviation for bundle user processing latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.current_num_vcpus (gauge) | The number of vCPUs currently being used by this Dataflow job. Shown as cpu |
gcp.dataflow.job.current_shuffle_slots (gauge) | The current shuffle slots used by this Dataflow job. |
gcp.dataflow.job.data_watermark_age (gauge) | The age (time since event timestamp) of the most recent item of data that has been fully processed by the pipeline. Shown as second |
gcp.dataflow.job.disk_space_capacity (gauge) | The amount of persistent disk currently being allocated to all workers associated with this Dataflow job. Shown as byte |
gcp.dataflow.job.dofn_latency_average (gauge) | The average processing time for a single message in a given DoFn (over the past 3 min window). Note that this includes time spent in GetData calls. Shown as millisecond |
gcp.dataflow.job.dofn_latency_max (gauge) | The maximum processing time for a single message in a given DoFn (over the past 3 min window). Note that this includes time spent in GetData calls. Shown as millisecond |
gcp.dataflow.job.dofn_latency_min (gauge) | The minimum processing time for a single message in a given DoFn (over the past 3 min window). Shown as millisecond |
gcp.dataflow.job.dofn_latency_num_messages (gauge) | The number of messages processed by a given DoFn (over the past 3 min window). |
gcp.dataflow.job.dofn_latency_total (gauge) | The total processing time for all messages in a given DoFn (over the past 3 min window). Note that this includes time spent in GetData calls. Shown as millisecond |
gcp.dataflow.job.duplicates_filtered_out_count (count) | The number of messages being processed by a particular stage that have been filtered out as duplicates. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.elapsed_time (gauge) | Duration that the current run of this pipeline has been in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts. Shown as second |
gcp.dataflow.job.element_count (count) | Number of elements added to the PCollection so far. Shown as item |
gcp.dataflow.job.elements_produced_count (count) | The number of elements produced by each PTransform. |
gcp.dataflow.job.estimated_backlog_processing_time (gauge) | Estimated time (in seconds) to consume current backlog if no new data comes in and throughput stays the same. Only available for Streaming Engine jobs. Shown as second |
gcp.dataflow.job.estimated_byte_count (count) | An estimated number of bytes added to the PCollection so far. Shown as byte |
gcp.dataflow.job.estimated_bytes_active (gauge) | Estimated number of bytes active in this stage of the job. Shown as byte |
gcp.dataflow.job.estimated_bytes_consumed_count (count) | Estimated number of bytes consumed by the stage of this job. Shown as byte |
gcp.dataflow.job.estimated_bytes_produced_count (count) | The estimated total byte size of elements produced by each PTransform. |
gcp.dataflow.job.estimated_timer_backlog_processing_time (gauge) | Estimated time (in seconds) for timers to complete. Only available for Streaming Engine jobs. Shown as second |
gcp.dataflow.job.gpu_memory_utilization (gauge) | Percent of time over the past sample period during which global (device) memory was being read or written. Shown as percent |
gcp.dataflow.job.gpu_utilization (gauge) | Percent of time over the past sample period during which one or more kernels was executing on the GPU. Shown as percent |
gcp.dataflow.job.horizontal_worker_scaling (gauge) | A boolean value that indicates what kind of horizontal scaling direction which the autoscaler recommended and rationale behind it. A true metric output means a scaling decision is made and a false metric output means the corresponding scaling is not taking effect. |
gcp.dataflow.job.is_failed (gauge) | Has this job failed. |
gcp.dataflow.job.max_worker_instances_limit (gauge) | The maximum number of workers autoscaling is allowed to request. |
gcp.dataflow.job.memory_capacity (gauge) | The amount of memory currently being allocated to all workers associated with this Dataflow job. Shown as byte |
gcp.dataflow.job.min_worker_instances_limit (gauge) | The minimum number of workers autoscaling is allowed to request. |
gcp.dataflow.job.oldest_active_message_age (gauge) | How long the oldest active message in a DoFn has been processing for. Shown as millisecond |
gcp.dataflow.job.per_stage_data_watermark_age (gauge) | The age (time since event timestamp) up to which all data has been processed by this stage of the pipeline. Shown as second |
gcp.dataflow.job.per_stage_system_lag (gauge) | The current maximum duration that an item of data has been processing or awaiting processing in seconds, per pipeline stage. Shown as second |
gcp.dataflow.job.processing_parallelism_keys (gauge) | Approximate number of keys in use for data processing for each stage. Processing for any given key is serialized, so the total number of keys for a stage represents the maximum available parallelism at that stage. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.pubsub.late_messages_count (count) | The number of messages from Pub/Sub with timestamp older than the estimated watermark. |
gcp.dataflow.job.pubsub.published_messages_count (count) | The number of Pub/Sub messages published broken down by topic and status. |
gcp.dataflow.job.pubsub.pulled_message_ages.avg (gauge) | The average the distribution of pulled but unacked Pub/Sub message ages. Shown as millisecond |
gcp.dataflow.job.pubsub.pulled_message_ages.samplecount (gauge) | The sample count for the distribution of pulled but unacked Pub/Sub message ages. Shown as millisecond |
gcp.dataflow.job.pubsub.pulled_message_ages.sumsqdev (gauge) | The sum of squared deviation for the distribution of pulled but unacked Pub/Sub message ages. Shown as millisecond |
gcp.dataflow.job.pubsub.read_count (count) | Pub/Sub Pull Requests. For Streaming Engine, this metric is deprecated. See the Using the Dataflow monitoring interface page for upcoming changes. |
gcp.dataflow.job.pubsub.read_latencies.avg (count) | The average Pub/Sub Pull request latencies from PubsubIO.Read in Dataflow jobs. For Streaming Engine, this metric is deprecated. See the Using the Dataflow monitoring interface page for upcoming changes. Shown as millisecond |
gcp.dataflow.job.pubsub.read_latencies.samplecount (count) | The sample count for Pub/Sub Pull request latencies from PubsubIO.Read in Dataflow jobs. For Streaming Engine, this metric is deprecated. See the Using the Dataflow monitoring interface page for upcoming changes. Shown as millisecond |
gcp.dataflow.job.pubsub.read_latencies.sumsqdev (count) | The sum of squared deviation for Pub/Sub Pull request latencies from PubsubIO.Read in Dataflow jobs. For Streaming Engine, this metric is deprecated. See the Using the Dataflow monitoring interface page for upcoming changes. Shown as millisecond |
gcp.dataflow.job.pubsub.streaming_pull_connection_status (gauge) | Percentage of all Streaming Pull connections that are either active (OK status) or terminated because of an error (non-OK status). When a connection is terminated, Dataflow will wait some time before attempting to re-connect. For Streaming Engine only. Shown as percent |
gcp.dataflow.job.pubsub.write_count (count) | Pub/Sub Publish requests from PubsubIO.Write in Dataflow jobs. |
gcp.dataflow.job.pubsub.write_latencies.avg (count) | The average Pub/Sub Publish request latencies from PubsubIO.Write in Dataflow jobs. Shown as millisecond |
gcp.dataflow.job.pubsub.write_latencies.samplecount (count) | The sample count for Pub/Sub Publish request latencies from PubsubIO.Write in Dataflow jobs. Shown as millisecond |
gcp.dataflow.job.pubsub.write_latencies.sumsqdev (count) | The sum of squared deviation for Pub/Sub Publish request latencies from PubsubIO.Write in Dataflow jobs. Shown as millisecond |
gcp.dataflow.job.status (gauge) | Current state of this pipeline (for example, RUNNING, DONE, CANCELLED, or FAILED). Not reported while the pipeline is not running. |
gcp.dataflow.job.streaming_engine.key_processing_availability (gauge) | Percentage of streaming processing keys that are assigned to workers and available to perform work. Work for unavailable keys will be deferred until keys are available. |
gcp.dataflow.job.streaming_engine.persistent_state.read_bytes_count (count) | Storage bytes read by a particular stage. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.streaming_engine.persistent_state.stored_bytes (gauge) | Current bytes stored in persistent state for the job. Shown as byte |
gcp.dataflow.job.streaming_engine.persistent_state.write_bytes_count (count) | Storage bytes written by a particular stage. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.streaming_engine.persistent_state.write_latencies.avg (count) | The average storage write latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.streaming_engine.persistent_state.write_latencies.samplecount (count) | The sample count for storage write latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.streaming_engine.persistent_state.write_latencies.sumsqdev (count) | The sum of squared deviation for storage write latencies from a particular stage. Available for jobs running on Streaming Engine. Shown as millisecond |
gcp.dataflow.job.streaming_engine.stage_end_to_end_latencies.avg (gauge) | The average distribution of time spent by streaming engine in each stage of the pipeline. This time includes shuffling messages, queueing them for processing, processing, queueing for persistent state write, and the write itself. Shown as millisecond |
gcp.dataflow.job.streaming_engine.stage_end_to_end_latencies.samplecount (gauge) | The sample count for distribution of time spent by streaming engine in each stage of the pipeline. This time includes shuffling messages, queueing them for processing, processing, queueing for persistent state write, and the write itself. Shown as millisecond |
gcp.dataflow.job.streaming_engine.stage_end_to_end_latencies.sumsqdev (gauge) | The sum of squared deviation for distribution of time spent by streaming engine in each stage of the pipeline. This time includes shuffling messages, queueing them for processing, processing, queueing for persistent state write, and the write itself. Shown as millisecond |
gcp.dataflow.job.system_lag (gauge) | The current maximum duration that an item of data has been awaiting processing, in seconds. Shown as second |
gcp.dataflow.job.target_worker_instances (gauge) | The desired number of worker instances. |
gcp.dataflow.job.thread_time (count) | Estimated time in milliseconds spent running in the function of the ptransform totaled across threads on all workers of the job. Shown as millisecond |
gcp.dataflow.job.timers_pending_count (count) | The number of timers pending in a particular stage. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.timers_processed_count (count) | The number of timers completed by a particular stage. Available for jobs running on Streaming Engine. |
gcp.dataflow.job.total_dcu_usage (count) | Cumulative amount of DCUs (Data Compute Unit) used by the Dataflow job since it was launched. |
gcp.dataflow.job.total_memory_usage_time (gauge) | The total GB seconds of memory allocated to this Dataflow job. Shown as gibibyte |
gcp.dataflow.job.total_pd_usage_time (gauge) | The total GB seconds for all persistent disk used by all workers associated with this Dataflow job. Shown as gibibyte |
gcp.dataflow.job.total_secu_usage (gauge) | The total amount of SECUs (Streaming Engine Compute Unit) used by the Dataflow job since it was launched. |
gcp.dataflow.job.total_shuffle_data_processed (gauge) | The total bytes of shuffle data processed by this Dataflow job. Shown as byte |
gcp.dataflow.job.total_streaming_data_processed (gauge) | The total bytes of streaming data processed by this Dataflow job. Shown as byte |
gcp.dataflow.job.total_vcpu_time (gauge) | The total vCPU seconds used by this Dataflow job. |
gcp.dataflow.job.user_counter (gauge) | A user-defined counter metric. |
gcp.dataflow.job.worker_utilization_hint (gauge) | Worker utilization hint for autoscaling. This hint value is configured by the customers and defines a target worker CPU utilization range thus influencing scaling aggressiveness. Shown as percent |
gcp.dataflow.job.worker_utilization_hint_is_actively_used (gauge) | Reports whether or not the worker utilization hint is actively used by the horizontal autoscaling policy. |
gcp.dataflow.quota.region_endpoint_shuffle_slot.exceeded (count) | Number of attempts to exceed the limit on quota metric dataflow.googleapis.com/region_endpoint_shuffle_slot . |
gcp.dataflow.quota.region_endpoint_shuffle_slot.limit (gauge) | Current limit on quota metric dataflow.googleapis.com/region_endpoint_shuffle_slot . |
gcp.dataflow.quota.region_endpoint_shuffle_slot.usage (gauge) | Current usage on quota metric dataflow.googleapis.com/region_endpoint_shuffle_slot . |
gcp.dataflow.worker.memory.bytes_used (gauge) | The memory usage in bytes by a particular container instance on a Dataflow worker. Shown as byte |
gcp.dataflow.worker.memory.container_limit (gauge) | Maximum RAM size in bytes available to a particular container instance on a Dataflow worker. Shown as byte |
gcp.dataflow.worker.memory.total_limit (gauge) | RAM size in bytes available to a Dataflow worker. Shown as byte |
Google Cloud Dataflow 통합은 이벤트를 포함하지 않습니다.
Google Cloud Dataflow 통합은 서비스 점검을 포함하지 않습니다.
도움이 필요하신가요? Datadog 지원팀에 문의하세요.
추가 유용한 문서, 링크 및 기사: