Presto

Supported OS: Mac OS, Windows

Integration version 3.1.0

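Overview

This check collects the following Presto metrics:

  • Overall activity metrics: completed/failed queries, data input/output size, execution time.
  • Performance metrics: cluster memory, input CPU, execution CPU time.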

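Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.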

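Installation

The Presto check is included in the Datadog Agent package. No additional installation is needed on your server. Install the Agent on each Coordinator and Worker node from which you want to collect usage and performance metrics.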

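Configuration

  1. Edit the presto.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your Presto performance data. See the sample presto.d/conf.yaml for all available configuration options.

    This check has a limit of 350 metrics per instance. The number of returned metrics is shown on the status page. You can specify the metrics you are interested in by editing the configuration. To customize which metrics are collected, see the JMX Checks documentation for detailed instructions. If you need to monitor more metrics, contact Datadog support.

  2. Restart the Agent.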
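
As a rough sketch of what such customization might look like, the fragment below turns off the shipped metric list and keeps a single attribute through a JMX include filter, which is one way to stay under the 350-metric limit. The host, port, MBean domain, bean name, and attribute name are assumptions for illustration only; check them against the MBeans your Presto version actually exposes, and treat the sample presto.d/conf.yaml and the JMX Checks documentation as the authoritative references.

    init_config:
      is_jmx: true
      collect_default_metrics: false   # assumed: skip the shipped list, rely on the filter below
      conf:
        - include:
            domain: com.facebook.presto.execution                   # assumed MBean domain
            bean: com.facebook.presto.execution:name=QueryManager   # assumed bean name
            attribute:
              RunningQueries:                                       # assumed attribute name
                alias: presto.execution.running_queries
                metric_type: gauge

    instances:
      - host: localhost   # assumed: Presto node reachable from the Agent
        port: 9999        # assumed: JMX port exposed by that node

After restarting the Agent, the status page should report a correspondingly smaller metric count for this instance.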

Metric collection

Use the default configuration of your presto.d/conf.yaml file to activate the collection of your Presto metrics. See the sample presto.d/conf.yaml for all available configuration options.
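
For orientation, a default configuration of this kind typically amounts to little more than a JMX endpoint. The following is a minimal sketch, assuming Presto exposes JMX on localhost port 9999; both values are placeholders rather than values taken from this page, so adjust them to match your node.

    init_config:
      is_jmx: true                    # metrics are read over JMX
      collect_default_metrics: true   # use the metric list shipped with the integration

    instances:
      - host: localhost   # assumed: Presto node reachable from the Agent
        port: 9999        # assumed: JMX port exposed by that node

One instance entry is needed per Presto JVM that the Agent on that host should poll.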

Log collection

Available for Agent versions 6.0 and later

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Add this configuration block to your presto.d/conf.yaml file to start collecting your Presto logs:

    logs:
      - type: file
        path: /var/log/presto/*.log
        source: presto
        service: "<SERVICE_NAME>"
    

    Change the path and service parameter values to match your environment. See the sample presto.d/conf.yaml for all available configuration options.

  3. Restart the Agent.
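
To illustrate the path and service substitution described in step 2, here is a sketch with the placeholders filled in. The log directory and the service name are hypothetical values chosen for the example, not defaults documented on this page.

    logs:
      - type: file
        path: /data/presto/var/log/*.log   # hypothetical: wherever this node writes its logs
        source: presto                     # unchanged from the block in step 2
        service: presto-coordinator        # hypothetical service name

Using a distinct service value per role (for example, one for Coordinators and another for Workers) is one way to tell their logs apart later, though the naming scheme is entirely up to you.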

Validation

Run the Agent's status subcommand and look for presto under the Checks section.

Data Collected

Metrics

presto.execution.abandoned_queries.one_minute.count
(gauge)
Abandoned queries - one minute count.
Shown as query
presto.execution.abandoned_queries.one_minute.rate
(gauge)
Abandoned queries - one minute rate.
Shown as query
presto.execution.abandoned_queries.total_count
(gauge)
Abandoned queries - total count.
Shown as query
presto.execution.canceled_queries.one_minute.count
(gauge)
Canceled queries - one minute count.
Shown as query
presto.execution.canceled_queries.one_minute.rate
(gauge)
Canceled queries - one minute queries per second.
Shown as query
presto.execution.canceled_queries.total_count
(gauge)
Canceled queries - total count.
Shown as query
presto.execution.completed_queries.one_minute.count
(gauge)
Completed queries - one minute count.
Shown as query
presto.execution.completed_queries.one_minute.rate
(gauge)
Completed queries - one minute queries per second.
Shown as query
presto.execution.completed_queries.total_count
(gauge)
Completed queries - total count.
Shown as query
presto.execution.consumed_cpu_time_secs.one_minute.count
(gauge)
CPU (processing) time consumed - one minute count (seconds).
Shown as second
presto.execution.consumed_cpu_time_secs.one_minute.rate
(gauge)
CPU (processing) time consumed - one minute rate.
Shown as second
presto.execution.consumed_cpu_time_secs.total_count
(gauge)
CPU (processing) time consumed - total count (seconds).
Shown as second
presto.execution.cpu_input_byte_rate.all_time.avg
(gauge)
Distribution of query input data rates (cpu) - all time average bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.all_time.p75
(gauge)
Distribution of query input data rates (cpu) - all time bytes per second - p75.
Shown as byte
presto.execution.cpu_input_byte_rate.all_time.p95
(gauge)
Distribution of query input data rates (cpu) - all time bytes per second - p95.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.avg
(gauge)
Distribution of query input data rates (cpu) - one minute average bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.count
(gauge)
Distribution of query input data rates (cpu) - one minute count.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.max
(gauge)
Distribution of query input data rates (cpu) - one minute max bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.min
(gauge)
Distribution of query input data rates (cpu) - one minute min bytes per second.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.p75
(gauge)
Distribution of query input data rates (cpu) - one minute bytes per second - p75.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.p95
(gauge)
Distribution of query input data rates (cpu) - one minute bytes per second - p95.
Shown as byte
presto.execution.cpu_input_byte_rate.one_minute.total
(gauge)
Distribution of query input data rates (cpu) - one minute total bytes per second.
Shown as byte
presto.execution.execution_time.all_time.avg
(gauge)
Query execution time (millisecond) - all time average.
Shown as millisecond
presto.execution.execution_time.all_time.count
(gauge)
Query execution time (millisecond) - all time count.
Shown as millisecond
presto.execution.execution_time.all_time.max
(gauge)
Query execution time (millisecond) - all time max.
Shown as millisecond
presto.execution.execution_time.all_time.min
(gauge)
Query execution time (millisecond) - all time min.
Shown as millisecond
presto.execution.execution_time.all_time.p75
(gauge)
Query execution time (millisecond) - all time - p75.
Shown as millisecond
presto.execution.execution_time.all_time.p95
(gauge)
Query execution time (millisecond) - all time - p95.
Shown as millisecond
presto.execution.execution_time.one_minute.avg
(gauge)
Query execution time (millisecond) - one minute average.
Shown as millisecond
presto.execution.execution_time.one_minute.max
(gauge)
Query execution time (millisecond) - one minute max.
Shown as millisecond
presto.execution.execution_time.one_minute.min
(gauge)
Query execution time (millisecond) - one minute min.
Shown as millisecond
presto.execution.execution_time.one_minute.p75
(gauge)
Query execution time (millisecond) - one minute p75.
Shown as millisecond
presto.execution.execution_time.one_minute.p95
(gauge)
Query execution time (millisecond) - one minute p95.
Shown as millisecond
presto.execution.executor.active_count
(gauge)
presto.execution.executor.blocked_splits
(gauge)
Blocked splits count.
Shown as split
presto.execution.executor.completed_task_count
(gauge)

Shown as task
presto.execution.executor.core_pool_size
(gauge)
presto.execution.executor.pool_size
(gauge)
presto.execution.executor.processor_executor.queued_task_count
(gauge)
Queued task count.
Shown as task
presto.execution.executor.queued_task_count
(gauge)
presto.execution.executor.running_splits
(gauge)
Running splits count.
Shown as split
presto.execution.executor.task_count
(gauge)

Shown as task
presto.execution.executor.total_splits
(gauge)
Total splits count.
Shown as split
presto.execution.executor.waiting_splits
(gauge)
Waiting splits count.
Shown as split
presto.execution.external_failures.one_minute.count
(gauge)
Failed queries (external) - one minute count.
Shown as query
presto.execution.external_failures.one_minute.rate
(gauge)
Failed queries (external) - one minute failures per second.
Shown as query
presto.execution.external_failures.total_count
(gauge)
Failed queries (external) - total count.
Shown as query
presto.execution.failed_queries.one_minute.count
(gauge)
Failed queries - one minute count.
Shown as query
presto.execution.failed_queries.one_minute.rate
(gauge)
Failed queries - one minute queries per second.
Shown as query
presto.execution.failed_queries.total_count
(gauge)
Failed queries - total count.
Shown as query
presto.execution.input_data_size.one_minute.count
(gauge)
Input data (bytes) - one minute count.
Shown as byte
presto.execution.input_data_size.one_minute.rate
(gauge)
Input data (bytes) - one minute bytes per second.
Shown as byte
presto.execution.input_data_size.total_count
(gauge)
Input data (bytes) - total count.
Shown as byte
presto.execution.input_positions.one_minute.count
(gauge)
Input positions (rows) - one minute count.
Shown as row
presto.execution.input_positions.one_minute.rate
(gauge)
Input positions (rows) - one minute rows per second.
Shown as row
presto.execution.input_positions.total_count
(gauge)
Input positions (rows) - total count.
Shown as row
presto.execution.insufficient_resources_failures.one_minute.count
(gauge)
Insufficient resources failures one minute count.
presto.execution.insufficient_resources_failures.one_minute.rate
(gauge)
Insufficient resources failures one minute failures per second.
presto.execution.insufficient_resources_failures.total_count
(gauge)
Insufficient resources failures total count.
presto.execution.internal_failures.one_minute.count
(gauge)
Failed queries (internal) - one minute count.
Shown as query
presto.execution.internal_failures.one_minute.rate
(gauge)
Failed queries (internal) - one minute queries per second.
Shown as query
presto.execution.internal_failures.total_count
(gauge)
Failed queries (internal) - total count.
Shown as query
presto.execution.management_executor.active_count
(gauge)
presto.execution.management_executor.completed_task_count
(gauge)

Shown as task
presto.execution.management_executor.queued_task_count
(gauge)

Shown as task
presto.execution.output_data_size.one_minute.count
(gauge)
Output data (bytes) - one minute count.
Shown as byte
presto.execution.output_data_size.one_minute.rate
(gauge)
Output data (bytes) - one minute bytes per second.
Shown as byte
presto.execution.output_data_size.total_count
(gauge)
Output data (bytes) - total count.
Shown as byte
presto.execution.output_positions.one_minute.count
(gauge)
Output positions (rows) - one minute count.
Shown as row
presto.execution.output_positions.one_minute.rate
(gauge)
Output positions (rows) - one minute rows per second.
Shown as row
presto.execution.output_positions.total_count
(gauge)
Output positions (rows) - total count.
Shown as row
presto.execution.running_queries
(gauge)
Active queries.
Shown as query
presto.execution.started_queries.one_minute.count
(gauge)
Queries started - one minute count.
Shown as query
presto.execution.started_queries.one_minute.rate
(gauge)
Queries started - one minute queries per second.
Shown as query
presto.execution.started_queries.total_count
(gauge)
Queries started - total count.
Shown as query
presto.execution.task_notification_executor.active_count
(gauge)
presto.execution.task_notification_executor.completed_task_count
(gauge)

Shown as task
presto.execution.task_notification_executor.pool_size
(gauge)
presto.execution.task_notification_executor.queued_task_count
(gauge)

Shown as task
presto.execution.user_error_failures.one_minute.count
(gauge)
Failed queries (user error) - one minute count.
Shown as query
presto.execution.user_error_failures.one_minute.rate
(gauge)
Failed queries (user error) - one minute queries per second.
Shown as query
presto.execution.user_error_failures.total_count
(gauge)
Failed queries (user error) - total count.
Shown as query
presto.execution.wall_input_bytes_rate.one_minute.avg
(gauge)
Input data rate (bytes) - one minute average.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.max
(gauge)
Input data rate (bytes) - one minute max.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.min
(gauge)
Input data rate (bytes) - one minute min.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.p75
(gauge)
Input data rate (bytes) - one minute p75.
Shown as byte
presto.execution.wall_input_bytes_rate.one_minute.p95
(gauge)
Input data rate (bytes) - one minute p95.
Shown as byte
presto.failure_detector.active_count
(gauge)
Active node count.
Shown as node
presto.memory.assigned_queries
(gauge)
Memory (assigned queries).
Shown as byte
presto.memory.blocked_nodes
(gauge)
Memory (blocked nodes).
Shown as byte
presto.memory.cluster_memory_bytes
(gauge)
Cluster memory (bytes).
Shown as byte
presto.memory.free_bytes
(gauge)
Memory (free bytes).
Shown as byte
presto.memory.free_distributed_bytes
(gauge)
Memory (free distributed bytes).
Shown as byte
presto.memory.max_bytes
(gauge)
Memory (max bytes).
Shown as byte
presto.memory.nodes
(gauge)
Memory (nodes).
Shown as byte
presto.memory.reserved_bytes
(gauge)
Memory (reserved bytes).
Shown as byte
presto.memory.reserved_distributed_bytes
(gauge)
Memory (reserved distributed bytes).
Shown as byte
presto.memory.reserved_revocable_bytes
(gauge)
Memory (reserved revocable bytes).
Shown as byte
presto.memory.reserved_revocable_distributed_bytes
(gauge)
Memory (reserved revocable distributed bytes).
Shown as byte
presto.memory.total_distributed_bytes
(gauge)
Memory (total distributed bytes).
Shown as byte

Events

Presto does not include any events.

Service Checks

presto.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Presto instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning

Troubleshooting

Need help? Contact Datadog support.