flume

Supported OS: Linux, Mac OS, Windows

Integration version: 0.0.1

Overview

This check monitors Apache Flume.

Setup

The Flume check is not included in the Datadog Agent package, so you need to install it separately.

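Installation

For Agent v7.21+ / v6.21+, follow the instructions below to install the Flume check on your host. To install it with the Docker Agent or an earlier version of the Agent, see Use Community Integrations.

  1. Run the following command to install the Agent integration: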

    datadog-agent integration install -t datadog-flume==<INTEGRATION_VERSION>
    
  2. Configure the integration the same way as a core integration.

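Configuration

  1. Configure Flume to enable JMX by adding the following JVM arguments to flume-env.sh:

    export JAVA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=5445 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"

  2. Edit the flume.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting Flume performance data. See the sample flume.d/conf.yaml for all available configuration options.

    This check has a limit of 350 metrics per instance. The number of returned metrics is shown in the status output. You can specify the metrics you are interested in by editing this configuration file. For details on customizing the metrics to collect, see the JMX Checks documentation. If you need to monitor more metrics, contact Datadog support.

  3. Restart the Agent.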
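
As a rough illustration of the configuration file step, the sketch below writes a minimal flume.d/conf.yaml. It assumes the generic layout used by Datadog JMX checks (is_jmx, collect_default_metrics, host, port), a Flume agent on the local host, and port 5445 as set in the JAVA_OPTS line. The helper name write_flume_conf is hypothetical and not part of the Agent. The sample flume.d/conf.yaml remains the authoritative list of options.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Agent): write a minimal flume.d/conf.yaml.
#   $1: path to the Agent's conf.d directory
#   $2: JMX port (defaults to 5445, the port set in JAVA_OPTS)
write_flume_conf() {
  local confd_dir="$1"
  local jmx_port="${2:-5445}"

  mkdir -p "${confd_dir}/flume.d"

  # Assumed keys: the generic layout used by Datadog JMX checks.
  cat > "${confd_dir}/flume.d/conf.yaml" <<EOF
init_config:
  is_jmx: true
  collect_default_metrics: true

instances:
  - host: localhost
    port: ${jmx_port}
EOF
}
```

For example, on a Linux host where conf.d lives at /etc/datadog-agent/conf.d (an assumption; the location differs by platform), calling write_flume_conf /etc/datadog-agent/conf.d with sufficient privileges would create the file. The Agent must still be restarted afterward.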

Validation

Run the Agent's status subcommand and look for flume under the Checks section.
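
One way to script this check is to filter the status output for the check name. The sketch below assumes the Agent CLI is available on the PATH as datadog-agent (as in the install command) and that the check appears by name in the status text. The helper name flume_in_status is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Agent status text on stdin and succeed
# (exit code 0) if any line mentions the flume check as a whole word.
flume_in_status() {
  grep -qiw "flume"
}

# Usage (needs a running Agent, typically with elevated privileges):
#   datadog-agent status | flume_in_status && echo "flume check found"
```

Reading from stdin keeps the helper independent of how the status text is produced, so it can also be run against a saved copy of the output.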

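Component metrics

The metrics retrieved by this check depend on the sources, channels, and sinks used by your Flume agent. For the full list of metrics exposed by each component, see Available Component Metrics in the Apache Flume documentation. For the list of metrics available in Datadog, see the Metrics section on this page.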

Data Collected

Metrics

flume.channel.capacity
(gauge)
The maximum number of events that can be queued in the channel at any time. For channel types without a capacity limit the value will be zero.
Shown as event
flume.channel.fill_percentage
(gauge)
The channel fill percentage.
Shown as percent
flume.channel.size
(gauge)
The number of events currently queued in the channel.
Shown as event
flume.channel.event_put_attempt_count
(count)
The total number of events that have been attempted to be put into the channel.
Shown as event
flume.channel.event_put_success_count
(count)
The total number of events that have successfully been put into the channel.
Shown as event
flume.channel.event_take_attempt_count
(count)
The total number of attempts that have been made to take an event from the channel.
Shown as event
flume.channel.event_take_success_count
(count)
The total number of events that have successfully been taken from the channel.
Shown as event
flume.channel.kafka_commit_timer
(gauge)
The timer for the Kafka channel commits.
Shown as time
flume.channel.kafka_event_get_timer
(gauge)
The timer for the Kafka channel retrieving events.
Shown as time
flume.channel.kafka_event_send_timer
(gauge)
The timer for the Kafka channel sending events.
Shown as time
flume.channel.rollbackcount
(count)
The count of rollbacks from the Kafka channel.
Shown as event
flume.sink.event_write_fail
(count)
The total number of failed write events.
Shown as event
flume.sink.batch_empty_count
(count)
The number of append batches attempted containing zero events.
Shown as event
flume.sink.channel_read_fail
(count)
The number of failed read events from the channel.
Shown as event
flume.sink.batch_complete_count
(count)
The number of append batches attempted containing the maximum number of events supported by the next hop.
Shown as event
flume.sink.batch_underflow_count
(count)
The number of append batches attempted containing fewer than the maximum number of events supported by the next hop.
Shown as event
flume.sink.connection_closed_count
(count)
The number of connections closed by this sink.
Shown as connection
flume.sink.connection_failed_count
(count)
The number of failed connections.
Shown as connection
flume.sink.connection_created_count
(count)
The number of connections created by this sink. Only applicable to some sink types.
Shown as connection
flume.sink.event_drain_attempt_count
(count)
The total number of events that have been attempted to be drained to the next hop.
Shown as event
flume.sink.event_drain_success_count
(count)
The total number of events that have successfully been drained to the next hop.
Shown as event
flume.sink.kafka_event_sent_timer
(gauge)
The timer for the Kafka sink sending events.
Shown as time
flume.sink.rollbackcount
(gauge)
The count of rollbacks from the Kafka sink.
Shown as event
flume.source.event_read_fail
(count)
The total number of failed read source events.
Shown as event
flume.source.channel_write_fail
(count)
The total number of failed channel write events.
Shown as event
flume.source.event_accepted_count
(count)
The total number of events successfully accepted, either through append batches or single-event appends.
Shown as event
flume.source.event_received_count
(count)
The total number of events received, either through append batches or single-event appends.
Shown as event
flume.source.append_accepted_count
(count)
The total number of single-event appends successfully accepted.
Shown as event
flume.source.append_received_count
(count)
The total number of single-event appends received.
Shown as event
flume.source.open_connection_count
(count)
The number of open connections.
Shown as connection
flume.source.generic_processing_fail
(count)
The total number of generic processing failures.
Shown as event
flume.source.append_batch_accepted_count
(count)
The total number of append batches accepted successfully.
Shown as event
flume.source.append_batch_received_count
(count)
The total number of append batches received.
Shown as event
flume.source.kafka_commit_timer
(gauge)
The timer for the Kafka source committing events.
Shown as time
flume.source.kafka_empty_count
(count)
The count of empty events from the Kafka source.
Shown as event
flume.source.kafka_event_get_timer
(gauge)
The timer for the Kafka source retrieving events.
Shown as time
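
To make the relationship between the channel gauges concrete, here is a small worked calculation. It assumes flume.channel.fill_percentage corresponds to flume.channel.size divided by flume.channel.capacity, times 100. It treats a capacity of zero (a channel without a limit, per the flume.channel.capacity description) as not applicable. The function name and the sample values are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a fill percentage from two gauge values.
#   $1: flume.channel.size      (events currently queued)
#   $2: flume.channel.capacity  (maximum events; 0 means no limit)
channel_fill_percentage() {
  local size="$1"
  local capacity="$2"

  if [ "$capacity" -eq 0 ]; then
    # No capacity limit reported, so a percentage is not meaningful.
    echo "n/a"
    return 0
  fi

  # awk handles the floating-point division; print two decimals.
  awk -v s="$size" -v c="$capacity" 'BEGIN { printf "%.2f\n", (s / c) * 100 }'
}

channel_fill_percentage 250 1000   # prints 25.00
channel_fill_percentage 10 0       # prints n/a
```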

Events

Flume does not include any events.

Service Checks

flume.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Flume instance. Returns OK otherwise.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.