Supported OS

Versión de la integración4.0.0

Información general

Este check monitoriza Hudi. Es compatible con versiones de Hudi 0.10.0 y posteriores.

Configuración

Instalación

El check de check está incluido en el paquete del Datadog Agent. No es necesaria ninguna instalación adicional en tu servidor.

Configuración

  1. Configura el Reportador de métricas JMX en Hudi:

    hoodie.metrics.on=true
    hoodie.metrics.reporter.type=JMX
    hoodie.metrics.jmx.host=<JMX_HOST>
    hoodie.metrics.jmx.port=<JMX_PORT>
    
  2. Edita el archivo hudi.d/conf.yaml, que se encuentra en la carpeta conf.d/ en la raíz del directorio de configuración del Agent para empezar a recopilar tus datos de rendimiento de Hudi. Para conocer todas las opciones de configuración disponibles, consulta el hudi.d/conf.yaml de ejemplo.

    Este check tiene un límite de 350 métricas por instancia. El número de métricas devueltas se indica al ejecutar el comando de estado del Datadog Agent. Puedes especificar las métricas que te interesan editando la configuración.. Para saber cómo personalizar las métricas que se van a recopilar,, consulta la documentación de checks de JMX para obtener instrucciones más detalladas. Si necesitas monitorizar más métricas, ponte en contacto con el servicio de asistencia de Datadog.

  3. Reinicia el Agent.

Validación

Ejecuta el subcomando status del Agent y busca hudi en la sección Checks.

Datos recopilados

Métricas

hudi.action.bytes_written
(rate)
The total amount of bytes written in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as byte
hudi.action.commit_time
(gauge)
The commit time of an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as millisecond
hudi.action.compacted_records_updated
(rate)
The amount of compacted records updated in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as record
hudi.action.create_time
(rate)
The creation time of an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as millisecond
hudi.action.duration
(gauge)
The amount of time it took to successfully perform an action on a batch of records (commit, deltacommit, replacecommit, compaction, etc)
Shown as millisecond
hudi.action.files_inserted
(rate)
The amount of files inserted (commit, deltacommit, replacecommit, compaction, etc)
Shown as file
hudi.action.files_updated
(rate)
The amount of files updated (commit, deltacommit, replacecommit, compaction, etc)
Shown as file
hudi.action.insert_records_written
(rate)
The number of insert records written in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as record
hudi.action.log_files_compacted
(rate)
The number of log files compacted in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as file
hudi.action.log_files_size
(rate)
The size of all the log files in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as byte
hudi.action.partitions_written
(rate)
The number of partitions written in an action (commit, deltacommit, replacecommit, compaction, etc)
hudi.action.records_written
(rate)
The number of records written in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as record
hudi.action.scan_time
(rate)
The total time spent scanned in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as millisecond
hudi.action.time.50th_percentile
(gauge)
Measures 50th percentile of time to complete the action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.75th_percentile
(gauge)
Measures 75th percentile of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.95th_percentile
(gauge)
Measures 95th percentile of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.98th_percentile
(gauge)
Measures 98th percentile of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.999th_percentile
(gauge)
Measures 999th percentile of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.99th_percentile
(gauge)
Measures 99th percentile of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.count
(rate)
Measures count of times to complete an action (commit, deltacommit, replacecommit, compaction, etc)
hudi.action.time.max
(gauge)
Measures maximum amount of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.mean
(gauge)
Measures mean amount of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.min
(gauge)
Measures minimum amount of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.time.std_dev
(gauge)
Measures standard deviation of time to complete an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as nanosecond
hudi.action.update_records_written
(rate)
The amount of update records written in an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as record
hudi.action.upsert_time
(rate)
The upsert time of an action (commit, deltacommit, replacecommit, compaction, etc)
Shown as millisecond
hudi.clean.duration
(gauge)
The total time spent cleaning
Shown as millisecond
hudi.clean.files_deleted
(gauge)
The number of files deleted in cleans
Shown as file
hudi.finalize.duration
(gauge)
The total time spent finalizing
Shown as millisecond
hudi.finalize.files_finalized
(gauge)
The number of files finalized"
Shown as file
hudi.index.command.duration
(gauge)
The time spent performing an index command (UPSERT, INSERT_OVERWRITE, etc)
Shown as millisecond
hudi.rollback.duration
(gauge)
The total time spent in rollback
Shown as millisecond
hudi.rollback.files_deleted
(gauge)
The number of files deleted in rollback
Shown as file

Recopilación de logs

Disponible para la versión 6.0 o posteriores del Agent

  1. Hudi utiliza el generador de logs log4j por defecto. Para personalizar el formato, edita el archivo log4j.properties en tu directorio conf de Flink o Spark. Un ejemplo de archivo log4j.properties es:

     log4j.rootCategory=INFO, file
     log4j.appender.file=org.apache.log4j.FileAppender
     log4j.appender.file.File=/var/log/hudi.log
     log4j.appender.file.append=false
     log4j.appender.file.layout=org.apache.log4j.PatternLayout
     log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    
  2. Por defecto, el pipeline de integración de Datadog admite el siguiente patrón de conversión:

    %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    

    Un ejemplo de marca de tiempo válida es: 2020-02-03 18:43:12,251.

    Clona y edita el pipeline de la integración si tienes un formato diferente.

  3. La recopilación de logs se encuentra deshabilitada de manera predeterminada en el Datadog Agent. Habilítala en tu archivo datadog.yaml:

    logs_enabled: true
    
  4. Descomenta y edita el bloque de configuración de logs en tu archivo hudi.d/conf.yaml. Cambia los valores de los parámetros path y service en función de tu entorno. Consulta el hudi.d/conf.yaml de ejemplo para conocer todas las opciones de configuración disponibles.

    logs:
      - type: file
        path: /var/log/hudi.log
        source: hudi
        log_processing_rules:
          - type: multi_line
            pattern: \d{4}\-(0?[1-9]|1[012])\-(0?[1-9]|[12][0-9]|3[01])
            name: new_log_start_with_date
    

Eventos

La integración Hudi no incluye eventos.

Checks de servicio

hudi.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Hudi instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning

Solucionar problemas

¿Necesitas ayuda? Consulta el servicio de asistencia de Datadog.