Overview

OpenTelemetry Apache Spark metrics in a Spark dashboard

The Apache Spark receiver allows you to collect Apache Spark metrics and access the Spark Overview dashboard. Configure the receiver according to the specifications of the latest version of the apachesparkreceiver.

For more information, see the OpenTelemetry project documentation for the Apache Spark receiver.

Setup

To collect Apache Spark metrics with OpenTelemetry for use with Datadog:

  1. Configure the Apache Spark receiver in your OpenTelemetry Collector configuration.
  2. Make sure the OpenTelemetry Collector is configured to export to Datadog.

See the Apache Spark receiver documentation for detailed information on configuration options and requirements.
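The two steps above can be sketched as a minimal Collector configuration. This is an illustrative fragment, not a complete reference: the Spark UI endpoint, the collection interval, and the `DD_API_KEY` environment variable are assumptions you should adapt to your environment.

```yaml
# Minimal sketch: apachespark receiver -> datadog exporter.
# Assumes the Spark UI is reachable at localhost:4040 and that
# DD_API_KEY holds your Datadog API key.
receivers:
  apachespark:
    endpoint: http://localhost:4040
    collection_interval: 60s

exporters:
  datadog:
    api:
      key: ${env:DD_API_KEY}

service:
  pipelines:
    metrics:
      receivers: [apachespark]
      exporters: [datadog]
```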

Data Collected

| OTel | Datadog | Description | Filter | Transform |
| --- | --- | --- | --- | --- |
| spark.driver.block_manager.disk.usage | spark.driver.disk_used | Disk space used by the BlockManager. | | × 1048576 |
| spark.driver.block_manager.memory.usage | spark.driver.memory_used | Memory usage for the driver's BlockManager. | | × 1048576 |
| spark.driver.dag_scheduler.stage.count | spark.stage.count | Number of stages the DAGScheduler is either running or needs to run. | | |
| spark.executor.disk.usage | spark.executor.disk_used | Disk space used by this executor for RDD storage. | | |
| spark.executor.disk.usage | spark.rdd.disk_used | Disk space used by this executor for RDD storage. | | |
| spark.executor.memory.usage | spark.executor.memory_used | Storage memory used by this executor. | | |
| spark.executor.memory.usage | spark.rdd.memory_used | Storage memory used by this executor. | | |
| spark.job.stage.active | spark.job.num_active_stages | Number of active stages in this job. | | |
| spark.job.stage.result | spark.job.num_completed_stages | Number of stages with a specific result in this job. | job_result: completed | |
| spark.job.stage.result | spark.job.num_failed_stages | Number of stages with a specific result in this job. | job_result: failed | |
| spark.job.stage.result | spark.job.num_skipped_stages | Number of stages with a specific result in this job. | job_result: skipped | |
| spark.job.task.active | spark.job.num_tasks{status: running} | Number of active tasks in this job. | | |
| spark.job.task.result | spark.job.num_skipped_tasks | Number of tasks with a specific result in this job. | job_result: skipped | |
| spark.job.task.result | spark.job.num_failed_tasks | Number of tasks with a specific result in this job. | job_result: failed | |
| spark.job.task.result | spark.job.num_completed_tasks | Number of tasks with a specific result in this job. | job_result: completed | |
| spark.stage.io.records | spark.stage.input_records | Number of records written and read in this stage. | direction: in | |
| spark.stage.io.records | spark.stage.output_records | Number of records written and read in this stage. | direction: out | |
| spark.stage.io.size | spark.stage.input_bytes | Amount of data written and read at this stage. | direction: in | |
| spark.stage.io.size | spark.stage.output_bytes | Amount of data written and read at this stage. | direction: out | |
| spark.stage.shuffle.io.read.size | spark.stage.shuffle_read_bytes | Amount of data read in shuffle operations in this stage. | | |
| spark.stage.shuffle.io.records | spark.stage.shuffle_read_records | Number of records written or read in shuffle operations in this stage. | direction: in | |
| spark.stage.shuffle.io.records | spark.stage.shuffle_write_records | Number of records written or read in shuffle operations in this stage. | direction: out | |

See OpenTelemetry Metrics Mapping for more information.

Further reading

More helpful links, articles, and documentation: