Google Cloud Machine Learning is a managed service that lets you easily build machine learning models that work with any type of data, of any size.
Get metrics from Google Cloud Machine Learning to:
- Visualize the performance of your ML services.
- Correlate the performance of your ML services with your applications.
Setup
Installation
If you haven't already, set up the Google Cloud Platform integration. No other installation steps are required.
Log collection
Google Cloud Machine Learning logs are collected with Google Cloud Logging and sent to a Dataflow job through a Cloud Pub/Sub topic. If you haven't already, set up logging with the Datadog Dataflow template.
Once this is done, export your Google Cloud Machine Learning logs from Google Cloud Logging to the Pub/Sub topic:
- Go to the Google Cloud Logging page and filter the Google Cloud Machine Learning logs.
- Click Create Export and name the sink.
- Choose "Cloud Pub/Sub" as the destination and select the Pub/Sub topic that was created for that purpose. Note: The Pub/Sub topic can be located in a different project.
- Click Create and wait for the confirmation message to appear.
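The export steps above amount to defining a log sink with a name, a filter, and a Pub/Sub destination. The following is a minimal sketch of that configuration, not the integration's own tooling. The project, topic, and sink names are hypothetical placeholders, and the filter's resource types (`ml_job`, `cloudml_model_version`) are an assumption that should be checked against the logs that actually appear in your project. The destination string follows the Cloud Logging format for Pub/Sub sinks, which also allows the topic to live in a different project.

```python
import json


def build_sink_config(sink_name, topic_project, topic_id, log_filter):
    """Return a dict shaped like a Cloud Logging sink definition.

    The destination uses the documented Pub/Sub sink format:
    pubsub.googleapis.com/projects/<project>/topics/<topic>
    """
    destination = (
        f"pubsub.googleapis.com/projects/{topic_project}/topics/{topic_id}"
    )
    return {
        "name": sink_name,
        "destination": destination,
        "filter": log_filter,
    }


# Assumption: these resource types select Cloud Machine Learning logs.
# Verify them in the Cloud Logging page before relying on this filter.
ML_LOG_FILTER = 'resource.type="ml_job" OR resource.type="cloudml_model_version"'

if __name__ == "__main__":
    # Hypothetical names, used for illustration only.
    config = build_sink_config(
        sink_name="datadog-ml-logs",
        topic_project="my-logging-project",
        topic_id="export-logs-to-datadog",
        log_filter=ML_LOG_FILTER,
    )
    print(json.dumps(config, indent=2))
```

The resulting values map onto the fields filled in through the console: the sink name, the log filter, and the Pub/Sub topic selected as the destination.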
Data Collected
Metrics
Metric (type) | Description |
gcp.ml.prediction.error_count (count) | Cumulative count of prediction errors. |
gcp.ml.prediction.latencies.avg (count) | The average latency of a certain type. Shown as microsecond |
gcp.ml.prediction.latencies.samplecount (count) | The sample count for latency of a certain type. Shown as microsecond |
gcp.ml.prediction.latencies.sumsqdev (count) | The sum of squared deviation for latency of a certain type. Shown as microsecond |
gcp.ml.prediction.online.accelerator.duty_cycle (gauge) | Average fraction of time over the past sample period during which the accelerator(s) were actively processing. Shown as percent (multiplied by 100) |
gcp.ml.prediction.online.accelerator.memory.bytes_used (gauge) | Amount of accelerator memory allocated by the model replica. Shown as byte |
gcp.ml.prediction.online.cpu.utilization (gauge) | Fraction of CPU allocated by the model replica and currently in use. May exceed 100% if the machine type has multiple CPUs. Shown as percent (multiplied by 100) |
gcp.ml.prediction.online.memory.bytes_used (gauge) | Amount of memory allocated by the model replica and currently in use. Shown as byte |
gcp.ml.prediction.online.network.bytes_received (count) | Number of bytes received over the network by the model replica. Shown as byte |
gcp.ml.prediction.online.network.bytes_sent (count) | Number of bytes sent over the network by the model replica. Shown as byte |
gcp.ml.prediction.online.replicas (gauge) | Number of active model replicas. |
gcp.ml.prediction.online.target_replicas (gauge) | Aspired number of active model replicas. |
gcp.ml.prediction.prediction_count (count) | Cumulative count of predictions. |
gcp.ml.prediction.response_count (count) | Cumulative count of different response codes. |
gcp.ml.training.accelerator.memory.utilization (gauge) | Fraction of allocated accelerator memory that is currently in use. Values are numbers between 0.0 and 1.0; charts display the values as a percentage between 0% and 100%. Shown as percent (multiplied by 100) |
gcp.ml.training.accelerator.utilization (gauge) | Fraction of allocated accelerator that is currently in use. Values are numbers between 0.0 and 1.0; charts display the values as a percentage between 0% and 100%. Shown as percent (multiplied by 100) |
gcp.ml.training.cpu.utilization (gauge) | Fraction of allocated CPU that is currently in use. Values are numbers between 0.0 and 1.0; charts display the values as a percentage between 0% and 100%. Shown as percent (multiplied by 100) |
gcp.ml.training.memory.utilization (gauge) | Fraction of allocated memory that is currently in use. Values are numbers between 0.0 and 1.0; charts display the values as a percentage between 0% and 100%. Shown as percent (multiplied by 100) |
gcp.ml.training.network.received_bytes_count (count) | Number of bytes received by the training job over the network. Shown as byte |
gcp.ml.training.network.sent_bytes_count (count) | Number of bytes sent by the training job over the network. Shown as byte |
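The three `gcp.ml.prediction.latencies.*` metrics can be combined to estimate how widely latencies are spread around the average. The sketch below assumes that `sumsqdev` is the sum of squared deviations from the mean over the same interval that `samplecount` covers, as in a Cloud Monitoring distribution. Under that assumption, the population variance is `sumsqdev / samplecount`. The function names are illustrative and not part of any Datadog or Google API.

```python
import math


def latency_stddev(sample_count, sum_sq_dev):
    """Population standard deviation from distribution components.

    Assumes sum_sq_dev is the sum of (x - mean)^2 over sample_count
    samples taken in the same interval. The result has the same unit
    as the latency values (microseconds in the table above).
    """
    if sample_count <= 0:
        return 0.0
    return math.sqrt(sum_sq_dev / sample_count)


def fraction_to_percent(fraction):
    """Convert a 0.0-1.0 utilization fraction to a 0-100 percentage."""
    return fraction * 100.0


if __name__ == "__main__":
    # Two hypothetical samples of 100 and 300 microseconds:
    # mean = 200, sum of squared deviations = 100^2 + 100^2 = 20000.
    print(latency_stddev(2, 20000.0))
    print(fraction_to_percent(0.5))
```

The `fraction_to_percent` helper mirrors the "multiplied by 100" note on the utilization metrics, for cases where a raw 0.0 to 1.0 value needs to be compared against a percentage threshold.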
Events
The Google Cloud Machine Learning integration does not include any events.
Service Checks
The Google Cloud Machine Learning integration does not include any service checks.
Troubleshooting
Need help? Contact Datadog support.
Further Reading
Additional helpful documentation, links, and articles: