Microsoft Azure Machine Learning

Overview

Azure Machine Learning gives developers and data scientists a wide range of productive experiences for building, training, and deploying machine learning models faster. Use Datadog to monitor the performance and usage of Azure Machine Learning in context with the rest of your applications and infrastructure.

Get metrics from Azure Machine Learning to:

  • Track the number and status of runs and model deployments
  • Monitor the utilization of your machine learning nodes
  • Optimize performance against cost

Setup

Installation

If you haven't already, set up the Microsoft Azure integration first. No other installation steps are required.

Data collected

Metrics

azure.machinelearningservices_workspaces.completed_runs
(gauge)
The number of runs completed successfully for this workspace.
Shown as operation
azure.machinelearningservices_workspaces.started_runs
(gauge)
The number of runs started for this workspace.
Shown as operation
azure.machinelearningservices_workspaces.failed_runs
(gauge)
The number of runs that failed for this workspace.
Shown as operation
azure.machinelearningservices_workspaces.model_register_succeeded
(gauge)
The number of model registrations that succeeded in this workspace.
azure.machinelearningservices_workspaces.model_register_failed
(gauge)
The number of model registrations that failed in this workspace.
azure.machinelearningservices_workspaces.model_deploy_started
(gauge)
The number of model deployments started in this workspace.
azure.machinelearningservices_workspaces.model_deploy_succeeded
(gauge)
The number of model deployments that succeeded in this workspace.
azure.machinelearningservices_workspaces.model_deploy_failed
(gauge)
The number of model deployments that failed in this workspace.
azure.machinelearningservices_workspaces.total_nodes
(gauge)
The total number of nodes. This total is the sum of active, idle, unusable, preempted, and leaving nodes.
Shown as node
azure.machinelearningservices_workspaces.active_nodes
(gauge)
The number of active nodes. These are the nodes that are actively running a job.
Shown as node
azure.machinelearningservices_workspaces.idle_nodes
(gauge)
The number of idle nodes. Idle nodes are not running any jobs but can accept a new job when one is available.
Shown as node
azure.machinelearningservices_workspaces.unusable_nodes
(gauge)
The number of unusable nodes. Unusable nodes are not functional due to some unresolvable issue. Azure will recycle these nodes.
Shown as node
azure.machinelearningservices_workspaces.preempted_nodes
(gauge)
The number of preempted nodes. These are low-priority nodes that have been taken away from the available node pool.
Shown as node
azure.machinelearningservices_workspaces.leaving_nodes
(gauge)
The number of leaving nodes. Leaving nodes have just finished processing a job and will go to the idle state.
Shown as node
azure.machinelearningservices_workspaces.total_cores
(gauge)
The number of total cores.
Shown as core
azure.machinelearningservices_workspaces.active_cores
(gauge)
The number of active cores.
Shown as core
azure.machinelearningservices_workspaces.idle_cores
(gauge)
The number of idle cores.
Shown as core
azure.machinelearningservices_workspaces.unusable_cores
(gauge)
The number of unusable cores.
Shown as core
azure.machinelearningservices_workspaces.preempted_cores
(gauge)
The number of preempted cores.
Shown as core
azure.machinelearningservices_workspaces.leaving_cores
(gauge)
The number of leaving cores.
Shown as core
azure.machinelearningservices_workspaces.quota_utilization_percentage
(gauge)
The percent of quota utilized.
Shown as percent
azure.machinelearningservices_workspaces.cpuutilization
(gauge)
CPU utilization
Shown as percent
azure.machinelearningservices_workspaces.gpuutilization
(gauge)
GPU utilization
Shown as percent
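
The node metrics above describe a breakdown: the total is the sum of the active, idle, unusable, preempted, and leaving counts. The following is a minimal Python sketch of that relationship, useful for reasoning about utilization against cost. The NodeCounts class, the active_fraction helper, and the sample values are hypothetical and introduced only for illustration; they are not part of Datadog or Azure, and the sketch does not query either service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCounts:
    """One sample of the node-state gauges for a workspace (hypothetical container)."""

    active: float
    idle: float
    unusable: float
    preempted: float
    leaving: float

    @property
    def total(self) -> float:
        # Mirrors the total_nodes description: the sum of the five node states.
        return self.active + self.idle + self.unusable + self.preempted + self.leaving


def active_fraction(counts: NodeCounts) -> float:
    """Share of nodes that are actively running a job; 0.0 when there are no nodes."""
    total = counts.total
    return counts.active / total if total else 0.0


# Made-up sample values, not real measurements.
sample = NodeCounts(active=6, idle=2, unusable=1, preempted=0, leaving=1)
print(sample.total)
print(active_fraction(sample))
```

A persistently low active fraction alongside a high idle count would suggest the cluster is larger than the workload needs, though the right threshold depends on your own workloads.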

Events

The Azure Machine Learning integration does not include any events.

Service checks

The Azure Machine Learning integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.

Further reading