Google Cloud Vertex AI empowers machine learning developers, data scientists, and data engineers to take their projects from ideation to deployment, quickly and cost-effectively. Train high-quality custom machine learning models with minimal machine learning expertise and effort.
Google Cloud Vertex AI is included in the Google Cloud Platform integration package. If you haven’t already, set up the Google Cloud Platform integration first to begin collecting out-of-the-box metrics.
To collect Vertex AI labels as tags, enable the Cloud Asset Viewer role.
You can use service account impersonation and automatic project discovery to integrate Datadog with Google Cloud.
This method enables you to monitor all projects visible to a service account by assigning IAM roles in the relevant projects. You can assign these roles to projects individually, or you can configure Datadog to monitor groups of projects by assigning these roles at the organization or folder level. Assigning roles in this way allows Datadog to automatically discover and monitor all projects in the given scope, including any new projects that may be added to the group in the future.
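As an illustration of the role assignment described above, the following is a minimal sketch of adding a service account to a role in an IAM policy document. It works on a plain dictionary shaped like the JSON that IAM policies use (`bindings`, `role`, `members`) and makes no API calls. The helper name `add_role_binding` and the service account email are hypothetical; `roles/cloudasset.viewer` is the role ID for the Cloud Asset Viewer role mentioned earlier.

```python
import copy

# Role ID for the Cloud Asset Viewer role named in this page.
CLOUD_ASSET_VIEWER = "roles/cloudasset.viewer"


def add_role_binding(policy: dict, role: str, service_account_email: str) -> dict:
    """Return a copy of an IAM policy dict with the service account added to `role`.

    Hypothetical helper for illustration only: it edits the policy document
    locally and does not call any Google Cloud API.
    """
    member = f"serviceAccount:{service_account_email}"
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        # Reuse an existing unconditional binding for the same role.
        if binding.get("role") == role and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    bindings.append({"role": role, "members": [member]})
    return updated


if __name__ == "__main__":
    # Placeholder service account email (an assumption, not a real account).
    sa_email = "datadog-integration@example-project.iam.gserviceaccount.com"
    existing = {"bindings": [{"role": "roles/viewer", "members": ["user:alice@example.com"]}]}
    print(add_role_binding(existing, CLOUD_ASSET_VIEWER, sa_email))
```

The same policy shape applies whether the binding is set on a single project, a folder, or the organization; only the resource the policy is attached to changes.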
Google Cloud Vertex AI logs are collected with Google Cloud Logging and sent to a Cloud Pub/Sub topic with an HTTP push forwarder. If you haven’t already, set up a Cloud Pub/Sub topic with an HTTP push forwarder.
Once this is done, export your Google Cloud Vertex AI logs from Google Cloud Logging to the Pub/Sub topic.
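To make the export step more concrete, here is a minimal sketch of the fields a Cloud Logging sink definition would carry when routing logs to a Pub/Sub topic. It only builds a dictionary; it does not create anything in Google Cloud. The function name, the default sink name, and the project and topic IDs are hypothetical. The log filter is an assumption: it selects Vertex AI endpoint logs by resource type and should be adjusted to match the logs you actually want to forward.

```python
def build_log_sink(project_id: str, topic_id: str,
                   sink_name: str = "datadog-vertex-ai-logs") -> dict:
    """Build a sink definition that routes Vertex AI logs to a Pub/Sub topic.

    Hypothetical helper: the returned dict mirrors the name/destination/filter
    fields of a Cloud Logging sink, but nothing is sent to Google Cloud.
    """
    return {
        "name": sink_name,
        # Pub/Sub destinations use this fully qualified form.
        "destination": f"pubsub.googleapis.com/projects/{project_id}/topics/{topic_id}",
        # Assumed filter: Vertex AI endpoint logs only. Adjust as needed.
        "filter": 'resource.type="aiplatform.googleapis.com/Endpoint"',
    }


if __name__ == "__main__":
    # Placeholder project and topic IDs.
    print(build_log_sink("example-project", "export-logs-to-datadog"))
```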
gcp.aiplatform.prediction.online.cpu.utilization (gauge) | Fraction of CPU allocated by the deployed model replica and currently in use. May exceed 100% if the machine type has multiple CPUs. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as fraction |
gcp.aiplatform.prediction.online.memory.bytes_used (gauge) | Amount of memory allocated by the deployed model replica and currently in use. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.prediction_latencies.samplecount (count) | Sample count for online prediction latency of the public deployed model. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as microsecond |
gcp.aiplatform.prediction.online.prediction_latencies.avg (gauge) | Average online prediction latency of the deployed model. Shown as microsecond |
gcp.aiplatform.prediction.online.prediction_count (count) | Number of online predictions. Shown as prediction |
gcp.aiplatform.prediction.online.network.sent_bytes_count (count) | Number of bytes sent over the network by the deployed model replica. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.network.received_bytes_count (count) | Number of bytes received over the network by the deployed model replica. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.target_replicas (count) | Target number of active replicas needed for the deployed model. Sampled every 60 seconds. After sampling, data is not visible for up to 120 seconds. Shown as worker |
gcp.aiplatform.prediction.online.replicas (count) | Number of active replicas used by the deployed model. Sampled every 60 seconds. After sampling, data is not visible for up to 120 seconds. Shown as worker |
gcp.aiplatform.prediction.online.response_count (count) | Number of different online prediction response codes. Shown as response |
gcp.aiplatform.prediction.online.error_count (count) | Number of online prediction errors. Shown as error |
gcp.aiplatform.online_prediction_requests_per_base_model (count) | Online prediction requests per minute per project per base model. Shown as request |
gcp.aiplatform.prediction.online.accelerator.duty_cycle (gauge) | Average fraction of time over the past sample period during which the accelerators were actively processing. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as fraction |
gcp.aiplatform.prediction.online.accelerator.memory.bytes_used (gauge) | Amount of accelerator memory allocated by the deployed model replica. Shown as byte |
gcp.aiplatform.prediction.online.private.prediction_latencies.avg (gauge) | Average online prediction latency of the private deployed model. Shown as microsecond |
gcp.aiplatform.prediction.online.private.prediction_latencies.samplecount (count) | Sample count for online prediction latency of the private deployed model. Sampled every 60 seconds. After sampling, data is not visible for up to 360 seconds. Shown as microsecond |
gcp.aiplatform.prediction.online.private.response_count (count) | Online prediction response count of the private deployed model. Shown as response |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as error |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as request |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.usage (count) | Current usage on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as request |
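The metrics above are reported in raw units (microseconds, request counts), so a little arithmetic is often needed before they are easy to read. The following is a minimal sketch of three such conversions: latency in microseconds to milliseconds, quota usage as a fraction of the quota limit, and a basic Datadog metric query string. All function names are hypothetical, and the sample values are made up for illustration; the scope `*` in the query means "all sources" and can be replaced with whatever tags your account has.

```python
def microseconds_to_milliseconds(latency_us: float) -> float:
    """Convert a latency value reported in microseconds to milliseconds."""
    return latency_us / 1000.0


def quota_utilization(usage: float, limit: float) -> float:
    """Return quota usage as a fraction of the quota limit (0.75 means 75%)."""
    if limit <= 0:
        raise ValueError("quota limit must be positive")
    return usage / limit


def metric_query(metric: str, aggregator: str = "avg", scope: str = "*") -> str:
    """Build a basic Datadog metric query of the form aggregator:metric{scope}."""
    return f"{aggregator}:{metric}{{{scope}}}"


if __name__ == "__main__":
    # Made-up sample values, not real measurements.
    print(microseconds_to_milliseconds(250_000))
    print(quota_utilization(usage=45, limit=60))
    print(metric_query("gcp.aiplatform.prediction.online.prediction_latencies.avg"))
```

Dividing `...quota.online_prediction_requests_per_base_model.usage` by the matching `.limit` value in this way gives a single number that is easier to alert on than the two raw series.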
Google Cloud Vertex AI does not include any service checks.
Google Cloud Vertex AI does not include any events.
Need help? Contact Datadog support.
Additional helpful documentation, links, and articles: