Google Cloud Vertex AI empowers machine learning developers, data scientists, and data engineers to take their projects from ideation to deployment quickly and cost-effectively, and to train high-quality custom machine learning models with minimal machine learning expertise and effort.
Google Cloud Vertex AI is included in the Google Cloud Platform integration package. If you haven’t already, set up the Google Cloud Platform integration first to begin collecting out-of-the-box metrics.
To collect Vertex AI labels as tags, enable the Cloud Asset Viewer role.
You can use service account impersonation and automatic project discovery to integrate Datadog with Google Cloud.
This method enables you to monitor all projects visible to a service account by assigning IAM roles in the relevant projects. You can assign these roles to projects individually, or you can configure Datadog to monitor groups of projects by assigning these roles at the organization or folder level. Assigning roles in this way allows Datadog to automatically discover and monitor all projects in the given scope, including any new projects that may be added to the group in the future.
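As an illustrative sketch only (not Datadog's official setup flow), the per-project role grants can be scripted with the Cloud Resource Manager API. This assumes Application Default Credentials with permission to modify IAM policy and the google-api-python-client package; the project ID, service account email, and exact role list below are placeholders:

```python
# Minimal sketch: grant a Datadog service account the viewer roles it needs on
# one project. All identifiers below are hypothetical placeholders.
from googleapiclient import discovery

PROJECT_ID = "my-gcp-project"
DATADOG_SA = "datadog-integration@my-gcp-project.iam.gserviceaccount.com"
ROLES = [
    "roles/monitoring.viewer",   # read Cloud Monitoring metrics
    "roles/compute.viewer",      # collect resource metadata
    "roles/cloudasset.viewer",   # collect Vertex AI labels as tags (see above)
    "roles/browser",             # allow project discovery
]

crm = discovery.build("cloudresourcemanager", "v1")

# Read-modify-write of the project's IAM policy; the etag returned with the
# policy guards against concurrent edits.
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()
for role in ROLES:
    policy.setdefault("bindings", []).append(
        {"role": role, "members": [f"serviceAccount:{DATADOG_SA}"]}
    )
crm.projects().setIamPolicy(
    resource=PROJECT_ID, body={"policy": policy}
).execute()
```

To monitor an entire folder or organization instead, the same bindings would be applied at that level so that new projects inherit them automatically, as described above.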
Google Cloud Vertex AI logs are collected with Google Cloud Logging and sent to a Dataflow job through a Cloud Pub/Sub topic. If you haven’t already, set up logging with the Datadog Dataflow template.
Once this is done, export your Google Cloud Vertex AI logs from Google Cloud Logging to the Pub/Sub topic by creating a log sink with that topic as its destination:
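As a minimal sketch using the google-cloud-logging Python package: the project ID, sink name, topic path, and log filter below are illustrative placeholders, and the filter shown matches only Vertex AI endpoint logs:

```python
# Minimal sketch: route Vertex AI logs to the Pub/Sub topic consumed by the
# Datadog Dataflow job. Project, sink name, topic, and filter are placeholders.
from google.cloud import logging_v2

client = logging_v2.Client(project="my-gcp-project")  # hypothetical project ID

sink = client.sink(
    "export-vertex-ai-logs",  # hypothetical sink name
    # Example filter: adjust to match the Vertex AI logs you want to export.
    filter_='resource.type="aiplatform.googleapis.com/Endpoint"',
    destination=(
        "pubsub.googleapis.com/projects/my-gcp-project"
        "/topics/export-logs-to-datadog"  # hypothetical topic
    ),
)
sink.create()
print(f"Created sink {sink.name}")
```

Note that after the sink is created, its writer identity must be granted the Pub/Sub Publisher role on the destination topic before logs begin to flow.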
gcp.aiplatform.executing_vertexai_pipeline_jobs (gauge) | Number of pipeline jobs being executed. |
gcp.aiplatform.executing_vertexai_pipeline_tasks (gauge) | Number of pipeline tasks being executed. |
gcp.aiplatform.featureonlinestore.online_serving.request_count (count) | Number of requests received. |
gcp.aiplatform.featureonlinestore.online_serving.serving_bytes_count (count) | Serving response bytes count. Shown as byte |
gcp.aiplatform.featureonlinestore.online_serving.serving_latencies.avg (count) | The average server side request latency. Shown as millisecond |
gcp.aiplatform.featureonlinestore.online_serving.serving_latencies.samplecount (count) | The sample count for server side request latency. Shown as millisecond |
gcp.aiplatform.featureonlinestore.online_serving.serving_latencies.sumsqdev (count) | The sum of squared deviation for server side request latency. Shown as millisecond |
gcp.aiplatform.featureonlinestore.running_sync (gauge) | Number of running syncs at given point of time. |
gcp.aiplatform.featureonlinestore.serving_data_ages.avg (count) | The average serving data age in seconds (current time minus synced time). Shown as second |
gcp.aiplatform.featureonlinestore.serving_data_ages.samplecount (count) | The sample count for the serving data age in seconds (current time minus synced time). Shown as second |
gcp.aiplatform.featureonlinestore.serving_data_ages.sumsqdev (count) | The sum of squared deviation for the serving data age in seconds (current time minus synced time). Shown as second |
gcp.aiplatform.featureonlinestore.serving_data_by_sync_time (gauge) | Breakdown of data in Feature Online Store by synced timestamp. |
gcp.aiplatform.featureonlinestore.storage.bigtable_cpu_load (gauge) | The average CPU load of nodes in the Feature Online Store. Shown as percent (multiplied by 100) |
gcp.aiplatform.featureonlinestore.storage.bigtable_cpu_load_hottest_node (gauge) | The CPU load of the hottest node in the Feature Online Store. Shown as percent (multiplied by 100) |
gcp.aiplatform.featureonlinestore.storage.bigtable_nodes (gauge) | The number of nodes for the Feature Online Store (Bigtable). |
gcp.aiplatform.featureonlinestore.storage.multi_region_bigtable_cpu_load (gauge) | The average CPU load of nodes in the Feature Online Store with multi-regional replicas. Shown as percent (multiplied by 100) |
gcp.aiplatform.featureonlinestore.storage.multi_region_bigtable_nodes (gauge) | The number of nodes for the Feature Online Store (Bigtable) with multi-regional replicas. |
gcp.aiplatform.featureonlinestore.storage.optimized_nodes (gauge) | The number of nodes for the Feature Online Store (Optimized). |
gcp.aiplatform.featureonlinestore.storage.stored_bytes (gauge) | Bytes stored in the Feature Online Store. Shown as byte |
gcp.aiplatform.featurestore.cpu_load (gauge) | The average CPU load for a node in the Featurestore online storage. Shown as percent (multiplied by 100) |
gcp.aiplatform.featurestore.cpu_load_hottest_node (gauge) | The CPU load for the hottest node in the Featurestore online storage. Shown as percent (multiplied by 100) |
gcp.aiplatform.featurestore.node_count (gauge) | The number of nodes for the Featurestore online storage. |
gcp.aiplatform.featurestore.online_entities_updated (count) | Number of entities updated on the Featurestore online storage. Shown as byte |
gcp.aiplatform.featurestore.online_serving.latencies.avg (count) | The average online serving latencies by EntityType. Shown as millisecond |
gcp.aiplatform.featurestore.online_serving.latencies.samplecount (count) | The sample count for online serving latencies by EntityType. Shown as millisecond |
gcp.aiplatform.featurestore.online_serving.latencies.sumsqdev (count) | The sum of squared deviation for online serving latencies by EntityType. Shown as millisecond |
gcp.aiplatform.featurestore.online_serving.request_bytes_count (count) | Request size by EntityType. Shown as byte |
gcp.aiplatform.featurestore.online_serving.request_count (count) | Featurestore online serving count by EntityType. |
gcp.aiplatform.featurestore.online_serving.response_size (count) | Response size by EntityType. Shown as byte |
gcp.aiplatform.featurestore.storage.billable_processed_bytes (gauge) | Number of bytes billed for offline data processed. Shown as byte |
gcp.aiplatform.featurestore.storage.stored_bytes (gauge) | Bytes stored in Featurestore. Shown as byte |
gcp.aiplatform.featurestore.streaming_write.offline_processed_count (count) | Number of streaming write requests processed for offline storage. |
gcp.aiplatform.featurestore.streaming_write.offline_write_delays.avg (count) | The average time (in seconds) from when the write API is called until the data is written to offline storage. Shown as second |
gcp.aiplatform.featurestore.streaming_write.offline_write_delays.samplecount (count) | The sample count for the time (in seconds) from when the write API is called until the data is written to offline storage. Shown as second |
gcp.aiplatform.featurestore.streaming_write.offline_write_delays.sumsqdev (count) | The sum of squared deviation for the time (in seconds) from when the write API is called until the data is written to offline storage. Shown as second |
gcp.aiplatform.generate_content_input_tokens_per_minute_per_base_model (count) | Generate content input tokens per minute per project per base model. |
gcp.aiplatform.generate_content_requests_per_minute_per_project_per_base_model (count) | Generate content requests per minute per project per base model. |
gcp.aiplatform.matching_engine.cpu.request_utilization (gauge) | The fraction of the requested CPU that is currently in use on a match server container. Shown as percent (multiplied by 100) |
gcp.aiplatform.matching_engine.current_replicas (gauge) | Number of active replicas used by the DeployedIndex. |
gcp.aiplatform.matching_engine.current_shards (gauge) | Number of shards of the DeployedIndex. |
gcp.aiplatform.matching_engine.memory.used_bytes (gauge) | The memory used in bytes for a match server container. Shown as byte |
gcp.aiplatform.matching_engine.query.latencies.avg (count) | The average server side request latency. Shown as millisecond |
gcp.aiplatform.matching_engine.query.latencies.samplecount (count) | The sample count for server side request latency. Shown as millisecond |
gcp.aiplatform.matching_engine.query.latencies.sumsqdev (count) | The sum of squared deviation for server side request latency. Shown as millisecond |
gcp.aiplatform.matching_engine.query.request_count (count) | Number of requests received. |
gcp.aiplatform.matching_engine.stream_update.datapoint_count (count) | Number of successfully upserted or removed datapoints. |
gcp.aiplatform.matching_engine.stream_update.latencies.avg (count) | The average latency between when the user receives an UpsertDatapointsResponse or RemoveDatapointsResponse and when that update takes effect. Shown as millisecond |
gcp.aiplatform.matching_engine.stream_update.latencies.samplecount (count) | The sample count for the latency between when the user receives an UpsertDatapointsResponse or RemoveDatapointsResponse and when that update takes effect. Shown as millisecond |
gcp.aiplatform.matching_engine.stream_update.latencies.sumsqdev (count) | The sum of squared deviation for the latency between when the user receives an UpsertDatapointsResponse or RemoveDatapointsResponse and when that update takes effect. Shown as millisecond |
gcp.aiplatform.matching_engine.stream_update.request_count (count) | Number of stream update requests. |
gcp.aiplatform.online_prediction_dedicated_requests_per_base_model_version (count) | Online prediction dedicated requests per minute per project per base model version. |
gcp.aiplatform.online_prediction_dedicated_tokens_per_base_model_version (count) | Online prediction dedicated tokens per minute per project per base model version. |
gcp.aiplatform.online_prediction_requests_per_base_model (count) | Online prediction requests per minute per project per base model. Shown as request |
gcp.aiplatform.online_prediction_tokens_per_minute_per_base_model (count) | Online prediction tokens per minute per project per base model. |
gcp.aiplatform.pipelinejob.duration (gauge) | Runtime seconds of the pipeline job being executed (from creation to end). Shown as second |
gcp.aiplatform.pipelinejob.task_completed_count (count) | Cumulative number of completed PipelineTasks. |
gcp.aiplatform.prediction.online.accelerator.duty_cycle (gauge) | Average fraction of time over the past sample period during which the accelerator(s) were actively processing. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as fraction |
gcp.aiplatform.prediction.online.accelerator.memory.bytes_used (gauge) | Amount of accelerator memory allocated by the deployed model replica. Shown as byte |
gcp.aiplatform.prediction.online.cpu.utilization (gauge) | Fraction of CPU allocated by the deployed model replica and currently in use. May exceed 100% if the machine type has multiple CPUs. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as fraction |
gcp.aiplatform.prediction.online.deployment_resource_pool.accelerator.duty_cycle (gauge) | Average fraction of time over the past sample period during which the accelerator(s) were actively processing. Shown as percent (multiplied by 100) |
gcp.aiplatform.prediction.online.deployment_resource_pool.accelerator.memory.bytes_used (gauge) | Amount of accelerator memory allocated by the deployment resource pool replica. Shown as byte |
gcp.aiplatform.prediction.online.deployment_resource_pool.cpu.utilization (gauge) | Fraction of CPU allocated by the deployment resource pool replica and currently in use. May exceed 100% if the machine type has multiple CPUs. Shown as percent (multiplied by 100) |
gcp.aiplatform.prediction.online.deployment_resource_pool.memory.bytes_used (gauge) | Amount of memory allocated by the deployment resource pool replica and currently in use. Shown as byte |
gcp.aiplatform.prediction.online.deployment_resource_pool.network.received_bytes_count (count) | Number of bytes received over the network by the deployment resource pool replica. Shown as byte |
gcp.aiplatform.prediction.online.deployment_resource_pool.network.sent_bytes_count (count) | Number of bytes sent over the network by the deployment resource pool replica. Shown as byte |
gcp.aiplatform.prediction.online.deployment_resource_pool.replicas (gauge) | Number of active replicas used by the deployment resource pool. |
gcp.aiplatform.prediction.online.deployment_resource_pool.target_replicas (gauge) | Target number of active replicas needed for the deployment resource pool. |
gcp.aiplatform.prediction.online.error_count (count) | Number of online prediction errors. Shown as error |
gcp.aiplatform.prediction.online.memory.bytes_used (gauge) | Amount of memory allocated by the deployed model replica and currently in use. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.network.received_bytes_count (count) | Number of bytes received over the network by the deployed model replica. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.network.sent_bytes_count (count) | Number of bytes sent over the network by the deployed model replica. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as byte |
gcp.aiplatform.prediction.online.prediction_count (count) | Number of online predictions. Shown as prediction |
gcp.aiplatform.prediction.online.prediction_latencies.avg (gauge) | The average online prediction latency of the deployed model. Shown as microsecond |
gcp.aiplatform.prediction.online.prediction_latencies.samplecount (count) | The sample count for online prediction latency of the public deployed model. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as microsecond |
gcp.aiplatform.prediction.online.private.prediction_latencies.avg (gauge) | The average online prediction latency of the private deployed model. Shown as microsecond |
gcp.aiplatform.prediction.online.private.prediction_latencies.samplecount (count) | The sample count for online prediction latency of the private deployed model. Sampled every 60 seconds. After sampling data is not visible for up to 360 seconds. Shown as microsecond |
gcp.aiplatform.prediction.online.private.response_count (count) | Online prediction response count of the private deployed model. Shown as response |
gcp.aiplatform.prediction.online.replicas (count) | Number of active replicas used by the deployed model. Sampled every 60 seconds. After sampling data is not visible for up to 120 seconds. Shown as worker |
gcp.aiplatform.prediction.online.response_count (count) | Number of different online prediction response codes. Shown as response |
gcp.aiplatform.prediction.online.target_replicas (count) | Target number of active replicas needed for the deployed model. Sampled every 60 seconds. After sampling data is not visible for up to 120 seconds. Shown as worker |
gcp.aiplatform.publisher.online_serving.character_count (count) | Accumulated input/output character count. |
gcp.aiplatform.publisher.online_serving.characters.avg (count) | The average input/output character count distribution. |
gcp.aiplatform.publisher.online_serving.characters.samplecount (count) | The sample count for input/output character count distribution. |
gcp.aiplatform.publisher.online_serving.characters.sumsqdev (count) | The sum of squared deviation for input/output character count distribution. |
gcp.aiplatform.publisher.online_serving.consumed_throughput (count) | Overall throughput used (accounting for burndown rate) in terms of characters. |
gcp.aiplatform.publisher.online_serving.first_token_latencies.avg (count) | The average duration from request received to first token sent back to the client. Shown as millisecond |
gcp.aiplatform.publisher.online_serving.first_token_latencies.samplecount (count) | The sample count for duration from request received to first token sent back to the client. Shown as millisecond |
gcp.aiplatform.publisher.online_serving.first_token_latencies.sumsqdev (count) | The sum of squared deviation for duration from request received to first token sent back to the client. Shown as millisecond |
gcp.aiplatform.publisher.online_serving.model_invocation_count (count) | Number of model invocations (prediction requests). |
gcp.aiplatform.publisher.online_serving.model_invocation_latencies.avg (count) | The average model invocation latencies (prediction latencies). Shown as millisecond |
gcp.aiplatform.publisher.online_serving.model_invocation_latencies.samplecount (count) | The sample count for model invocation latencies (prediction latencies). Shown as millisecond |
gcp.aiplatform.publisher.online_serving.model_invocation_latencies.sumsqdev (count) | The sum of squared deviation for model invocation latencies (prediction latencies). Shown as millisecond |
gcp.aiplatform.publisher.online_serving.token_count (count) | Accumulated input/output token count. |
gcp.aiplatform.publisher.online_serving.tokens.avg (count) | The average input/output token count distribution. |
gcp.aiplatform.publisher.online_serving.tokens.samplecount (count) | The sample count for input/output token count distribution. |
gcp.aiplatform.publisher.online_serving.tokens.sumsqdev (count) | The sum of squared deviation for input/output token count distribution. |
gcp.aiplatform.quota.generate_content_input_tokens_per_minute_per_base_model.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/generate_content_input_tokens_per_minute_per_base_model. |
gcp.aiplatform.quota.generate_content_input_tokens_per_minute_per_base_model.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/generate_content_input_tokens_per_minute_per_base_model. |
gcp.aiplatform.quota.generate_content_input_tokens_per_minute_per_base_model.usage (count) | Current usage on quota metric aiplatform.googleapis.com/generate_content_input_tokens_per_minute_per_base_model. |
gcp.aiplatform.quota.generate_content_requests_per_minute_per_project_per_base_model.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model. |
gcp.aiplatform.quota.generate_content_requests_per_minute_per_project_per_base_model.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model. |
gcp.aiplatform.quota.generate_content_requests_per_minute_per_project_per_base_model.usage (count) | Current usage on quota metric aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model. |
gcp.aiplatform.quota.online_prediction_dedicated_requests_per_base_model_version.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/online_prediction_dedicated_requests_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_dedicated_requests_per_base_model_version.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/online_prediction_dedicated_requests_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_dedicated_requests_per_base_model_version.usage (count) | Current usage on quota metric aiplatform.googleapis.com/online_prediction_dedicated_requests_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_dedicated_tokens_per_base_model_version.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/online_prediction_dedicated_tokens_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_dedicated_tokens_per_base_model_version.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/online_prediction_dedicated_tokens_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_dedicated_tokens_per_base_model_version.usage (count) | Current usage on quota metric aiplatform.googleapis.com/online_prediction_dedicated_tokens_per_base_model_version. |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as error |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as request |
gcp.aiplatform.quota.online_prediction_requests_per_base_model.usage (count) | Current usage on quota metric aiplatform.googleapis.com/online_prediction_requests_per_base_model. Shown as request |
gcp.aiplatform.quota.online_prediction_tokens_per_minute_per_base_model.exceeded (count) | Number of attempts to exceed the limit on quota metric aiplatform.googleapis.com/online_prediction_tokens_per_minute_per_base_model. |
gcp.aiplatform.quota.online_prediction_tokens_per_minute_per_base_model.limit (gauge) | Current limit on quota metric aiplatform.googleapis.com/online_prediction_tokens_per_minute_per_base_model. |
gcp.aiplatform.quota.online_prediction_tokens_per_minute_per_base_model.usage (count) | Current usage on quota metric aiplatform.googleapis.com/online_prediction_tokens_per_minute_per_base_model. |
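Once these metrics are flowing, they can be retrieved like any other Datadog metric. The following is a minimal sketch using the official datadog-api-client Python package; the query string and time window are illustrative:

```python
# Minimal sketch: query a Vertex AI metric from Datadog for the last hour.
# Reads DD_API_KEY and DD_APP_KEY from the environment; the query is illustrative.
import time

from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v1.api.metrics_api import MetricsApi

configuration = Configuration()  # picks up API/app keys from the environment
with ApiClient(configuration) as api_client:
    api = MetricsApi(api_client)
    now = int(time.time())
    response = api.query_metrics(
        _from=now - 3600,
        to=now,
        query="avg:gcp.aiplatform.prediction.online.prediction_count{*}.as_count()",
    )
    # Print the last point of each returned series.
    for series in response.series or []:
        print(series.metric, series.pointlist[-1])
```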
Google Cloud Vertex AI does not include any service checks.
Google Cloud Vertex AI does not include any events.
Need help? Contact Datadog support.
Additional helpful documentation, links, and articles: