---
title: Getting Started with Datadog
description: Datadog, the leading service for cloud-scale monitoring.
breadcrumbs: Docs > Infrastructure > Datadog Resource Catalog
---

# gcp_aiplatform_endpoint{% #gcp_aiplatform_endpoint %}

## `ancestors`{% #ancestors %}

**Type**: `UNORDERED_LIST_STRING`

## `client_connection_config`{% #client_connection_config %}

**Type**: `STRUCT`**Provider name**: `clientConnectionConfig`**Description**: Configurations that are applied to the endpoint for online prediction.

- `inference_timeout`**Type**: `STRING`**Provider name**: `inferenceTimeout`**Description**: Customizable online prediction request timeout.

## `create_time`{% #create_time %}

**Type**: `TIMESTAMP`**Provider name**: `createTime`**Description**: Output only. Timestamp when this Endpoint was created.

## `dedicated_endpoint_dns`{% #dedicated_endpoint_dns %}

**Type**: `STRING`**Provider name**: `dedicatedEndpointDns`**Description**: Output only. DNS of the dedicated endpoint. Will only be populated if dedicated_endpoint_enabled is true. Depending on the features enabled, uid might be a random number or a string. For example, if fast_tryout is enabled, uid will be fasttryout. Format: `https://{endpoint_id}.{region}-{uid}.prediction.vertexai.goog`.
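
As a rough sketch of the documented format (the endpoint ID, region, and uid below are hypothetical; the real uid is assigned by Vertex AI), the dedicated DNS name can be assembled like this:

```python
# Hypothetical values; Vertex AI populates the real uid.
endpoint_id = "1234567890"
region = "us-central1"
uid = "fasttryout"  # e.g. when fast_tryout is enabled

# Format: https://{endpoint_id}.{region}-{uid}.prediction.vertexai.goog
dedicated_dns = f"https://{endpoint_id}.{region}-{uid}.prediction.vertexai.goog"
```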

## `dedicated_endpoint_enabled`{% #dedicated_endpoint_enabled %}

**Type**: `BOOLEAN`**Provider name**: `dedicatedEndpointEnabled`**Description**: If true, the endpoint will be exposed through a dedicated DNS [Endpoint.dedicated_endpoint_dns]. Your request to the dedicated DNS will be isolated from other users' traffic and will have better performance and reliability. Note: Once you enable the dedicated endpoint, you won't be able to send requests to the shared DNS {region}-aiplatform.googleapis.com. The limitation will be removed soon.

## `deployed_models`{% #deployed_models %}

**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `deployedModels`**Description**: Output only. The models deployed in this Endpoint. To add or remove DeployedModels use EndpointService.DeployModel and EndpointService.UndeployModel respectively.

- `automatic_resources`**Type**: `STRUCT`**Provider name**: `automaticResources`**Description**: A description of resources that are to a large degree decided by Vertex AI, and require only modest additional configuration.
  - `max_replica_count`**Type**: `INT32`**Provider name**: `maxReplicaCount`**Description**: Immutable. The maximum number of replicas that may be deployed when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale to that many replicas is guaranteed (barring service outages). If traffic increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, no upper bound for scaling under heavy traffic is assumed, though Vertex AI may be unable to scale beyond a certain replica number.
  - `min_replica_count`**Type**: `INT32`**Provider name**: `minReplicaCount`**Description**: Immutable. The minimum number of replicas that will always be deployed. If traffic against it increases, it may dynamically be deployed onto more replicas up to max_replica_count, and as traffic decreases, some of these extra replicas may be freed. If the requested value is too large, the deployment will error.
- `create_time`**Type**: `TIMESTAMP`**Provider name**: `createTime`**Description**: Output only. Timestamp when the DeployedModel was created.
- `dedicated_resources`**Type**: `STRUCT`**Provider name**: `dedicatedResources`**Description**: A description of resources that are dedicated to the DeployedModel, and that need a higher degree of manual configuration.
  - `autoscaling_metric_specs`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `autoscalingMetricSpecs`**Description**: Immutable. The metric specifications that override a resource utilization metric (CPU utilization, accelerator's duty cycle, and so on) target value (defaults to 60 if not set). At most one entry is allowed per metric. If machine_spec.accelerator_count is above 0, the autoscaling will be based on both the CPU utilization and the accelerator's duty cycle metrics, scaling up when either metric exceeds its target value and scaling down when both metrics are under their target values. The default target value is 60 for both metrics. If machine_spec.accelerator_count is 0, the autoscaling will be based on the CPU utilization metric only, with a default target value of 60 if not explicitly set. For example, in the case of Online Prediction, if you want to override the target CPU utilization to 80, set autoscaling_metric_specs.metric_name to `aiplatform.googleapis.com/prediction/online/cpu/utilization` and autoscaling_metric_specs.target to `80`.
    - `metric_name`**Type**: `STRING`**Provider name**: `metricName`**Description**: Required. The resource metric name. Supported metrics: * For Online Prediction: * `aiplatform.googleapis.com/prediction/online/accelerator/duty_cycle` * `aiplatform.googleapis.com/prediction/online/cpu/utilization`
    - `target`**Type**: `INT32`**Provider name**: `target`**Description**: The target resource utilization in percentage (1% - 100%) for the given metric; once the real usage deviates from the target by a certain percentage, the machine replicas change. The default value is 60 (representing 60%) if not provided.
  - `machine_spec`**Type**: `STRUCT`**Provider name**: `machineSpec`**Description**: Required. Immutable. The specification of a single machine being used.
    - `accelerator_count`**Type**: `INT32`**Provider name**: `acceleratorCount`**Description**: The number of accelerators to attach to the machine.
    - `accelerator_type`**Type**: `STRING`**Provider name**: `acceleratorType`**Description**: Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.**Possible values**:
      - `ACCELERATOR_TYPE_UNSPECIFIED` - Unspecified accelerator type, which means no accelerator.
      - `NVIDIA_TESLA_K80` - Deprecated: Nvidia Tesla K80 GPU has reached end of support, see [https://cloud.google.com/compute/docs/eol/k80-eol](https://cloud.google.com/compute/docs/eol/k80-eol).
      - `NVIDIA_TESLA_P100` - Nvidia Tesla P100 GPU.
      - `NVIDIA_TESLA_V100` - Nvidia Tesla V100 GPU.
      - `NVIDIA_TESLA_P4` - Nvidia Tesla P4 GPU.
      - `NVIDIA_TESLA_T4` - Nvidia Tesla T4 GPU.
      - `NVIDIA_TESLA_A100` - Nvidia Tesla A100 GPU.
      - `NVIDIA_A100_80GB` - Nvidia A100 80GB GPU.
      - `NVIDIA_L4` - Nvidia L4 GPU.
      - `NVIDIA_H100_80GB` - Nvidia H100 80Gb GPU.
      - `NVIDIA_H100_MEGA_80GB` - Nvidia H100 Mega 80Gb GPU.
      - `NVIDIA_H200_141GB` - Nvidia H200 141Gb GPU.
      - `TPU_V2` - TPU v2.
      - `TPU_V3` - TPU v3.
      - `TPU_V4_POD` - TPU v4.
      - `TPU_V5_LITEPOD` - TPU v5.
    - `machine_type`**Type**: `STRING`**Provider name**: `machineType`**Description**: Immutable. The type of the machine. See the [list of machine types supported for prediction](https://cloud.google.com/vertex-ai/docs/predictions/configure-compute#machine-types) and the [list of machine types supported for custom training](https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types). For DeployedModel this field is optional, and the default value is `n1-standard-2`. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
    - `reservation_affinity`**Type**: `STRUCT`**Provider name**: `reservationAffinity`**Description**: Optional. Immutable. Configuration controlling how this resource pool consumes reservation.
      - `key`**Type**: `STRING`**Provider name**: `key`**Description**: Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use `compute.googleapis.com/reservation-name` as the key and specify the name of your reservation as its value.
      - `reservation_affinity_type`**Type**: `STRING`**Provider name**: `reservationAffinityType`**Description**: Required. Specifies the reservation affinity type.**Possible values**:
        - `TYPE_UNSPECIFIED` - Default value. This should not be used.
        - `NO_RESERVATION` - Do not consume from any reserved capacity, only use on-demand.
        - `ANY_RESERVATION` - Consume any reservation available, falling back to on-demand.
        - `SPECIFIC_RESERVATION` - Consume from a specific reservation. When chosen, the reservation must be identified via the `key` and `values` fields.
      - `values`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `values`**Description**: Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation or reservation block.
    - `tpu_topology`**Type**: `STRING`**Provider name**: `tpuTopology`**Description**: Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
  - `max_replica_count`**Type**: `INT32`**Provider name**: `maxReplicaCount`**Description**: Immutable. The maximum number of replicas that may be deployed when the traffic against it increases. If the requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale to that many replicas is guaranteed (barring service outages). If traffic increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, min_replica_count will be used as the default value. The value of this field impacts the charge against Vertex CPU and GPU quotas. Specifically, you will be charged for (max_replica_count * number of cores in the selected machine type) and (max_replica_count * number of GPUs per replica in the selected machine type).
  - `min_replica_count`**Type**: `INT32`**Provider name**: `minReplicaCount`**Description**: Required. Immutable. The minimum number of machine replicas that will be always deployed on. This value must be greater than or equal to 1. If traffic increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
  - `required_replica_count`**Type**: `INT32`**Provider name**: `requiredReplicaCount`**Description**: Optional. Number of required available replicas for the deployment to succeed. This field is only needed when partial deployment/mutation is desired. If set, the deploy/mutate operation will succeed once available_replica_count reaches required_replica_count, and the rest of the replicas will be retried. If not set, the default required_replica_count will be min_replica_count.
  - `spot`**Type**: `BOOLEAN`**Provider name**: `spot`**Description**: Optional. If true, schedule the deployment workload on [spot VMs](https://cloud.google.com/kubernetes-engine/docs/concepts/spot-vms).
- `disable_container_logging`**Type**: `BOOLEAN`**Provider name**: `disableContainerLogging`**Description**: For custom-trained Models and AutoML Tabular Models, the container of the DeployedModel instances will send `stderr` and `stdout` streams to Cloud Logging by default. Note that these logs incur costs, which are subject to [Cloud Logging pricing](https://cloud.google.com/logging/pricing). Users can disable container logging by setting this flag to true.
- `disable_explanations`**Type**: `BOOLEAN`**Provider name**: `disableExplanations`**Description**: If true, deploy the model without the explainable feature, regardless of the existence of Model.explanation_spec or explanation_spec.
- `enable_access_logging`**Type**: `BOOLEAN`**Provider name**: `enableAccessLogging`**Description**: If true, online prediction access logs are sent to Cloud Logging. These logs are like standard server access logs, containing information like timestamp and latency for each prediction request. Note that logs may incur a cost, especially if your project receives prediction requests at a high queries per second rate (QPS). Estimate your costs before enabling this option.
- `explanation_spec`**Type**: `STRUCT`**Provider name**: `explanationSpec`**Description**: Explanation configuration for this DeployedModel. When deploying a Model using EndpointService.DeployModel, this value overrides the value of Model.explanation_spec. All fields of explanation_spec are optional in the request. If a field of explanation_spec is not populated, the value of the same field of Model.explanation_spec is inherited. If the corresponding Model.explanation_spec is not populated, all fields of the explanation_spec will be used for the explanation configuration.
  - `metadata`**Type**: `STRUCT`**Provider name**: `metadata`**Description**: Optional. Metadata describing the Model's input and output for explanation.
    - `feature_attributions_schema_uri`**Type**: `STRING`**Provider name**: `featureAttributionsSchemaUri`**Description**: Points to a YAML file stored on Google Cloud Storage describing the format of the feature attributions. The schema is defined as an OpenAPI 3.0.2 [Schema Object](https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.0.2.md#schemaObject). AutoML tabular Models always have this field populated by Vertex AI. Note: The URI given on output may be different, including the URI scheme, than the one given on input. The output URI will point to a location where the user only has a read access.
    - `latent_space_source`**Type**: `STRING`**Provider name**: `latentSpaceSource`**Description**: Name of the source to generate embeddings for example based explanations.
  - `parameters`**Type**: `STRUCT`**Provider name**: `parameters`**Description**: Required. Parameters that configure explaining of the Model's predictions.
    - `examples`**Type**: `STRUCT`**Provider name**: `examples`**Description**: Example-based explanations that return the nearest neighbors from the provided dataset.
      - `example_gcs_source`**Type**: `STRUCT`**Provider name**: `exampleGcsSource`**Description**: The Cloud Storage input instances.
        - `data_format`**Type**: `STRING`**Provider name**: `dataFormat`**Description**: The format in which instances are given; if not specified, JSONL format is assumed. Currently only JSONL format is supported.**Possible values**:
          - `DATA_FORMAT_UNSPECIFIED` - Format unspecified, used when unset.
          - `JSONL` - Examples are stored in JSONL files.
        - `gcs_source`**Type**: `STRUCT`**Provider name**: `gcsSource`**Description**: The Cloud Storage location for the input instances.
          - `uris`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `uris`**Description**: Required. Google Cloud Storage URI(-s) to the input file(s). May contain wildcards. For more information on wildcards, see [https://cloud.google.com/storage/docs/wildcards](https://cloud.google.com/storage/docs/wildcards).
      - `neighbor_count`**Type**: `INT32`**Provider name**: `neighborCount`**Description**: The number of neighbors to return when querying for examples.
      - `presets`**Type**: `STRUCT`**Provider name**: `presets`**Description**: Simplified preset configuration, which automatically sets configuration values based on the desired query speed-precision trade-off and modality.
        - `modality`**Type**: `STRING`**Provider name**: `modality`**Description**: The modality of the uploaded model, which automatically configures the distance measurement and feature normalization for the underlying example index and queries. If your model does not precisely fit one of these types, it is okay to choose the closest type.**Possible values**:
          - `MODALITY_UNSPECIFIED` - Should not be set. Added as a recommended best practice for enums.
          - `IMAGE` - IMAGE modality
          - `TEXT` - TEXT modality
          - `TABULAR` - TABULAR modality
        - `query`**Type**: `STRING`**Provider name**: `query`**Description**: Preset option controlling parameters for speed-precision trade-off when querying for examples. If omitted, defaults to `PRECISE`.**Possible values**:
          - `PRECISE` - More precise neighbors as a trade-off against slower response.
          - `FAST` - Faster response as a trade-off against less precise neighbors.
    - `integrated_gradients_attribution`**Type**: `STRUCT`**Provider name**: `integratedGradientsAttribution`**Description**: An attribution method that computes Aumann-Shapley values taking advantage of the model's fully differentiable structure. Refer to this paper for more details: [https://arxiv.org/abs/1703.01365](https://arxiv.org/abs/1703.01365)
      - `blur_baseline_config`**Type**: `STRUCT`**Provider name**: `blurBaselineConfig`**Description**: Config for IG with blur baseline. When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: [https://arxiv.org/abs/2004.03383](https://arxiv.org/abs/2004.03383)
        - `max_blur_sigma`**Type**: `FLOAT`**Provider name**: `maxBlurSigma`**Description**: The standard deviation of the blur kernel for the blurred baseline. The same blurring parameter is used for both the height and the width dimension. If not set, the method defaults to the zero (i.e. black for images) baseline.
      - `smooth_grad_config`**Type**: `STRUCT`**Provider name**: `smoothGradConfig`**Description**: Config for SmoothGrad approximation of gradients. When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: [https://arxiv.org/pdf/1706.03825.pdf](https://arxiv.org/pdf/1706.03825.pdf)
        - `feature_noise_sigma`**Type**: `STRUCT`**Provider name**: `featureNoiseSigma`**Description**: This is similar to noise_sigma, but provides additional flexibility. A separate noise sigma can be provided for each feature, which is useful if their distributions are different. No noise is added to features that are not set. If this field is unset, noise_sigma will be used for all features.
          - `noise_sigma`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `noiseSigma`**Description**: Noise sigma per feature. No noise is added to features that are not set.
            - `name`**Type**: `STRING`**Provider name**: `name`**Description**: The name of the input feature for which noise sigma is provided. The features are defined in explanation metadata inputs.
            - `sigma`**Type**: `FLOAT`**Provider name**: `sigma`**Description**: This represents the standard deviation of the Gaussian kernel that will be used to add noise to the feature prior to computing gradients. Similar to noise_sigma but represents the noise added to the current feature. Defaults to 0.1.
        - `noise_sigma`**Type**: `FLOAT`**Provider name**: `noiseSigma`**Description**: This is a single float value and will be used to add noise to all the features. Use this field when all features are normalized to have the same distribution: scale to range [0, 1], [-1, 1] or z-scoring, where features are normalized to have 0-mean and 1-variance. Learn more about [normalization](https://developers.google.com/machine-learning/data-prep/transform/normalization). For best results the recommended value is about 10% - 20% of the standard deviation of the input feature. Refer to section 3.2 of the SmoothGrad paper: [https://arxiv.org/pdf/1706.03825.pdf](https://arxiv.org/pdf/1706.03825.pdf). Defaults to 0.1. If the distribution is different per feature, set feature_noise_sigma instead for each feature.
        - `noisy_sample_count`**Type**: `INT32`**Provider name**: `noisySampleCount`**Description**: The number of gradient samples to use for approximation. The higher this number, the more accurate the gradient is, but the runtime complexity increases by this factor as well. Valid range of its value is [1, 50]. Defaults to 3.
      - `step_count`**Type**: `INT32`**Provider name**: `stepCount`**Description**: Required. The number of steps for approximating the path integral. A good value to start is 50 and gradually increase until the sum to diff property is within the desired error range. Valid range of its value is [1, 100], inclusively.
    - `sampled_shapley_attribution`**Type**: `STRUCT`**Provider name**: `sampledShapleyAttribution`**Description**: An attribution method that approximates Shapley values for features that contribute to the label being predicted. A sampling strategy is used to approximate the value rather than considering all subsets of features. Refer to this paper for more details: [https://arxiv.org/abs/1306.4265](https://arxiv.org/abs/1306.4265).
      - `path_count`**Type**: `INT32`**Provider name**: `pathCount`**Description**: Required. The number of feature permutations to consider when approximating the Shapley values. Valid range of its value is [1, 50], inclusively.
    - `top_k`**Type**: `INT32`**Provider name**: `topK`**Description**: If populated, returns attributions for the top K indices of outputs (defaults to 1). Only applies to Models that predict more than one output (e.g., multi-class Models). When set to -1, returns explanations for all outputs.
    - `xrai_attribution`**Type**: `STRUCT`**Provider name**: `xraiAttribution`**Description**: An attribution method that redistributes Integrated Gradients attribution to segmented regions, taking advantage of the model's fully differentiable structure. Refer to this paper for more details: [https://arxiv.org/abs/1906.02825](https://arxiv.org/abs/1906.02825) XRAI currently performs better on natural images, like a picture of a house or an animal. If the images are taken in artificial environments, like a lab or manufacturing line, or from diagnostic equipment, like x-rays or quality-control cameras, use Integrated Gradients instead.
      - `blur_baseline_config`**Type**: `STRUCT`**Provider name**: `blurBaselineConfig`**Description**: Config for XRAI with blur baseline. When enabled, a linear path from the maximally blurred image to the input image is created. Using a blurred baseline instead of zero (black image) is motivated by the BlurIG approach explained here: [https://arxiv.org/abs/2004.03383](https://arxiv.org/abs/2004.03383)
        - `max_blur_sigma`**Type**: `FLOAT`**Provider name**: `maxBlurSigma`**Description**: The standard deviation of the blur kernel for the blurred baseline. The same blurring parameter is used for both the height and the width dimension. If not set, the method defaults to the zero (i.e. black for images) baseline.
      - `smooth_grad_config`**Type**: `STRUCT`**Provider name**: `smoothGradConfig`**Description**: Config for SmoothGrad approximation of gradients. When enabled, the gradients are approximated by averaging the gradients from noisy samples in the vicinity of the inputs. Adding noise can help improve the computed gradients. Refer to this paper for more details: [https://arxiv.org/pdf/1706.03825.pdf](https://arxiv.org/pdf/1706.03825.pdf)
        - `feature_noise_sigma`**Type**: `STRUCT`**Provider name**: `featureNoiseSigma`**Description**: This is similar to noise_sigma, but provides additional flexibility. A separate noise sigma can be provided for each feature, which is useful if their distributions are different. No noise is added to features that are not set. If this field is unset, noise_sigma will be used for all features.
          - `noise_sigma`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `noiseSigma`**Description**: Noise sigma per feature. No noise is added to features that are not set.
            - `name`**Type**: `STRING`**Provider name**: `name`**Description**: The name of the input feature for which noise sigma is provided. The features are defined in explanation metadata inputs.
            - `sigma`**Type**: `FLOAT`**Provider name**: `sigma`**Description**: This represents the standard deviation of the Gaussian kernel that will be used to add noise to the feature prior to computing gradients. Similar to noise_sigma but represents the noise added to the current feature. Defaults to 0.1.
        - `noise_sigma`**Type**: `FLOAT`**Provider name**: `noiseSigma`**Description**: This is a single float value and will be used to add noise to all the features. Use this field when all features are normalized to have the same distribution: scale to range [0, 1], [-1, 1] or z-scoring, where features are normalized to have 0-mean and 1-variance. Learn more about [normalization](https://developers.google.com/machine-learning/data-prep/transform/normalization). For best results the recommended value is about 10% - 20% of the standard deviation of the input feature. Refer to section 3.2 of the SmoothGrad paper: [https://arxiv.org/pdf/1706.03825.pdf](https://arxiv.org/pdf/1706.03825.pdf). Defaults to 0.1. If the distribution is different per feature, set feature_noise_sigma instead for each feature.
        - `noisy_sample_count`**Type**: `INT32`**Provider name**: `noisySampleCount`**Description**: The number of gradient samples to use for approximation. The higher this number, the more accurate the gradient is, but the runtime complexity increases by this factor as well. Valid range of its value is [1, 50]. Defaults to 3.
      - `step_count`**Type**: `INT32`**Provider name**: `stepCount`**Description**: Required. The number of steps for approximating the path integral. A good value to start is 50 and gradually increase until the sum to diff property is met within the desired error range. Valid range of its value is [1, 100], inclusively.
- `faster_deployment_config`**Type**: `STRUCT`**Provider name**: `fasterDeploymentConfig`**Description**: Configuration for faster model deployment.
  - `fast_tryout_enabled`**Type**: `BOOLEAN`**Provider name**: `fastTryoutEnabled`**Description**: If true, enable fast tryout feature for this deployed model.
- `gcp_display_name`**Type**: `STRING`**Provider name**: `displayName`**Description**: The display name of the DeployedModel. If not provided upon creation, the Model's display_name is used.
- `gcp_status`**Type**: `STRUCT`**Provider name**: `status`**Description**: Output only. Runtime status of the deployed model.
  - `available_replica_count`**Type**: `INT32`**Provider name**: `availableReplicaCount`**Description**: Output only. The number of available replicas of the deployed model.
  - `last_update_time`**Type**: `TIMESTAMP`**Provider name**: `lastUpdateTime`**Description**: Output only. The time at which the status was last updated.
  - `message`**Type**: `STRING`**Provider name**: `message`**Description**: Output only. The latest deployed model's status message (if any).
- `id`**Type**: `STRING`**Provider name**: `id`**Description**: Immutable. The ID of the DeployedModel. If not provided upon deployment, Vertex AI will generate a value for this ID. This value should be 1-10 characters, and valid characters are `/[0-9]/`.
- `model`**Type**: `STRING`**Provider name**: `model`**Description**: Required. The resource name of the Model that this is the deployment of. Note that the Model may be in a different location than the DeployedModel's Endpoint. The resource name may contain a version ID or version alias to specify the version. Example: `projects/{project}/locations/{location}/models/{model}@2` or `projects/{project}/locations/{location}/models/{model}@golden`. If no version is specified, the default version will be deployed.
- `model_version_id`**Type**: `STRING`**Provider name**: `modelVersionId`**Description**: Output only. The version ID of the model that is deployed.
- `private_endpoints`**Type**: `STRUCT`**Provider name**: `privateEndpoints`**Description**: Output only. Provides paths for users to send predict/explain/health requests directly to the deployed model services running on Cloud via private services access. This field is populated if a network is configured.
  - `explain_http_uri`**Type**: `STRING`**Provider name**: `explainHttpUri`**Description**: Output only. Http(s) path to send explain requests.
  - `health_http_uri`**Type**: `STRING`**Provider name**: `healthHttpUri`**Description**: Output only. Http(s) path to send health check requests.
  - `predict_http_uri`**Type**: `STRING`**Provider name**: `predictHttpUri`**Description**: Output only. Http(s) path to send prediction requests.
  - `service_attachment`**Type**: `STRING`**Provider name**: `serviceAttachment`**Description**: Output only. The name of the service attachment resource. Populated if private service connect is enabled.
- `service_account`**Type**: `STRING`**Provider name**: `serviceAccount`**Description**: The service account that the DeployedModel's container runs as. Specify the email address of the service account. If this service account is not specified, the container runs as a service account that doesn't have access to the resource project. Users deploying the Model must have the `iam.serviceAccounts.actAs` permission on this service account.
- `shared_resources`**Type**: `STRING`**Provider name**: `sharedResources`**Description**: The resource name of the shared DeploymentResourcePool to deploy on. Format: `projects/{project}/locations/{location}/deploymentResourcePools/{deployment_resource_pool}`
- `speculative_decoding_spec`**Type**: `STRUCT`**Provider name**: `speculativeDecodingSpec`**Description**: Optional. Spec for configuring speculative decoding.
  - `draft_model_speculation`**Type**: `STRUCT`**Provider name**: `draftModelSpeculation`**Description**: Draft model speculation.
    - `draft_model`**Type**: `STRING`**Provider name**: `draftModel`**Description**: Required. The resource name of the draft model.
  - `ngram_speculation`**Type**: `STRUCT`**Provider name**: `ngramSpeculation`**Description**: N-Gram speculation.
    - `ngram_size`**Type**: `INT32`**Provider name**: `ngramSize`**Description**: The number of last N input tokens used as an N-gram to search/match against the previous prompt sequence. This is equal to the N in N-Gram. The default value is 3 if not specified.
  - `speculative_token_count`**Type**: `INT32`**Provider name**: `speculativeTokenCount`**Description**: The number of speculative tokens to generate at each step.
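
Putting several of the fields above together, a `dedicated_resources` entry might look like the following sketch: a plain Python dict mirroring the provider's camelCase field names, with illustrative values (the machine type, accelerator, and replica counts are examples, not recommendations). It overrides the default CPU utilization autoscaling target of 60, as described under `autoscaling_metric_specs`.

```python
# Illustrative dedicatedResources config for a DeployedModel.
dedicated_resources = {
    "machineSpec": {
        "machineType": "n1-standard-4",        # example machine type
        "acceleratorType": "NVIDIA_TESLA_T4",  # one of the enum values above
        "acceleratorCount": 1,
    },
    "minReplicaCount": 1,  # required; must be >= 1
    "maxReplicaCount": 4,  # quota charges scale with this value
    "autoscalingMetricSpecs": [
        {
            # Override the target CPU utilization from the default 60% to 80%.
            "metricName": "aiplatform.googleapis.com/prediction/online/cpu/utilization",
            "target": 80,
        }
    ],
}
```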

## `description`{% #description %}

**Type**: `STRING`**Provider name**: `description`**Description**: The description of the Endpoint.

## `enable_private_service_connect`{% #enable_private_service_connect %}

**Type**: `BOOLEAN`**Provider name**: `enablePrivateServiceConnect`**Description**: Deprecated: If true, expose the Endpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set.

## `encryption_spec`{% #encryption_spec %}

**Type**: `STRUCT`**Provider name**: `encryptionSpec`**Description**: Customer-managed encryption key spec for an Endpoint. If set, this Endpoint and all sub-resources of this Endpoint will be secured by this key.

- `kms_key_name`**Type**: `STRING`**Provider name**: `kmsKeyName`**Description**: Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: `projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key`. The key needs to be in the same region as where the compute resource is created.
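
A minimal sketch of assembling the documented key-name format (the project, region, key ring, and key names below are hypothetical):

```python
# Hypothetical resource names; the key must be in the same
# region as the compute resource it protects.
project, region, key_ring, key = "my-project", "us-central1", "my-kr", "my-key"

kms_key_name = (
    f"projects/{project}/locations/{region}/"
    f"keyRings/{key_ring}/cryptoKeys/{key}"
)
```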

## `etag`{% #etag %}

**Type**: `STRING`**Provider name**: `etag`**Description**: Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens.

## `gcp_display_name`{% #gcp_display_name %}

**Type**: `STRING`**Provider name**: `displayName`**Description**: Required. The display name of the Endpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.

## `gen_ai_advanced_features_config`{% #gen_ai_advanced_features_config %}

**Type**: `STRUCT`**Provider name**: `genAiAdvancedFeaturesConfig`**Description**: Optional. Configuration for GenAiAdvancedFeatures. If the endpoint is serving GenAI models, advanced features like native RAG integration can be configured. Currently, only Model Garden models are supported.

- `rag_config`**Type**: `STRUCT`**Provider name**: `ragConfig`**Description**: Configuration for Retrieval Augmented Generation feature.
  - `enable_rag`**Type**: `BOOLEAN`**Provider name**: `enableRag`**Description**: If true, enable Retrieval Augmented Generation in ChatCompletion requests. Once enabled, the endpoint is identified as a GenAI endpoint and the Arthedain router is used.

## `labels`{% #labels %}

**Type**: `UNORDERED_LIST_STRING`

## `model_deployment_monitoring_job`{% #model_deployment_monitoring_job %}

**Type**: `STRING`**Provider name**: `modelDeploymentMonitoringJob`**Description**: Output only. Resource name of the Model Monitoring job associated with this Endpoint if monitoring is enabled by JobService.CreateModelDeploymentMonitoringJob. Format: `projects/{project}/locations/{location}/modelDeploymentMonitoringJobs/{model_deployment_monitoring_job}`

## `name`{% #name %}

**Type**: `STRING`**Provider name**: `name`**Description**: Output only. The resource name of the Endpoint.

## `network`{% #network %}

**Type**: `STRING`**Provider name**: `network`**Description**: Optional. The full name of the Google Compute Engine [network](https://cloud.google.com/compute/docs/networks-and-firewalls#networks) to which the Endpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. Only one of the fields, network or enable_private_service_connect, can be set. [Format](https://cloud.google.com/compute/docs/reference/rest/v1/networks/insert): `projects/{project}/global/networks/{network}`. Where `{project}` is a project number, as in `12345`, and `{network}` is a network name.
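The `projects/{project}/global/networks/{network}` form above can be parsed as a sketch. The `parse_network` helper is hypothetical, and the network-name character class assumes Compute Engine's usual lowercase-letters/digits/hyphens naming:

```python
import re

# Documented form: projects/{project}/global/networks/{network},
# where {project} is a project number such as 12345.
# Network-name charset is an assumption (GCE-style RFC 1035 names).
NETWORK_RE = re.compile(r"^projects/(\d+)/global/networks/([a-z][-a-z0-9]*)$")

def parse_network(name: str):
    """Return (project_number, network_name), or None if the form doesn't match."""
    m = NETWORK_RE.match(name)
    return (m.group(1), m.group(2)) if m else None
```

Note that a project *number* is expected here, so a string project ID such as `my-project` does not match.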

## `organization_id`{% #organization_id %}

**Type**: `STRING`

## `parent`{% #parent %}

**Type**: `STRING`

## `predict_request_response_logging_config`{% #predict_request_response_logging_config %}

**Type**: `STRUCT`**Provider name**: `predictRequestResponseLoggingConfig`**Description**: Configures the request-response logging for online prediction.

- `bigquery_destination`**Type**: `STRUCT`**Provider name**: `bigqueryDestination`**Description**: BigQuery table for logging. If only given a project, a new dataset will be created with name `logging_<endpoint-display-name>_<endpoint-id>`, where `<endpoint-display-name>` will be made BigQuery-dataset-name compatible (e.g. most special characters will become underscores). If no table name is given, a new table will be created with name `request_response_logging`
  - `output_uri`**Type**: `STRING`**Provider name**: `outputUri`**Description**: Required. BigQuery URI to a project or table, up to 2000 characters long. When only the project is specified, the Dataset and Table are created. When the full table reference is specified, the Dataset must exist and the table must not exist. Accepted forms: * BigQuery path. For example: `bq://projectId` or `bq://projectId.bqDatasetId` or `bq://projectId.bqDatasetId.bqTableId`.
- `enabled`**Type**: `BOOLEAN`**Provider name**: `enabled`**Description**: If logging is enabled or not.
- `sampling_rate`**Type**: `DOUBLE`**Provider name**: `samplingRate`**Description**: Percentage of requests to be logged, expressed as a fraction in the range (0, 1].
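The constraints on this struct (the accepted `bq://` URI forms, the 2000-character limit, and the (0, 1] sampling-rate range) can be sketched as a pre-flight validation. The `validate_logging_config` helper is hypothetical and checks only what the field descriptions above state:

```python
import re

# Accepted forms from the output_uri description:
# bq://projectId | bq://projectId.bqDatasetId | bq://projectId.bqDatasetId.bqTableId
BQ_URI_RE = re.compile(r"^bq://[^./]+(\.[^./]+){0,2}$")

def validate_logging_config(output_uri: str, enabled: bool,
                            sampling_rate: float) -> list:
    """Collect human-readable problems; an empty list means the config looks valid."""
    problems = []
    if len(output_uri) > 2000 or not BQ_URI_RE.match(output_uri):
        problems.append("output_uri is not a valid bq:// path")
    if enabled and not (0 < sampling_rate <= 1):
        problems.append("sampling_rate must be in (0, 1]")
    return problems
```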

## `private_service_connect_config`{% #private_service_connect_config %}

**Type**: `STRUCT`**Provider name**: `privateServiceConnectConfig`**Description**: Optional. Configuration for private service connect. network and private_service_connect_config are mutually exclusive.

- `enable_private_service_connect`**Type**: `BOOLEAN`**Provider name**: `enablePrivateServiceConnect`**Description**: Required. If true, expose the IndexEndpoint via private service connect.
- `project_allowlist`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `projectAllowlist`**Description**: A list of Projects from which the forwarding rule will target the service attachment.
- `service_attachment`**Type**: `STRING`**Provider name**: `serviceAttachment`**Description**: Output only. The name of the generated service attachment resource. This is only populated if the endpoint is deployed with PrivateServiceConnect.
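The mutual exclusivity stated above (and under `network`) — only one of `network` or private service connect may be configured — can be sketched as a small check; the `check_endpoint_networking` function is a hypothetical illustration, not an SDK call:

```python
def check_endpoint_networking(network, enable_private_service_connect):
    """Raise if both VPC peering and Private Service Connect are configured,
    mirroring the 'only one of the fields ... can be set' rule."""
    if network and enable_private_service_connect:
        raise ValueError(
            "network and enable_private_service_connect are mutually exclusive")

# Peering only: fine.
check_endpoint_networking("projects/12345/global/networks/my-net", False)
```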

## `project_id`{% #project_id %}

**Type**: `STRING`

## `project_number`{% #project_number %}

**Type**: `STRING`

## `region_id`{% #region_id %}

**Type**: `STRING`

## `resource_name`{% #resource_name %}

**Type**: `STRING`

## `satisfies_pzi`{% #satisfies_pzi %}

**Type**: `BOOLEAN`**Provider name**: `satisfiesPzi`**Description**: Output only. Reserved for future use.

## `satisfies_pzs`{% #satisfies_pzs %}

**Type**: `BOOLEAN`**Provider name**: `satisfiesPzs`**Description**: Output only. Reserved for future use.

## `tags`{% #tags %}

**Type**: `UNORDERED_LIST_STRING`

## `update_time`{% #update_time %}

**Type**: `TIMESTAMP`**Provider name**: `updateTime`**Description**: Output only. Timestamp when this Endpoint was last updated.

## `zone_id`{% #zone_id %}

**Type**: `STRING`
