---
title: gcp_aiplatform_custom_job
description: Reference for the gcp_aiplatform_custom_job resource in the Datadog Resource Catalog.
breadcrumbs: Docs > Infrastructure > Datadog Resource Catalog
---

# gcp_aiplatform_custom_job{% #gcp_aiplatform_custom_job %}

## `ancestors`{% #ancestors %}

**Type**: `UNORDERED_LIST_STRING`

## `create_time`{% #create_time %}

**Type**: `TIMESTAMP`**Provider name**: `createTime`**Description**: Output only. Time when the CustomJob was created.

## `encryption_spec`{% #encryption_spec %}

**Type**: `STRUCT`**Provider name**: `encryptionSpec`**Description**: Customer-managed encryption key options for a CustomJob. If this is set, then all resources created by the CustomJob will be encrypted with the provided encryption key.

- `kms_key_name`**Type**: `STRING`**Provider name**: `kmsKeyName`**Description**: Required. The Cloud KMS resource identifier of the customer managed encryption key used to protect a resource. Has the form: `projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key`. The key needs to be in the same region as where the compute resource is created.
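As a sketch of how the field above is shaped, a CustomJob created with a customer-managed key might carry an `encryptionSpec` like the following. The project, region, key ring, and key names are placeholders, and the region check is an illustrative helper, not part of any API:

```python
# Hypothetical encryptionSpec payload for a Vertex AI CustomJob.
# All resource names below are placeholders, not real resources.
encryption_spec = {
    "kmsKeyName": (
        "projects/my-project/locations/us-central1/"
        "keyRings/my-kr/cryptoKeys/my-key"
    )
}

# The key must live in the same region as the compute resource, so a
# quick client-side consistency check could parse the resource path:
# projects/{project}/locations/{region}/keyRings/{ring}/cryptoKeys/{key}
def key_region(kms_key_name: str) -> str:
    return kms_key_name.split("/")[3]

assert key_region(encryption_spec["kmsKeyName"]) == "us-central1"
```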

## `end_time`{% #end_time %}

**Type**: `TIMESTAMP`**Provider name**: `endTime`**Description**: Output only. Time when the CustomJob entered any of the following states: `JOB_STATE_SUCCEEDED`, `JOB_STATE_FAILED`, `JOB_STATE_CANCELLED`.

## `error`{% #error %}

**Type**: `STRUCT`**Provider name**: `error`**Description**: Output only. Only populated when job's state is `JOB_STATE_FAILED` or `JOB_STATE_CANCELLED`.

- `code`**Type**: `INT32`**Provider name**: `code`**Description**: The status code, which should be an enum value of google.rpc.Code.
- `message`**Type**: `STRING`**Provider name**: `message`**Description**: A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client.
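Since `error` is only populated for failed or cancelled jobs, a consumer of this resource might summarize it as below. The job payload is illustrative, not from a live API call:

```python
# Sketch: interpreting the `error` STRUCT on a failed CustomJob.
# The payload below is a made-up example, not real API output.
job = {
    "state": "JOB_STATE_FAILED",
    "error": {"code": 3, "message": "Invalid worker pool spec."},
}

# `error` is only populated when the state is JOB_STATE_FAILED
# or JOB_STATE_CANCELLED, so guard on the state first.
if job["state"] in ("JOB_STATE_FAILED", "JOB_STATE_CANCELLED"):
    err = job.get("error", {})
    summary = f"code={err.get('code')}: {err.get('message')}"
else:
    summary = "no error"
```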

## `gcp_display_name`{% #gcp_display_name %}

**Type**: `STRING`**Provider name**: `displayName`**Description**: Required. The display name of the CustomJob. The name can be up to 128 characters long and can consist of any UTF-8 characters.

## `job_spec`{% #job_spec %}

**Type**: `STRUCT`**Provider name**: `jobSpec`**Description**: Required. Job spec.

- `base_output_directory`**Type**: `STRUCT`**Provider name**: `baseOutputDirectory`**Description**: The Cloud Storage location to store the output of this CustomJob or HyperparameterTuningJob. For HyperparameterTuningJob, the baseOutputDirectory of each child CustomJob backing a Trial is set to a subdirectory named after the Trial id under its parent HyperparameterTuningJob's baseOutputDirectory. The following Vertex AI environment variables will be passed to containers or python modules when this field is set: For CustomJob: * AIP_MODEL_DIR = `<base_output_directory>/model/` * AIP_CHECKPOINT_DIR = `<base_output_directory>/checkpoints/` * AIP_TENSORBOARD_LOG_DIR = `<base_output_directory>/logs/` For CustomJob backing a Trial of HyperparameterTuningJob: * AIP_MODEL_DIR = `<base_output_directory>/<trial_id>/model/` * AIP_CHECKPOINT_DIR = `<base_output_directory>/<trial_id>/checkpoints/` * AIP_TENSORBOARD_LOG_DIR = `<base_output_directory>/<trial_id>/logs/`
  - `output_uri_prefix`**Type**: `STRING`**Provider name**: `outputUriPrefix`**Description**: Required. Google Cloud Storage URI to the output directory. If the URI doesn't end with '/', a '/' is automatically appended. The directory is created if it doesn't exist.
- `enable_dashboard_access`**Type**: `BOOLEAN`**Provider name**: `enableDashboardAccess`**Description**: Optional. Whether you want Vertex AI to enable access to the customized dashboard in training chief container. If set to `true`, you can access the dashboard at the URIs given by CustomJob.web_access_uris or Trial.web_access_uris (within HyperparameterTuningJob.trials).
- `enable_web_access`**Type**: `BOOLEAN`**Provider name**: `enableWebAccess`**Description**: Optional. Whether you want Vertex AI to enable [interactive shell access](https://cloud.google.com/vertex-ai/docs/training/monitor-debug-interactive-shell) to training containers. If set to `true`, you can access interactive shells at the URIs given by CustomJob.web_access_uris or Trial.web_access_uris (within HyperparameterTuningJob.trials).
- `experiment`**Type**: `STRING`**Provider name**: `experiment`**Description**: Optional. The Experiment associated with this job. Format: `projects/{project}/locations/{location}/metadataStores/{metadataStores}/contexts/{experiment-name}`
- `experiment_run`**Type**: `STRING`**Provider name**: `experimentRun`**Description**: Optional. The Experiment Run associated with this job. Format: `projects/{project}/locations/{location}/metadataStores/{metadataStores}/contexts/{experiment-name}-{experiment-run-name}`
- `models`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `models`**Description**: Optional. The name of the Model resources for which to generate a mapping to artifact URIs. Applicable only to some of the Google-provided custom jobs. Format: `projects/{project}/locations/{location}/models/{model}` In order to retrieve a specific version of the model, also provide the version ID or version alias. Example: `projects/{project}/locations/{location}/models/{model}@2` or `projects/{project}/locations/{location}/models/{model}@golden` If no version ID or alias is specified, the "default" version will be returned. The "default" version alias is created for the first version of the model, and can be moved to other versions later on. There will be exactly one default version.
- `network`**Type**: `STRING`**Provider name**: `network`**Description**: Optional. The full name of the Compute Engine [network](https://cloud.google.com/compute/docs/networks-and-firewalls#networks) to which the Job should be peered. For example, `projects/12345/global/networks/myVPC`. [Format](https://cloud.google.com/compute/docs/reference/rest/v1/networks/insert) is of the form `projects/{project}/global/networks/{network}`. Where {project} is a project number, as in `12345`, and {network} is a network name. To specify this field, you must have already [configured VPC Network Peering for Vertex AI](https://cloud.google.com/vertex-ai/docs/general/vpc-peering). If this field is left unspecified, the job is not peered with any network.
- `persistent_resource_id`**Type**: `STRING`**Provider name**: `persistentResourceId`**Description**: Optional. The ID of the PersistentResource in the same Project and Location in which to run the job. If this is specified, the job runs on existing machines held by the PersistentResource instead of on-demand short-lived machines. The network and CMEK configs on the job should be consistent with those on the PersistentResource, otherwise the job will be rejected.
- `protected_artifact_location_id`**Type**: `STRING`**Provider name**: `protectedArtifactLocationId`**Description**: The ID of the location to store protected artifacts. e.g. us-central1. Populate only when the location is different than CustomJob location. List of supported locations: [https://cloud.google.com/vertex-ai/docs/general/locations](https://cloud.google.com/vertex-ai/docs/general/locations)
- `reserved_ip_ranges`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `reservedIpRanges`**Description**: Optional. A list of names for the reserved IP ranges under the VPC network that can be used for this job. If set, the job is deployed within the provided IP ranges; otherwise, the job is deployed to any IP ranges under the provided VPC network. Example: ['vertex-ai-ip-range'].
- `scheduling`**Type**: `STRUCT`**Provider name**: `scheduling`**Description**: Scheduling options for a CustomJob.
  - `disable_retries`**Type**: `BOOLEAN`**Provider name**: `disableRetries`**Description**: Optional. Indicates if the job should retry for internal errors after the job starts running. If true, overrides `Scheduling.restart_job_on_worker_restart` to false.
  - `max_wait_duration`**Type**: `STRING`**Provider name**: `maxWaitDuration`**Description**: Optional. This is the maximum duration that a job will wait for the requested resources to be provisioned if the scheduling strategy is set to `FLEX_START`. If set to 0, the job will wait indefinitely. The default is 24 hours.
  - `restart_job_on_worker_restart`**Type**: `BOOLEAN`**Provider name**: `restartJobOnWorkerRestart`**Description**: Optional. Restarts the entire CustomJob if a worker gets restarted. This feature can be used by distributed training jobs that are not resilient to workers leaving and joining a job.
  - `strategy`**Type**: `STRING`**Provider name**: `strategy`**Description**: Optional. This determines which type of scheduling strategy to use.**Possible values**:
    - `STRATEGY_UNSPECIFIED` - Strategy will default to STANDARD.
    - `ON_DEMAND` - Deprecated. Regular on-demand provisioning strategy.
    - `LOW_COST` - Deprecated. Low cost by making potential use of spot resources.
    - `STANDARD` - Standard provisioning strategy uses regular on-demand resources.
    - `SPOT` - Spot provisioning strategy uses spot resources.
    - `FLEX_START` - Flex Start strategy uses DWS to queue for resources.
  - `timeout`**Type**: `STRING`**Provider name**: `timeout`**Description**: Optional. The maximum job running time. The default is 7 days.
- `service_account`**Type**: `STRING`**Provider name**: `serviceAccount`**Description**: Specifies the service account for workload run-as account. Users submitting jobs must have act-as permission on this run-as account. If unspecified, the [Vertex AI Custom Code Service Agent](https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents) for the CustomJob's project is used.
- `tensorboard`**Type**: `STRING`**Provider name**: `tensorboard`**Description**: Optional. The name of a Vertex AI Tensorboard resource to which this CustomJob will upload Tensorboard logs. Format: `projects/{project}/locations/{location}/tensorboards/{tensorboard}`
- `worker_pool_specs`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `workerPoolSpecs`**Description**: Required. The spec of the worker pools including machine type and Docker image. All worker pools except the first one are optional and can be skipped by providing an empty value.
  - `container_spec`**Type**: `STRUCT`**Provider name**: `containerSpec`**Description**: The custom container task.
    - `args`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `args`**Description**: The arguments to be passed when starting the container.
    - `command`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `command`**Description**: The command to be invoked when the container is started. It overrides the entrypoint instruction in Dockerfile when provided.
    - `env`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `env`**Description**: Environment variables to be passed to the container. Maximum limit is 100.
      - `name`**Type**: `STRING`**Provider name**: `name`**Description**: Required. Name of the environment variable. Must be a valid C identifier.
      - `value`**Type**: `STRING`**Provider name**: `value`**Description**: Required. Variables that reference a $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string is left unchanged. The $(VAR_NAME) syntax can be escaped with a double $$, i.e. $$(VAR_NAME). Escaped references are never expanded, regardless of whether the variable exists.
    - `image_uri`**Type**: `STRING`**Provider name**: `imageUri`**Description**: Required. The URI of a container image in the Container Registry that is to be run on each worker replica.
  - `disk_spec`**Type**: `STRUCT`**Provider name**: `diskSpec`**Description**: Disk spec.
    - `boot_disk_size_gb`**Type**: `INT32`**Provider name**: `bootDiskSizeGb`**Description**: Size in GB of the boot disk (default is 100GB).
    - `boot_disk_type`**Type**: `STRING`**Provider name**: `bootDiskType`**Description**: Type of the boot disk. For non-A3U machines, the default value is "pd-ssd", for A3U machines, the default value is "hyperdisk-balanced". Valid values: "pd-ssd" (Persistent Disk Solid State Drive), "pd-standard" (Persistent Disk Hard Disk Drive) or "hyperdisk-balanced".
  - `machine_spec`**Type**: `STRUCT`**Provider name**: `machineSpec`**Description**: Optional. Immutable. The specification of a single machine.
    - `accelerator_count`**Type**: `INT32`**Provider name**: `acceleratorCount`**Description**: The number of accelerators to attach to the machine.
    - `accelerator_type`**Type**: `STRING`**Provider name**: `acceleratorType`**Description**: Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count.**Possible values**:
      - `ACCELERATOR_TYPE_UNSPECIFIED` - Unspecified accelerator type, which means no accelerator.
      - `NVIDIA_TESLA_K80` - Deprecated: Nvidia Tesla K80 GPU has reached end of support, see [https://cloud.google.com/compute/docs/eol/k80-eol](https://cloud.google.com/compute/docs/eol/k80-eol).
      - `NVIDIA_TESLA_P100` - Nvidia Tesla P100 GPU.
      - `NVIDIA_TESLA_V100` - Nvidia Tesla V100 GPU.
      - `NVIDIA_TESLA_P4` - Nvidia Tesla P4 GPU.
      - `NVIDIA_TESLA_T4` - Nvidia Tesla T4 GPU.
      - `NVIDIA_TESLA_A100` - Nvidia Tesla A100 GPU.
      - `NVIDIA_A100_80GB` - Nvidia A100 80GB GPU.
      - `NVIDIA_L4` - Nvidia L4 GPU.
      - `NVIDIA_H100_80GB` - Nvidia H100 80GB GPU.
      - `NVIDIA_H100_MEGA_80GB` - Nvidia H100 Mega 80GB GPU.
      - `NVIDIA_H200_141GB` - Nvidia H200 141GB GPU.
      - `TPU_V2` - TPU v2.
      - `TPU_V3` - TPU v3.
      - `TPU_V4_POD` - TPU v4.
      - `TPU_V5_LITEPOD` - TPU v5.
    - `machine_type`**Type**: `STRING`**Provider name**: `machineType`**Description**: Immutable. The type of the machine. See the [list of machine types supported for prediction](https://cloud.google.com/vertex-ai/docs/predictions/configure-compute#machine-types) and the [list of machine types supported for custom training](https://cloud.google.com/vertex-ai/docs/training/configure-compute#machine-types). For DeployedModel this field is optional, and the default value is `n1-standard-2`. For BatchPredictionJob or as part of WorkerPoolSpec this field is required.
    - `reservation_affinity`**Type**: `STRUCT`**Provider name**: `reservationAffinity`**Description**: Optional. Immutable. Configuration controlling how this resource pool consumes reservation.
      - `key`**Type**: `STRING`**Provider name**: `key`**Description**: Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use `compute.googleapis.com/reservation-name` as the key and specify the name of your reservation as its value.
      - `reservation_affinity_type`**Type**: `STRING`**Provider name**: `reservationAffinityType`**Description**: Required. Specifies the reservation affinity type.**Possible values**:
        - `TYPE_UNSPECIFIED` - Default value. This should not be used.
        - `NO_RESERVATION` - Do not consume from any reserved capacity, only use on-demand.
        - `ANY_RESERVATION` - Consume any reservation available, falling back to on-demand.
        - `SPECIFIC_RESERVATION` - Consume from a specific reservation. When chosen, the reservation must be identified via the `key` and `values` fields.
      - `values`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `values`**Description**: Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation or reservation block.
    - `tpu_topology`**Type**: `STRING`**Provider name**: `tpuTopology`**Description**: Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1").
  - `nfs_mounts`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `nfsMounts`**Description**: Optional. List of NFS mount spec.
    - `mount_point`**Type**: `STRING`**Provider name**: `mountPoint`**Description**: Required. Destination mount path. The NFS will be mounted for the user under `/mnt/nfs/`.
    - `path`**Type**: `STRING`**Provider name**: `path`**Description**: Required. Source path exported from the NFS server. Has to start with '/', and combined with the IP address, it indicates the source mount path in the form of `server:path`.
    - `server`**Type**: `STRING`**Provider name**: `server`**Description**: Required. IP address of the NFS server.
  - `python_package_spec`**Type**: `STRUCT`**Provider name**: `pythonPackageSpec`**Description**: The Python packaged task.
    - `args`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `args`**Description**: Command line arguments to be passed to the Python task.
    - `env`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `env`**Description**: Environment variables to be passed to the python module. Maximum limit is 100.
      - `name`**Type**: `STRING`**Provider name**: `name`**Description**: Required. Name of the environment variable. Must be a valid C identifier.
      - `value`**Type**: `STRING`**Provider name**: `value`**Description**: Required. Variables that reference a $(VAR_NAME) are expanded using the previously defined environment variables in the container and any service environment variables. If a variable cannot be resolved, the reference in the input string is left unchanged. The $(VAR_NAME) syntax can be escaped with a double $$, i.e. $$(VAR_NAME). Escaped references are never expanded, regardless of whether the variable exists.
    - `executor_image_uri`**Type**: `STRING`**Provider name**: `executorImageUri`**Description**: Required. The URI of a container image in Artifact Registry that will run the provided Python package. Vertex AI provides a wide range of executor images with pre-installed packages to meet users' various use cases. See the list of [pre-built containers for training](https://cloud.google.com/vertex-ai/docs/training/pre-built-containers). You must use an image from this list.
    - `package_uris`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `packageUris`**Description**: Required. The Google Cloud Storage location of the Python package files which are the training program and its dependent packages. The maximum number of package URIs is 100.
    - `python_module`**Type**: `STRING`**Provider name**: `pythonModule`**Description**: Required. The Python module name to run after installing the packages.
  - `replica_count`**Type**: `INT64`**Provider name**: `replicaCount`**Description**: Optional. The number of worker replicas to use for this worker pool.
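Putting the fields above together, a minimal `jobSpec` for a single-replica custom container job might look like the following sketch. The image URI, bucket, and service account are placeholders, and the field values are one plausible combination rather than recommended settings:

```python
# Hypothetical jobSpec for a single-replica custom container CustomJob.
# All names (image, bucket, service account) are placeholders.
job_spec = {
    "workerPoolSpecs": [
        {
            "machineSpec": {
                "machineType": "n1-standard-4",
                "acceleratorType": "NVIDIA_TESLA_T4",
                "acceleratorCount": 1,
            },
            "replicaCount": 1,
            "diskSpec": {"bootDiskType": "pd-ssd", "bootDiskSizeGb": 100},
            "containerSpec": {
                "imageUri": "us-docker.pkg.dev/my-project/my-repo/train:latest",
                "command": ["python", "-m", "trainer.task"],
                "args": ["--epochs", "10"],
                "env": [{"name": "MODE", "value": "train"}],
            },
        }
    ],
    # AIP_MODEL_DIR etc. are derived from this prefix (see
    # base_output_directory above).
    "baseOutputDirectory": {"outputUriPrefix": "gs://my-bucket/jobs/run-1"},
    # timeout defaults to 7 days; expressed here explicitly in seconds.
    "scheduling": {"timeout": "604800s", "restartJobOnWorkerRestart": False},
    "serviceAccount": "trainer@my-project.iam.gserviceaccount.com",
}
```

Only the first worker pool is required; additional pools (for example, parameter servers in distributed training) would be appended to `workerPoolSpecs`.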

## `labels`{% #labels %}

**Type**: `UNORDERED_LIST_STRING`

## `name`{% #name %}

**Type**: `STRING`**Provider name**: `name`**Description**: Output only. Resource name of a CustomJob.

## `organization_id`{% #organization_id %}

**Type**: `STRING`

## `parent`{% #parent %}

**Type**: `STRING`

## `project_id`{% #project_id %}

**Type**: `STRING`

## `project_number`{% #project_number %}

**Type**: `STRING`

## `region_id`{% #region_id %}

**Type**: `STRING`

## `resource_name`{% #resource_name %}

**Type**: `STRING`

## `satisfies_pzi`{% #satisfies_pzi %}

**Type**: `BOOLEAN`**Provider name**: `satisfiesPzi`**Description**: Output only. Reserved for future use.

## `satisfies_pzs`{% #satisfies_pzs %}

**Type**: `BOOLEAN`**Provider name**: `satisfiesPzs`**Description**: Output only. Reserved for future use.

## `start_time`{% #start_time %}

**Type**: `TIMESTAMP`**Provider name**: `startTime`**Description**: Output only. Time when the CustomJob for the first time entered the `JOB_STATE_RUNNING` state.
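Combined with `end_time` above, this field supports a simple run-duration calculation. The timestamps below are made-up RFC 3339 values for illustration:

```python
# Sketch: deriving a job's run duration from start_time / end_time.
# Example timestamps are fabricated, not from a real job.
from datetime import datetime

start_time = "2024-05-01T12:00:00Z"
end_time = "2024-05-01T13:30:00Z"

def parse_ts(ts: str) -> datetime:
    # fromisoformat doesn't accept a bare 'Z' suffix on older
    # Pythons, so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

duration = parse_ts(end_time) - parse_ts(start_time)
print(duration)  # 1:30:00
```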

## `state`{% #state %}

**Type**: `STRING`**Provider name**: `state`**Description**: Output only. The detailed state of the job.**Possible values**:

- `JOB_STATE_UNSPECIFIED` - The job state is unspecified.
- `JOB_STATE_QUEUED` - The job has just been created or resumed, and processing has not yet begun.
- `JOB_STATE_PENDING` - The service is preparing to run the job.
- `JOB_STATE_RUNNING` - The job is in progress.
- `JOB_STATE_SUCCEEDED` - The job completed successfully.
- `JOB_STATE_FAILED` - The job failed.
- `JOB_STATE_CANCELLING` - The job is being cancelled. From this state the job may only go to either `JOB_STATE_SUCCEEDED`, `JOB_STATE_FAILED` or `JOB_STATE_CANCELLED`.
- `JOB_STATE_CANCELLED` - The job has been cancelled.
- `JOB_STATE_PAUSED` - The job has been stopped, and can be resumed.
- `JOB_STATE_EXPIRED` - The job has expired.
- `JOB_STATE_UPDATING` - The job is being updated. Only jobs in the `RUNNING` state can be updated. After updating, the job goes back to the `RUNNING` state.
- `JOB_STATE_PARTIALLY_SUCCEEDED` - The job partially succeeded; some results may be missing due to errors.
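A monitor polling this field typically needs to know which states are final. The terminal set below is inferred from the state descriptions in this section (treat it as an assumption, not an API guarantee):

```python
# Sketch: classifying CustomJob states. The terminal set is derived
# from the state list above, not from any official API constant.
TERMINAL_STATES = {
    "JOB_STATE_SUCCEEDED",
    "JOB_STATE_FAILED",
    "JOB_STATE_CANCELLED",
    "JOB_STATE_EXPIRED",
    "JOB_STATE_PARTIALLY_SUCCEEDED",
}

def is_terminal(state: str) -> bool:
    """Return True if the job will not transition further."""
    return state in TERMINAL_STATES
```

Note that `JOB_STATE_PAUSED` is deliberately excluded: a paused job can be resumed, so it is not terminal.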

## `tags`{% #tags %}

**Type**: `UNORDERED_LIST_STRING`

## `update_time`{% #update_time %}

**Type**: `TIMESTAMP`**Provider name**: `updateTime`**Description**: Output only. Time when the CustomJob was most recently updated.

## `zone_id`{% #zone_id %}

**Type**: `STRING`
