---
title: Getting Started with Datadog
description: Datadog, the leading service for cloud-scale monitoring.
---

# aws_sagemaker_processingjob{% #aws_sagemaker_processingjob %}

## `account_id`{% #account_id %}

**Type**: `STRING`

## `app_specification`{% #app_specification %}

**Type**: `STRUCT`**Provider name**: `AppSpecification`**Description**: Configures the processing job to run a specified container image.

- `container_arguments`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `ContainerArguments`**Description**: The arguments for a container used to run a processing job.
- `container_entrypoint`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `ContainerEntrypoint`**Description**: The entrypoint for a container used to run a processing job.
- `image_uri`**Type**: `STRING`**Provider name**: `ImageUri`**Description**: The container image to be run by the processing job.

## `auto_ml_job_arn`{% #auto_ml_job_arn %}

**Type**: `STRING`**Provider name**: `AutoMLJobArn`**Description**: The ARN of an AutoML job associated with this processing job.

## `creation_time`{% #creation_time %}

**Type**: `TIMESTAMP`**Provider name**: `CreationTime`**Description**: The time at which the processing job was created.

## `environment`{% #environment %}

**Type**: `MAP_STRING_STRING`**Provider name**: `Environment`**Description**: The environment variables set in the Docker container.

## `exit_message`{% #exit_message %}

**Type**: `STRING`**Provider name**: `ExitMessage`**Description**: An optional string, up to one KB in size, that contains metadata from the processing container when the processing job exits.

## `experiment_config`{% #experiment_config %}

**Type**: `STRUCT`**Provider name**: `ExperimentConfig`**Description**: The configuration information used to create an experiment.

- `experiment_name`**Type**: `STRING`**Provider name**: `ExperimentName`**Description**: The name of an existing experiment to associate with the trial component.
- `run_name`**Type**: `STRING`**Provider name**: `RunName`**Description**: The name of the experiment run to associate with the trial component.
- `trial_component_display_name`**Type**: `STRING`**Provider name**: `TrialComponentDisplayName`**Description**: The display name for the trial component. If this key isn't specified, the display name is the trial component name.
- `trial_name`**Type**: `STRING`**Provider name**: `TrialName`**Description**: The name of an existing trial to associate the trial component with. If not specified, a new trial is created.

## `failure_reason`{% #failure_reason %}

**Type**: `STRING`**Provider name**: `FailureReason`**Description**: A string, up to one KB in size, that contains the reason a processing job failed, if it failed.

## `last_modified_time`{% #last_modified_time %}

**Type**: `TIMESTAMP`**Provider name**: `LastModifiedTime`**Description**: The time at which the processing job was last modified.

## `monitoring_schedule_arn`{% #monitoring_schedule_arn %}

**Type**: `STRING`**Provider name**: `MonitoringScheduleArn`**Description**: The ARN of a monitoring schedule for an endpoint associated with this processing job.

## `network_config`{% #network_config %}

**Type**: `STRUCT`**Provider name**: `NetworkConfig`**Description**: Networking options for a processing job.

- `enable_inter_container_traffic_encryption`**Type**: `BOOLEAN`**Provider name**: `EnableInterContainerTrafficEncryption`**Description**: Whether to encrypt all communications between distributed processing jobs. Choose `True` to encrypt communications. Encryption provides greater security for distributed processing jobs, but the processing might take longer.
- `enable_network_isolation`**Type**: `BOOLEAN`**Provider name**: `EnableNetworkIsolation`**Description**: Whether to allow inbound and outbound network calls to and from the containers used for the processing job.
- `vpc_config`**Type**: `STRUCT`**Provider name**: `VpcConfig`
  - `security_group_ids`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `SecurityGroupIds`**Description**: The VPC security group IDs, in the form `sg-xxxxxxxx`. Specify the security groups for the VPC that is specified in the `Subnets` field.
  - `subnets`**Type**: `UNORDERED_LIST_STRING`**Provider name**: `Subnets`**Description**: The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see [Supported Instance Types and Availability Zones](https://docs.aws.amazon.com/sagemaker/latest/dg/instance-types-az.html).

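The fields above lend themselves to a simple posture check. The following is a minimal sketch, assuming a `network_config` record is available as a plain Python dictionary keyed by the field names on this page, and assuming `enable_network_isolation` set to `True` means the containers are isolated (as the field name suggests). The function name `network_findings` and the sample values are hypothetical.

```python
def network_findings(network_config):
    """Return a list of findings for one network_config struct (assumed to be a dict)."""
    findings = []
    cfg = network_config or {}

    # Assumption: True means the containers cannot make network calls.
    if not cfg.get("enable_network_isolation", False):
        findings.append("network isolation is not enabled")

    if not cfg.get("enable_inter_container_traffic_encryption", False):
        findings.append("inter-container traffic encryption is not enabled")

    # vpc_config is optional; report jobs that have no subnets listed.
    vpc = cfg.get("vpc_config") or {}
    if not vpc.get("subnets"):
        findings.append("no VPC subnets configured")

    return findings


# Hypothetical sample record, not real data.
example = {
    "enable_network_isolation": True,
    "enable_inter_container_traffic_encryption": False,
    "vpc_config": {
        "security_group_ids": ["sg-0123abcd"],
        "subnets": ["subnet-0123abcd"],
    },
}

print(network_findings(example))
```

An empty list means none of the three checks fired. Which checks matter depends on your own policy, so treat the list as a starting point.
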
## `processing_end_time`{% #processing_end_time %}

**Type**: `TIMESTAMP`**Provider name**: `ProcessingEndTime`**Description**: The time at which the processing job completed.

## `processing_inputs`{% #processing_inputs %}

**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `ProcessingInputs`**Description**: The inputs for a processing job.

- `app_managed`**Type**: `BOOLEAN`**Provider name**: `AppManaged`**Description**: When `True`, input operations such as data download are managed natively by the processing job application. When `False` (default), input operations are managed by Amazon SageMaker.
- `dataset_definition`**Type**: `STRUCT`**Provider name**: `DatasetDefinition`**Description**: Configuration for a Dataset Definition input.
  - `athena_dataset_definition`**Type**: `STRUCT`**Provider name**: `AthenaDatasetDefinition`
    - `catalog`**Type**: `STRING`**Provider name**: `Catalog`
    - `database`**Type**: `STRING`**Provider name**: `Database`
    - `kms_key_id`**Type**: `STRING`**Provider name**: `KmsKeyId`**Description**: The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt data generated from an Athena query execution.
    - `output_compression`**Type**: `STRING`**Provider name**: `OutputCompression`
    - `output_format`**Type**: `STRING`**Provider name**: `OutputFormat`
    - `output_s3_uri`**Type**: `STRING`**Provider name**: `OutputS3Uri`**Description**: The location in Amazon S3 where Athena query results are stored.
    - `query_string`**Type**: `STRING`**Provider name**: `QueryString`
    - `work_group`**Type**: `STRING`**Provider name**: `WorkGroup`
  - `data_distribution_type`**Type**: `STRING`**Provider name**: `DataDistributionType`**Description**: Whether the generated dataset is `FullyReplicated` or `ShardedByS3Key` (default).
  - `input_mode`**Type**: `STRING`**Provider name**: `InputMode`**Description**: Whether to use `File` or `Pipe` input mode. In `File` (default) mode, Amazon SageMaker copies the data from the input source onto the local Amazon Elastic Block Store (Amazon EBS) volumes before starting your training algorithm. This is the most commonly used input mode. In `Pipe` mode, Amazon SageMaker streams input data from the source directly to your algorithm without using the EBS volume.
  - `local_path`**Type**: `STRING`**Provider name**: `LocalPath`**Description**: The local path where you want Amazon SageMaker to download the Dataset Definition inputs to run a processing job. `LocalPath` is an absolute path to the input data. This is a required parameter when `AppManaged` is `False` (default).
  - `redshift_dataset_definition`**Type**: `STRUCT`**Provider name**: `RedshiftDatasetDefinition`
    - `cluster_id`**Type**: `STRING`**Provider name**: `ClusterId`
    - `cluster_role_arn`**Type**: `STRING`**Provider name**: `ClusterRoleArn`**Description**: The IAM role attached to your Redshift cluster that Amazon SageMaker uses to generate datasets.
    - `database`**Type**: `STRING`**Provider name**: `Database`
    - `db_user`**Type**: `STRING`**Provider name**: `DbUser`
    - `kms_key_id`**Type**: `STRING`**Provider name**: `KmsKeyId`**Description**: The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt data from a Redshift execution.
    - `output_compression`**Type**: `STRING`**Provider name**: `OutputCompression`
    - `output_format`**Type**: `STRING`**Provider name**: `OutputFormat`
    - `output_s3_uri`**Type**: `STRING`**Provider name**: `OutputS3Uri`**Description**: The location in Amazon S3 where the Redshift query results are stored.
    - `query_string`**Type**: `STRING`**Provider name**: `QueryString`
- `input_name`**Type**: `STRING`**Provider name**: `InputName`**Description**: The name for the processing job input.
- `s3_input`**Type**: `STRUCT`**Provider name**: `S3Input`**Description**: Configuration for downloading input data from Amazon S3 into the processing container.
  - `local_path`**Type**: `STRING`**Provider name**: `LocalPath`**Description**: The local path in your container where you want Amazon SageMaker to write input data to. `LocalPath` is an absolute path to the input data and must begin with `/opt/ml/processing/`. `LocalPath` is a required parameter when `AppManaged` is `False` (default).
  - `s3_compression_type`**Type**: `STRING`**Provider name**: `S3CompressionType`**Description**: Whether to GZIP-decompress the data in Amazon S3 as it is streamed into the processing container. `Gzip` can only be used when `Pipe` mode is specified as the `S3InputMode`. In `Pipe` mode, Amazon SageMaker streams input data from the source directly to your container without using the EBS volume.
  - `s3_data_distribution_type`**Type**: `STRING`**Provider name**: `S3DataDistributionType`**Description**: Whether to distribute the data from Amazon S3 to all processing instances with `FullyReplicated`, or whether the data from Amazon S3 is sharded by Amazon S3 key, downloading one shard of data to each processing instance.
  - `s3_data_type`**Type**: `STRING`**Provider name**: `S3DataType`**Description**: Whether you use an `S3Prefix` or a `ManifestFile` for the data type. If you choose `S3Prefix`, `S3Uri` identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for the processing job. If you choose `ManifestFile`, `S3Uri` identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for the processing job.
  - `s3_input_mode`**Type**: `STRING`**Provider name**: `S3InputMode`**Description**: Whether to use `File` or `Pipe` input mode. In `File` mode, Amazon SageMaker copies the data from the input source onto the local ML storage volume before starting your processing container. This is the most commonly used input mode. In `Pipe` mode, Amazon SageMaker streams input data from the source directly to your processing container into named pipes without using the ML storage volume.
  - `s3_uri`**Type**: `STRING`**Provider name**: `S3Uri`**Description**: The URI of the Amazon S3 prefix from which Amazon SageMaker downloads the data required to run a processing job.

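The `s3_input` descriptions above state three constraints: `local_path` is required when `app_managed` is `False`, `local_path` must begin with `/opt/ml/processing/`, and `Gzip` can only be used with `Pipe` input mode. The following is a minimal sketch of checking them, assuming one entry of `processing_inputs` is available as a plain Python dictionary keyed by the field names on this page. The function name `check_s3_input` and the sample values are hypothetical.

```python
REQUIRED_PREFIX = "/opt/ml/processing/"


def check_s3_input(processing_input):
    """Return a list of constraint violations for one processing_inputs entry."""
    problems = []
    s3 = processing_input.get("s3_input") or {}
    # app_managed defaults to False according to the field description.
    app_managed = processing_input.get("app_managed", False)
    local_path = s3.get("local_path")

    if not app_managed and not local_path:
        problems.append("local_path is required when app_managed is False")

    if local_path and not local_path.startswith(REQUIRED_PREFIX):
        problems.append("local_path must begin with /opt/ml/processing/")

    if s3.get("s3_compression_type") == "Gzip" and s3.get("s3_input_mode") != "Pipe":
        problems.append("Gzip compression requires Pipe input mode")

    return problems


# Hypothetical sample entry, not real data.
sample_input = {
    "input_name": "raw-data",
    "app_managed": False,
    "s3_input": {
        "local_path": "/opt/ml/processing/input",
        "s3_compression_type": "Gzip",
        "s3_input_mode": "File",
        "s3_uri": "s3://example-bucket/raw/",
    },
}

print(check_s3_input(sample_input))
```

The sample combines `Gzip` with `File` mode, so the only reported problem is the compression and input mode mismatch.
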
## `processing_job_arn`{% #processing_job_arn %}

**Type**: `STRING`**Provider name**: `ProcessingJobArn`**Description**: The Amazon Resource Name (ARN) of the processing job.

## `processing_job_name`{% #processing_job_name %}

**Type**: `STRING`**Provider name**: `ProcessingJobName`**Description**: The name of the processing job. The name must be unique within an Amazon Web Services Region in the Amazon Web Services account.

## `processing_job_status`{% #processing_job_status %}

**Type**: `STRING`**Provider name**: `ProcessingJobStatus`**Description**: Provides the status of a processing job.

## `processing_output_config`{% #processing_output_config %}

**Type**: `STRUCT`**Provider name**: `ProcessingOutputConfig`**Description**: Output configuration for the processing job.

- `kms_key_id`**Type**: `STRING`**Provider name**: `KmsKeyId`**Description**: The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt the processing job output. `KmsKeyId` can be an ID of a KMS key, ARN of a KMS key, or alias of a KMS key. The `KmsKeyId` is applied to all outputs.
- `outputs`**Type**: `UNORDERED_LIST_STRUCT`**Provider name**: `Outputs`**Description**: An array of outputs configuring the data to upload from the processing container.
  - `app_managed`**Type**: `BOOLEAN`**Provider name**: `AppManaged`**Description**: When `True`, output operations such as data upload are managed natively by the processing job application. When `False` (default), output operations are managed by Amazon SageMaker.
  - `feature_store_output`**Type**: `STRUCT`**Provider name**: `FeatureStoreOutput`**Description**: Configuration for processing job outputs in Amazon SageMaker Feature Store. This processing output type is only supported when `AppManaged` is specified.
    - `feature_group_name`**Type**: `STRING`**Provider name**: `FeatureGroupName`**Description**: The name of the Amazon SageMaker FeatureGroup to use as the destination for processing job output. Note that your processing script is responsible for putting records into your Feature Store.
  - `output_name`**Type**: `STRING`**Provider name**: `OutputName`**Description**: The name for the processing job output.
  - `s3_output`**Type**: `STRUCT`**Provider name**: `S3Output`**Description**: Configuration for processing job outputs in Amazon S3.
    - `local_path`**Type**: `STRING`**Provider name**: `LocalPath`**Description**: The local path of a directory whose contents you want Amazon SageMaker to upload to Amazon S3. `LocalPath` is an absolute path to a directory containing output files. This directory will be created by the platform and exist when your container's entrypoint is invoked.
    - `s3_upload_mode`**Type**: `STRING`**Provider name**: `S3UploadMode`**Description**: Whether to upload the results of the processing job continuously or after the job completes.
    - `s3_uri`**Type**: `STRING`**Provider name**: `S3Uri`**Description**: A URI that identifies the Amazon S3 bucket where you want Amazon SageMaker to save the results of a processing job.

## `processing_resources`{% #processing_resources %}

**Type**: `STRUCT`**Provider name**: `ProcessingResources`**Description**: Identifies the resources, ML compute instances, and ML storage volumes to deploy for a processing job. In distributed processing, you specify more than one instance.

- `cluster_config`**Type**: `STRUCT`**Provider name**: `ClusterConfig`**Description**: The configuration for the resources in a cluster used to run the processing job.
  - `instance_count`**Type**: `INT32`**Provider name**: `InstanceCount`**Description**: The number of ML compute instances to use in the processing job. For distributed processing jobs, specify a value greater than 1. The default value is 1.
  - `instance_type`**Type**: `STRING`**Provider name**: `InstanceType`**Description**: The ML compute instance type for the processing job.
  - `volume_kms_key_id`**Type**: `STRING`**Provider name**: `VolumeKmsKeyId`**Description**: The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the processing job. Certain Nitro-based instances include local storage, dependent on the instance type. Local storage volumes are encrypted using a hardware module on the instance. You can't request a `VolumeKmsKeyId` when using an instance type with local storage. For a list of instance types that support local instance storage, see [Instance Store Volumes](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#instance-store-volumes). For more information about local instance storage encryption, see [SSD Instance Store Volumes](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html).
  - `volume_size_in_gb`**Type**: `INT32`**Provider name**: `VolumeSizeInGB`**Description**: The size of the ML storage volume in gigabytes that you want to provision. You must specify sufficient ML storage for your scenario. Certain Nitro-based instances include local storage with a fixed total size, dependent on the instance type. When using these instances for processing, Amazon SageMaker mounts the local instance storage instead of Amazon EBS gp2 storage. You can't request a `VolumeSizeInGB` greater than the total size of the local instance storage. For a list of instance types that support local instance storage, including the total size per instance type, see [Instance Store Volumes](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#instance-store-volumes).

## `processing_start_time`{% #processing_start_time %}

**Type**: `TIMESTAMP`**Provider name**: `ProcessingStartTime`**Description**: The time at which the processing job started.

## `role_arn`{% #role_arn %}

**Type**: `STRING`**Provider name**: `RoleArn`**Description**: The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.

## `stopping_condition`{% #stopping_condition %}

**Type**: `STRUCT`**Provider name**: `StoppingCondition`**Description**: The time limit for how long the processing job is allowed to run.

- `max_runtime_in_seconds`**Type**: `INT32`**Provider name**: `MaxRuntimeInSeconds`**Description**: Specifies the maximum runtime in seconds.

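Combined with `processing_start_time` and `processing_end_time`, the `max_runtime_in_seconds` limit can indicate whether a job ran up to its time limit. The following is a minimal sketch, assuming a job record is available as a plain Python dictionary keyed by the field names on this page and that the `TIMESTAMP` fields have already been parsed into timezone-aware `datetime` objects. The helper names and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def runtime_seconds(job):
    """Return the elapsed runtime in seconds, or None if either timestamp is missing."""
    start = job.get("processing_start_time")
    end = job.get("processing_end_time")
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


def reached_limit(job):
    """Return True if the job ran for at least max_runtime_in_seconds."""
    limit = (job.get("stopping_condition") or {}).get("max_runtime_in_seconds")
    elapsed = runtime_seconds(job)
    if limit is None or elapsed is None:
        return False
    return elapsed >= limit


# Hypothetical sample record, not real data.
sample_job = {
    "processing_start_time": datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    "processing_end_time": datetime(2024, 1, 1, 1, 0, 0, tzinfo=timezone.utc),
    "stopping_condition": {"max_runtime_in_seconds": 3600},
}

print(runtime_seconds(sample_job), reached_limit(sample_job))
```

A job that reaches its limit is not necessarily a failed job; check `processing_job_status` and `failure_reason` before drawing conclusions.
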
## `tags`{% #tags %}

**Type**: `UNORDERED_LIST_STRING`

## `training_job_arn`{% #training_job_arn %}

**Type**: `STRING`**Provider name**: `TrainingJobArn`**Description**: The ARN of a training job associated with this processing job.
