SageMaker Processing Job

An Amazon SageMaker Processing Job is a managed resource for running data processing and model evaluation workloads at scale. It provides a fully managed environment in which containerized code can preprocess data, perform feature engineering, evaluate models, or run custom scripts. SageMaker provisions the compute instances you specify, runs the job on them, and stores the outputs in Amazon S3.
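As a rough illustration of the pieces described above (container image, compute instances, S3 inputs and outputs), the following sketch assembles a request in the shape accepted by the SageMaker `CreateProcessingJob` API. The helper name `build_processing_request`, the instance type, the volume size, the runtime limit, and the container paths are illustrative assumptions, not values taken from this page.

```python
from typing import Any, Dict


def build_processing_request(
    job_name: str,
    role_arn: str,
    image_uri: str,
    input_s3_uri: str,
    output_s3_uri: str,
) -> Dict[str, Any]:
    """Assemble a CreateProcessingJob-style request (hypothetical helper)."""
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        # Container image that holds the processing code.
        "AppSpecification": {"ImageUri": image_uri},
        # Compute resources; instance type and volume size are assumed values.
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 30,
            }
        },
        # Input data copied from S3 into the container (path is an assumption).
        "ProcessingInputs": [
            {
                "InputName": "input-1",
                "S3Input": {
                    "S3Uri": input_s3_uri,
                    "LocalPath": "/opt/ml/processing/input",
                    "S3DataType": "S3Prefix",
                    "S3InputMode": "File",
                },
            }
        ],
        # Results uploaded back to S3 when the job ends.
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "output-1",
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ]
        },
        # Upper bound on runtime; one hour is an assumed value.
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

With AWS credentials configured, a dictionary like this could be passed to `boto3.client("sagemaker").create_processing_job(**request)`. The sketch only builds the request and makes no AWS calls.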

aws.sagemaker_processing_job

Fields

| ID | Type | Data Type | Description |
| --- | --- | --- | --- |
| _key | core | string | |
| account_id | core | string | |
| app_specification | core | json | Configures the processing job to run a specified container image. |
| auto_ml_job_arn | core | string | The ARN of an AutoML job associated with this processing job. |
| creation_time | core | timestamp | The time at which the processing job was created. |
| environment | core | hstore | The environment variables set in the Docker container. |
| exit_message | core | string | An optional string, up to one KB in size, that contains metadata from the processing container when the processing job exits. |
| experiment_config | core | json | The configuration information used to create an experiment. |
| failure_reason | core | string | A string, up to one KB in size, that contains the reason a processing job failed, if it failed. |
| last_modified_time | core | timestamp | The time at which the processing job was last modified. |
| monitoring_schedule_arn | core | string | The ARN of a monitoring schedule for an endpoint associated with this processing job. |
| network_config | core | json | Networking options for a processing job. |
| processing_end_time | core | timestamp | The time at which the processing job completed. |
| processing_inputs | core | json | The inputs for a processing job. |
| processing_job_arn | core | string | The Amazon Resource Name (ARN) of the processing job. |
| processing_job_name | core | string | The name of the processing job. The name must be unique within an Amazon Web Services Region in the Amazon Web Services account. |
| processing_job_status | core | string | Provides the status of a processing job. |
| processing_output_config | core | json | Output configuration for the processing job. |
| processing_resources | core | json | Identifies the resources, ML compute instances, and ML storage volumes to deploy for a processing job. In distributed training, you specify more than one instance. |
| processing_start_time | core | timestamp | The time at which the processing job started. |
| role_arn | core | string | The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf. |
| stopping_condition | core | json | The time limit for how long the processing job is allowed to run. |
| tags | core | hstore | |
| training_job_arn | core | string | The ARN of a training job associated with this processing job. |
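
To show how the fields above might be used together, here is a minimal sketch that summarizes one record. It assumes the record has already been loaded as a Python dictionary keyed by the field IDs listed above and that timestamps arrive as ISO 8601 strings. Both are assumptions about the export format, and `parse_ts` and `summarize_job` are hypothetical helper names.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp string; return None when the field is empty."""
    if not value:
        return None
    # Accept a trailing "Z" as UTC (timestamp format is an assumption).
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_job(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a processing job record to name, status, duration, and failure reason."""
    start = parse_ts(record.get("processing_start_time"))
    end = parse_ts(record.get("processing_end_time"))
    # Duration is only known once the job has both started and ended.
    duration = (end - start).total_seconds() if start and end else None
    status = record.get("processing_job_status")
    return {
        "name": record.get("processing_job_name"),
        "status": status,
        "duration_seconds": duration,
        # Only surface failure_reason for jobs that actually failed.
        "failure_reason": record.get("failure_reason") if status == "Failed" else None,
    }
```

A job that is still running has no `processing_end_time`, so its `duration_seconds` is `None` rather than a misleading partial value.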