SageMaker Data Quality Job Definition

A SageMaker Data Quality Job Definition in AWS defines the configuration for monitoring and evaluating the quality of data used in machine learning workflows. It specifies details such as the input data, baseline constraints, compute resources, and output location for reports. This helps detect issues like missing values, data drift, or anomalies, which in turn helps ensure that models are trained and evaluated on reliable data.

aws.sagemaker_data_quality_job_definition

Fields

| Title | ID | Type | Data Type | Description |
|---|---|---|---|---|
| | _key | core | string | |
| | account_id | core | string | |
| | creation_time | core | timestamp | The time that the data quality monitoring job definition was created. |
| | data_quality_app_specification | core | json | Information about the container that runs the data quality monitoring job. |
| | data_quality_baseline_config | core | json | The constraints and baselines for the data quality monitoring job definition. |
| | data_quality_job_input | core | json | The list of inputs for the data quality monitoring job. Currently, endpoints are supported. |
| | data_quality_job_output_config | core | json | The output configuration for monitoring jobs. |
| | job_definition_arn | core | string | The Amazon Resource Name (ARN) of the data quality monitoring job definition. |
| | job_definition_name | core | string | The name of the data quality monitoring job definition. |
| | job_resources | core | json | Identifies the resources to deploy for a monitoring job. |
| | network_config | core | json | The networking configuration for the data quality monitoring job. |
| | role_arn | core | string | The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker AI can assume to perform tasks on your behalf. |
| | stopping_condition | core | json | A time limit for how long the monitoring job is allowed to run before stopping. |
| | tags | core | hstore | |
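
The fields listed above appear to mirror the CamelCase keys of the AWS SageMaker DescribeDataQualityJobDefinition response (for example, JobDefinitionArn and job_definition_arn). The sketch below illustrates that correspondence under that assumption. The helpers camel_to_snake and to_record are hypothetical names, and every value in sample_response is made up for illustration. The tags and _key fields are left out because they are not assumed to come from that response.

```python
import re

# Field names taken from the table above. "tags" and "_key" are omitted
# because this sketch does not assume they are part of the API response.
KNOWN_FIELDS = {
    "creation_time",
    "data_quality_app_specification",
    "data_quality_baseline_config",
    "data_quality_job_input",
    "data_quality_job_output_config",
    "job_definition_arn",
    "job_definition_name",
    "job_resources",
    "network_config",
    "role_arn",
    "stopping_condition",
}


def camel_to_snake(name):
    """Convert a CamelCase key such as 'RoleArn' to 'role_arn'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def to_record(response, account_id):
    """Keep only the keys that map to a known field and add account_id."""
    record = {}
    for key, value in response.items():
        field = camel_to_snake(key)
        if field in KNOWN_FIELDS:
            record[field] = value
    record["account_id"] = account_id
    return record


# Made-up sample shaped like a Describe response; not real data.
sample_response = {
    "JobDefinitionArn": "arn:aws:sagemaker:us-east-1:123456789012:"
                        "data-quality-job-definition/example-dq-job",
    "JobDefinitionName": "example-dq-job",
    "DataQualityJobInput": {"EndpointInput": {"EndpointName": "example-endpoint"}},
    "JobResources": {"ClusterConfig": {"InstanceCount": 1}},
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    "RoleArn": "arn:aws:iam::123456789012:role/example-monitoring-role",
    "ResponseMetadata": {"RequestId": "example"},  # dropped: not a table field
}

record = to_record(sample_response, account_id="123456789012")
print(sorted(record))
```

In practice the response dictionary would come from an AWS SDK call rather than a literal. The filtering step keeps transport metadata such as ResponseMetadata out of the resulting record.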