aws_sagemaker_inferencerecommendationjob

`account_id`

Type: STRING

`completion_time`

Type: TIMESTAMP
Provider name: CompletionTime
Description: A timestamp that shows when the job completed.

`creation_time`

Type: TIMESTAMP
Provider name: CreationTime
Description: A timestamp that shows when the job was created.

`endpoint_performances`

Type: UNORDERED_LIST_STRUCT
Provider name: EndpointPerformances
Description: The performance results from running an Inference Recommender job on an existing endpoint.

endpoint_info
Type: STRUCT
Provider name: EndpointInfo
- endpoint_name
  Type: STRING
  Provider name: EndpointName
  Description: The name of a customer’s endpoint.
metrics
Type: STRUCT
Provider name: Metrics
Description: The metrics for an existing endpoint.
- max_invocations
  Type: INT32
  Provider name: MaxInvocations
  Description: The expected maximum number of requests per minute for the instance.
- model_latency
  Type: INT32
  Provider name: ModelLatency
  Description: The expected model latency at maximum invocations per minute for the instance.
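
As an illustration of how the nested `endpoint_performances` entries fit together, the following is a minimal sketch that picks the endpoint with the highest `max_invocations`. It assumes each entry has already been loaded as a plain Python dict keyed by the field names listed above; the function name `busiest_endpoint` and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def busiest_endpoint(endpoint_performances: List[Dict]) -> Optional[str]:
    """Return the endpoint_name with the highest metrics.max_invocations, or None."""
    best_name = None
    best_value = -1
    for entry in endpoint_performances:
        # Either struct may be missing or null in a given record.
        metrics = entry.get("metrics") or {}
        info = entry.get("endpoint_info") or {}
        value = metrics.get("max_invocations")
        if value is not None and value > best_value:
            best_value = value
            best_name = info.get("endpoint_name")
    return best_name


# Hypothetical sample data shaped like the fields documented above.
sample = [
    {"endpoint_info": {"endpoint_name": "endpoint-a"},
     "metrics": {"max_invocations": 100, "model_latency": 20}},
    {"endpoint_info": {"endpoint_name": "endpoint-b"},
     "metrics": {"max_invocations": 250, "model_latency": 35}},
]
print(busiest_endpoint(sample))
```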

`failure_reason`

Type: STRING
Provider name: FailureReason
Description: If the job fails, the reason why it failed.

`inference_recommendations`

Type: UNORDERED_LIST_STRUCT
Provider name: InferenceRecommendations
Description: The recommendations made by Inference Recommender.

endpoint_configuration
Type: STRUCT
Provider name: EndpointConfiguration
Description: Defines the endpoint configuration parameters.
- endpoint_name
  Type: STRING
  Provider name: EndpointName
  Description: The name of the endpoint made during a recommendation job.
- initial_instance_count
  Type: INT32
  Provider name: InitialInstanceCount
  Description: The number of instances recommended to launch initially.
- instance_type
  Type: STRING
  Provider name: InstanceType
  Description: The instance type recommended by Amazon SageMaker Inference Recommender.
- serverless_config
  Type: STRUCT
  Provider name: ServerlessConfig
  - max_concurrency
    Type: INT32
    Provider name: MaxConcurrency
    Description: The maximum number of concurrent invocations your serverless endpoint can process.
  - memory_size_in_mb
    Type: INT32
    Provider name: MemorySizeInMB
    Description: The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
  - provisioned_concurrency
    Type: INT32
    Provider name: ProvisionedConcurrency
    Description: The amount of provisioned concurrency to allocate for the serverless endpoint. Should be less than or equal to MaxConcurrency. This field is not supported for serverless endpoint recommendations for Inference Recommender jobs. For more information about creating an Inference Recommender job, see CreateInferenceRecommendationsJobs.
- variant_name
  Type: STRING
  Provider name: VariantName
  Description: The name of the production variant (deployed model) made during a recommendation job.
invocation_end_time
Type: TIMESTAMP
Provider name: InvocationEndTime
Description: A timestamp that shows when the benchmark completed.
invocation_start_time
Type: TIMESTAMP
Provider name: InvocationStartTime
Description: A timestamp that shows when the benchmark started.
metrics
Type: STRUCT
Provider name: Metrics
Description: The metrics used to decide what recommendation to make.
- cost_per_hour
  Type: FLOAT
  Provider name: CostPerHour
  Description: Defines the cost per hour for the instance.
- cost_per_inference
  Type: FLOAT
  Provider name: CostPerInference
  Description: Defines the cost per inference for the instance.
- cpu_utilization
  Type: FLOAT
  Provider name: CpuUtilization
  Description: The expected CPU utilization at maximum invocations per minute for the instance. NaN indicates that the value is not available.
- max_invocations
  Type: INT32
  Provider name: MaxInvocations
  Description: The expected maximum number of requests per minute for the instance.
- memory_utilization
  Type: FLOAT
  Provider name: MemoryUtilization
  Description: The expected memory utilization at maximum invocations per minute for the instance. NaN indicates that the value is not available.
- model_latency
  Type: INT32
  Provider name: ModelLatency
  Description: The expected model latency at maximum invocations per minute for the instance.
- model_setup_time
  Type: INT32
  Provider name: ModelSetupTime
  Description: The time it takes to launch new compute resources for a serverless endpoint. The time can vary depending on the model size, how long it takes to download the model, and the start-up time of the container. NaN indicates that the value is not available.
model_configuration
Type: STRUCT
Provider name: ModelConfiguration
Description: Defines the model configuration.
- compilation_job_name
  Type: STRING
  Provider name: CompilationJobName
  Description: The name of the compilation job used to create the recommended model artifacts.
- environment_parameters
  Type: UNORDERED_LIST_STRUCT
  Provider name: EnvironmentParameters
  Description: Defines the environment parameters, including keys, value types, and values.
  - key
    Type: STRING
    Provider name: Key
    Description: The environment key suggested by the Amazon SageMaker Inference Recommender.
  - value
    Type: STRING
    Provider name: Value
    Description: The value suggested by the Amazon SageMaker Inference Recommender.
  - value_type
    Type: STRING
    Provider name: ValueType
    Description: The value type suggested by the Amazon SageMaker Inference Recommender.
- inference_specification_name
  Type: STRING
  Provider name: InferenceSpecificationName
  Description: The inference specification name in the model package version.
recommendation_id
Type: STRING
Provider name: RecommendationId
Description: The recommendation ID which uniquely identifies each recommendation.
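
One way to use the `inference_recommendations` list is to pick the cheapest recommendation that stays within a latency budget. The sketch below assumes the list has been loaded as plain Python dicts keyed by the field names above; the function name `cheapest_recommendation`, the budget parameter, and the sample values are hypothetical. Several metrics are documented as possibly NaN, so the sketch defensively skips NaN values for the two metrics it reads.

```python
import math
from typing import Dict, List, Optional


def cheapest_recommendation(recommendations: List[Dict],
                            max_model_latency: float) -> Optional[str]:
    """Return the recommendation_id with the lowest cost_per_inference
    among entries whose model_latency is within max_model_latency."""
    candidates = []
    for rec in recommendations:
        metrics = rec.get("metrics") or {}
        cost = metrics.get("cost_per_inference")
        latency = metrics.get("model_latency")
        # Skip entries where either metric is missing or not available.
        if cost is None or latency is None:
            continue
        if math.isnan(cost) or math.isnan(latency):
            continue
        if latency <= max_model_latency:
            candidates.append((cost, rec.get("recommendation_id")))
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


# Hypothetical sample data shaped like the fields documented above.
sample = [
    {"recommendation_id": "rec-1",
     "metrics": {"cost_per_inference": 0.002, "model_latency": 40}},
    {"recommendation_id": "rec-2",
     "metrics": {"cost_per_inference": 0.001, "model_latency": 90}},
]
print(cheapest_recommendation(sample, max_model_latency=50))
```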

`input_config`

Type: STRUCT
Provider name: InputConfig
Description: Returns information about the versioned model package Amazon Resource Name (ARN), the traffic pattern, and endpoint configurations you provided when you initiated the job.

container_config
Type: STRUCT
Provider name: ContainerConfig
Description: Specifies mandatory fields for running an Inference Recommender job. The fields specified in ContainerConfig override the corresponding fields in the model package.
- data_input_config
  Type: STRING
  Provider name: DataInputConfig
  Description: Specifies the name and shape of the expected data inputs for your trained model in JSON dictionary form. This field is used for optimizing your model using SageMaker Neo. For more information, see DataInputConfig.
- domain
  Type: STRING
  Provider name: Domain
  Description: The machine learning domain of the model and its components. Valid Values: COMPUTER_VISION | NATURAL_LANGUAGE_PROCESSING | MACHINE_LEARNING
- framework
  Type: STRING
  Provider name: Framework
  Description: The machine learning framework of the container image. Valid Values: TENSORFLOW | PYTORCH | XGBOOST | SAGEMAKER-SCIKIT-LEARN
- framework_version
  Type: STRING
  Provider name: FrameworkVersion
  Description: The framework version of the container image.
- nearest_model_name
  Type: STRING
  Provider name: NearestModelName
  Description: The name of a pre-trained machine learning model benchmarked by Amazon SageMaker Inference Recommender that matches your model. Valid Values: efficientnetb7 | unet | xgboost | faster-rcnn-resnet101 | nasnetlarge | vgg16 | inception-v3 | mask-rcnn | sagemaker-scikit-learn | densenet201-gluon | resnet18v2-gluon | xception | densenet201 | yolov4 | resnet152 | bert-base-cased | xceptionV1-keras | resnet50 | retinanet
- payload_config
  Type: STRUCT
  Provider name: PayloadConfig
  Description: Specifies the SamplePayloadUrl and all other sample payload-related fields.
  - sample_payload_url
    Type: STRING
    Provider name: SamplePayloadUrl
    Description: The Amazon Simple Storage Service (Amazon S3) path where the sample payload is stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix).
  - supported_content_types
    Type: UNORDERED_LIST_STRING
    Provider name: SupportedContentTypes
    Description: The supported MIME types for the input data.
- supported_endpoint_type
  Type: STRING
  Provider name: SupportedEndpointType
  Description: The endpoint type to receive recommendations for. By default this is null, and the results of the inference recommendation job return a combined list of both real-time and serverless benchmarks. By specifying a value for this field, you can receive a longer list of benchmarks for the desired endpoint type.
- supported_instance_types
  Type: UNORDERED_LIST_STRING
  Provider name: SupportedInstanceTypes
  Description: A list of the instance types that are used to generate inferences in real time.
- supported_response_mime_types
  Type: UNORDERED_LIST_STRING
  Provider name: SupportedResponseMIMETypes
  Description: The supported MIME types for the output data.
- task
  Type: STRING
  Provider name: Task
  Description: The machine learning task that the model accomplishes. Valid Values: IMAGE_CLASSIFICATION | OBJECT_DETECTION | TEXT_GENERATION | IMAGE_SEGMENTATION | FILL_MASK | CLASSIFICATION | REGRESSION | OTHER
endpoint_configurations
Type: UNORDERED_LIST_STRUCT
Provider name: EndpointConfigurations
Description: Specifies the endpoint configuration to use for a job.
- environment_parameter_ranges
  Type: STRUCT
  Provider name: EnvironmentParameterRanges
  Description: The parameter you want to benchmark against.
  - categorical_parameter_ranges
    Type: UNORDERED_LIST_STRUCT
    Provider name: CategoricalParameterRanges
    Description: Specifies a list of parameters for each category.
    - name
      Type: STRING
      Provider name: Name
      Description: The name of the environment variable.
    - value
      Type: UNORDERED_LIST_STRING
      Provider name: Value
      Description: The list of values you can pass.
- inference_specification_name
  Type: STRING
  Provider name: InferenceSpecificationName
  Description: The inference specification name in the model package version.
- instance_type
  Type: STRING
  Provider name: InstanceType
  Description: The instance types to use for the load test.
- serverless_config
  Type: STRUCT
  Provider name: ServerlessConfig
  - max_concurrency
    Type: INT32
    Provider name: MaxConcurrency
    Description: The maximum number of concurrent invocations your serverless endpoint can process.
  - memory_size_in_mb
    Type: INT32
    Provider name: MemorySizeInMB
    Description: The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
  - provisioned_concurrency
    Type: INT32
    Provider name: ProvisionedConcurrency
    Description: The amount of provisioned concurrency to allocate for the serverless endpoint. Should be less than or equal to MaxConcurrency. This field is not supported for serverless endpoint recommendations for Inference Recommender jobs. For more information about creating an Inference Recommender job, see CreateInferenceRecommendationsJobs.
endpoints
Type: UNORDERED_LIST_STRUCT
Provider name: Endpoints
Description: Existing customer endpoints on which to run an Inference Recommender job.
- endpoint_name
  Type: STRING
  Provider name: EndpointName
  Description: The name of a customer’s endpoint.
job_duration_in_seconds
Type: INT32
Provider name: JobDurationInSeconds
Description: Specifies the maximum duration of the job, in seconds. The maximum value is 18,000 seconds.
model_name
Type: STRING
Provider name: ModelName
Description: The name of the created model.
model_package_version_arn
Type: STRING
Provider name: ModelPackageVersionArn
Description: The Amazon Resource Name (ARN) of a versioned model package.
resource_limit
Type: STRUCT
Provider name: ResourceLimit
Description: Defines the resource limit of the job.
- max_number_of_tests
  Type: INT32
  Provider name: MaxNumberOfTests
  Description: Defines the maximum number of load tests.
- max_parallel_of_tests
  Type: INT32
  Provider name: MaxParallelOfTests
  Description: Defines the maximum number of parallel load tests.
traffic_pattern
Type: STRUCT
Provider name: TrafficPattern
Description: Specifies the traffic pattern of the job.
- phases
  Type: UNORDERED_LIST_STRUCT
  Provider name: Phases
  Description: Defines the traffic specification for each phase.
  - duration_in_seconds
    Type: INT32
    Provider name: DurationInSeconds
    Description: Specifies how long a traffic phase should be. For custom load tests, the value should be between 120 and 3600. This value should not exceed JobDurationInSeconds.
  - initial_number_of_users
    Type: INT32
    Provider name: InitialNumberOfUsers
    Description: Specifies how many concurrent users to start with. The value should be between 1 and 3.
  - spawn_rate
    Type: INT32
    Provider name: SpawnRate
    Description: Specifies how many new users to spawn in a minute.
- stairs
  Type: STRUCT
  Provider name: Stairs
  Description: Defines the stairs traffic pattern.
  - duration_in_seconds
    Type: INT32
    Provider name: DurationInSeconds
    Description: Defines how long each traffic step should be.
  - number_of_steps
    Type: INT32
    Provider name: NumberOfSteps
    Description: Specifies how many steps to perform during traffic.
  - users_per_step
    Type: INT32
    Provider name: UsersPerStep
    Description: Specifies how many new users to spawn in each step.
- traffic_type
  Type: STRING
  Provider name: TrafficType
  Description: Defines the traffic patterns. Choose either PHASES or STAIRS.
volume_kms_key_id
Type: STRING
Provider name: VolumeKmsKeyId
Description: The Amazon Resource Name (ARN) of an Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint. This key is passed to SageMaker Hosting for endpoint creation. The SageMaker execution role must have the kms:CreateGrant permission to encrypt data on the storage volume of the endpoints created for inference recommendation. If the role does not have the kms:CreateGrant permission, the inference recommendation job fails asynchronously during endpoint configuration creation. The KmsKeyId can be in any of the following formats:
- KMS Key ID: "1234abcd-12ab-34cd-56ef-1234567890ab"
- Amazon Resource Name (ARN) of a KMS Key: "arn:aws:kms:<region>:<account>:key/<key-id-12ab-34cd-56ef-1234567890ab>"
- KMS Key Alias: "alias/ExampleAlias"
- Amazon Resource Name (ARN) of a KMS Key Alias: "arn:aws:kms:<region>:<account>:alias/<ExampleAlias>"
For more information about key identifiers, see Key identifiers (KeyID) in the Amazon Web Services Key Management Service (Amazon Web Services KMS) documentation.
vpc_config
Type: STRUCT
Provider name: VpcConfig
Description: Inference Recommender provisions SageMaker endpoints with access to the VPC in the inference recommendation job.
- security_group_ids
  Type: UNORDERED_LIST_STRING
  Provider name: SecurityGroupIds
  Description: The VPC security group IDs. IDs have the form of sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
- subnets
  Type: UNORDERED_LIST_STRING
  Provider name: Subnets
  Description: The IDs of the subnets in the VPC to which you want to connect your model.
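
Several `input_config` descriptions state numeric limits (job duration, phase duration, initial users, serverless memory sizes, provisioned versus maximum concurrency). The following is a minimal sketch of checking a record against those stated limits. It assumes the record is a plain Python dict keyed by the field names above; the function name `check_input_config` and the message wording are hypothetical, and the phase duration range is the one the description gives for custom load tests.

```python
from typing import Dict, List

# Limits copied from the field descriptions above.
VALID_MEMORY_SIZES_MB = {1024, 2048, 3072, 4096, 5120, 6144}
MAX_JOB_DURATION_SECONDS = 18_000


def check_input_config(input_config: Dict) -> List[str]:
    """Return a list of problems found in an input_config record."""
    problems = []

    job_duration = input_config.get("job_duration_in_seconds")
    if job_duration is not None and job_duration > MAX_JOB_DURATION_SECONDS:
        problems.append("job_duration_in_seconds exceeds 18000")

    traffic = input_config.get("traffic_pattern") or {}
    traffic_type = traffic.get("traffic_type")
    if traffic_type is not None and traffic_type not in ("PHASES", "STAIRS"):
        problems.append(f"unknown traffic_type: {traffic_type}")

    for i, phase in enumerate(traffic.get("phases") or []):
        duration = phase.get("duration_in_seconds")
        if duration is not None:
            # Range documented for custom load tests.
            if not 120 <= duration <= 3600:
                problems.append(f"phases[{i}].duration_in_seconds outside 120-3600")
            if job_duration is not None and duration > job_duration:
                problems.append(f"phases[{i}].duration_in_seconds exceeds job duration")
        users = phase.get("initial_number_of_users")
        if users is not None and not 1 <= users <= 3:
            problems.append(f"phases[{i}].initial_number_of_users outside 1-3")

    for i, endpoint in enumerate(input_config.get("endpoint_configurations") or []):
        serverless = endpoint.get("serverless_config") or {}
        memory = serverless.get("memory_size_in_mb")
        if memory is not None and memory not in VALID_MEMORY_SIZES_MB:
            problems.append(f"endpoint_configurations[{i}].memory_size_in_mb is not a valid size")
        provisioned = serverless.get("provisioned_concurrency")
        maximum = serverless.get("max_concurrency")
        if provisioned is not None and maximum is not None and provisioned > maximum:
            problems.append(
                f"endpoint_configurations[{i}].provisioned_concurrency exceeds max_concurrency"
            )

    return problems
```

An empty list means none of the checked limits were violated; the sketch does not cover fields whose descriptions state no limit.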

`job_arn`

Type: STRING
Provider name: JobArn
Description: The Amazon Resource Name (ARN) of the job.

`job_description`

Type: STRING
Provider name: JobDescription
Description: The job description that you provided when you initiated the job.

`job_name`

Type: STRING
Provider name: JobName
Description: The name of the job. The name must be unique within an Amazon Web Services Region in the Amazon Web Services account.

`job_type`

Type: STRING
Provider name: JobType
Description: The job type that you provided when you initiated the job.

`last_modified_time`

Type: TIMESTAMP
Provider name: LastModifiedTime
Description: A timestamp that shows when the job was last modified.

`role_arn`

Type: STRING
Provider name: RoleArn
Description: The Amazon Resource Name (ARN) of the Amazon Web Services Identity and Access Management (IAM) role you provided when you initiated the job.

`status`

Type: STRING
Provider name: Status
Description: The status of the job.

`stopping_conditions`

Type: STRUCT
Provider name: StoppingConditions
Description: The stopping conditions that you provided when you initiated the job.

flat_invocations
Type: STRING
Provider name: FlatInvocations
Description: Stops a load test when the number of invocations (TPS) peaks and flattens, which means that the instance has reached capacity. The default value is Stop. If you want the load test to continue after invocations have flattened, set the value to Continue.
max_invocations
Type: INT32
Provider name: MaxInvocations
Description: The maximum number of requests per minute expected for the endpoint.
model_latency_thresholds
Type: UNORDERED_LIST_STRUCT
Provider name: ModelLatencyThresholds
Description: The interval of time taken by a model to respond as viewed from SageMaker. The interval includes the local communication time taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container.
- percentile
  Type: STRING
  Provider name: Percentile
  Description: The model latency percentile threshold. Acceptable values are P95 and P99. For custom load tests, specify the value as P95.
- value_in_milliseconds
  Type: INT32
  Provider name: ValueInMilliseconds
  Description: The model latency percentile value in milliseconds.
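
To show how the `stopping_conditions` fields might be read together, here is a minimal sketch, assuming the record is a plain Python dict keyed by the field names above. The helper names `latency_threshold_ms` and `continues_after_flattening` and the sample values are hypothetical.

```python
from typing import Dict, Optional


def latency_threshold_ms(stopping_conditions: Dict, percentile: str) -> Optional[int]:
    """Return value_in_milliseconds for the given percentile (for example "P95"), or None."""
    for threshold in stopping_conditions.get("model_latency_thresholds") or []:
        if threshold.get("percentile") == percentile:
            return threshold.get("value_in_milliseconds")
    return None


def continues_after_flattening(stopping_conditions: Dict) -> bool:
    """True only when flat_invocations is explicitly set to Continue."""
    # The description gives Stop as the default, so a missing value counts as Stop.
    return stopping_conditions.get("flat_invocations", "Stop") == "Continue"


# Hypothetical sample data shaped like the fields documented above.
sample = {
    "flat_invocations": "Continue",
    "max_invocations": 300,
    "model_latency_thresholds": [
        {"percentile": "P95", "value_in_milliseconds": 100},
    ],
}
print(latency_threshold_ms(sample, "P95"), continues_after_flattening(sample))
```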

`tags`

Type: UNORDERED_LIST_STRING