Spa

The SPA (Spark Pod Autosizing) API provides resource recommendations and cost insights to help optimize Spark job configurations.

Note: This endpoint is in public beta and may change in the future. It is not yet recommended for production use.

GET https://api.ap1.datadoghq.com/api/v2/spa/recommendations/{service}/{shard}
GET https://api.ap2.datadoghq.com/api/v2/spa/recommendations/{service}/{shard}
GET https://api.datadoghq.eu/api/v2/spa/recommendations/{service}/{shard}
GET https://api.ddog-gov.com/api/v2/spa/recommendations/{service}/{shard}
GET https://api.datadoghq.com/api/v2/spa/recommendations/{service}/{shard}
GET https://api.us3.datadoghq.com/api/v2/spa/recommendations/{service}/{shard}
GET https://api.us5.datadoghq.com/api/v2/spa/recommendations/{service}/{shard}

Overview

Retrieve resource recommendations for a Spark job. The caller (Spark Gateway or DJM UI) provides a service name and shard identifier, and SPA returns structured recommendations for driver and executor resources.

Arguments

Path Parameters

| Name | Type | Description |
|---|---|---|
| service [required] | string | The service name for a Spark job. |
| shard [required] | string | The shard tag for a Spark job, which differentiates jobs within the same service that have different resource needs. |
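
As a rough illustration of how the two path parameters fit into the request, the following Python sketch builds (but does not send) the GET request using only the standard library. The function name `build_recommendation_request` and the default host are assumptions made for this example; the path and header names are taken from this page. Percent-encoding the path segments is a precaution on the assumption that service or shard names might contain reserved characters.

```python
from urllib.parse import quote
from urllib.request import Request


def build_recommendation_request(service, shard, api_key, app_key,
                                 host="api.datadoghq.com"):
    """Build the GET request for SPA recommendations (hypothetical helper).

    `host` should be the API host for your Datadog site, for example
    api.datadoghq.eu or api.us5.datadoghq.com.
    """
    # Percent-encode each segment so characters such as "/" or spaces in a
    # service or shard name cannot change the shape of the URL path.
    path = "/api/v2/spa/recommendations/{}/{}".format(
        quote(service, safe=""), quote(shard, safe="")
    )
    return Request(
        "https://" + host + path,
        headers={
            "Accept": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="GET",
    )


# Placeholder values; real keys would normally come from the environment.
request = build_recommendation_request(
    "my-spark-service", "nightly batch", "API_KEY", "APP_KEY"
)
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request; that step is left out here so the sketch runs without network access or credentials.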

Response

OK

| Field | Type | Description |
|---|---|---|
| data [required] | object | JSON:API resource object for SPA Recommendation. Includes type, optional ID, and resource attributes with structured recommendations. |
| data.attributes [required] | object | Attributes of the SPA Recommendation resource. Contains recommendations for both driver and executor components. |
| data.attributes.driver [required] | object | Resource recommendation for a single Spark component (driver or executor). Contains estimation data used to patch Spark job specs. |
| data.attributes.driver.estimation [required] | object | Recommended resource values for a Spark driver or executor, derived from recent real usage metrics. Used by SPA to propose more efficient pod sizing. |
| data.attributes.driver.estimation.cpu | object | CPU usage statistics derived from historical Spark job metrics. Provides multiple estimates so users can choose between conservative and cost-saving risk profiles. |
| data.attributes.driver.estimation.cpu.max | int64 | Maximum CPU usage observed for the job, expressed in millicores. This represents the upper bound of usage. |
| data.attributes.driver.estimation.cpu.p75 | int64 | 75th percentile of CPU usage (millicores). Represents a cost-saving configuration while covering most workloads. |
| data.attributes.driver.estimation.cpu.p95 | int64 | 95th percentile of CPU usage (millicores). Balances performance and cost, providing a safer margin than p75. |
| data.attributes.driver.estimation.ephemeral_storage | int64 | Recommended ephemeral storage allocation (in MiB). Derived from job temporary storage patterns. |
| data.attributes.driver.estimation.heap | int64 | Recommended JVM heap size (in MiB). |
| data.attributes.driver.estimation.memory | int64 | Recommended total memory allocation (in MiB). Includes both heap and overhead. |
| data.attributes.driver.estimation.overhead | int64 | Recommended JVM overhead (in MiB). Computed as total memory - heap. |
| data.attributes.executor [required] | object | Resource recommendation for a single Spark component (driver or executor). Contains estimation data used to patch Spark job specs. |
| data.attributes.executor.estimation [required] | object | Recommended resource values for a Spark driver or executor, derived from recent real usage metrics. Used by SPA to propose more efficient pod sizing. |
| data.attributes.executor.estimation.cpu | object | CPU usage statistics derived from historical Spark job metrics. Provides multiple estimates so users can choose between conservative and cost-saving risk profiles. |
| data.attributes.executor.estimation.cpu.max | int64 | Maximum CPU usage observed for the job, expressed in millicores. This represents the upper bound of usage. |
| data.attributes.executor.estimation.cpu.p75 | int64 | 75th percentile of CPU usage (millicores). Represents a cost-saving configuration while covering most workloads. |
| data.attributes.executor.estimation.cpu.p95 | int64 | 95th percentile of CPU usage (millicores). Balances performance and cost, providing a safer margin than p75. |
| data.attributes.executor.estimation.ephemeral_storage | int64 | Recommended ephemeral storage allocation (in MiB). Derived from job temporary storage patterns. |
| data.attributes.executor.estimation.heap | int64 | Recommended JVM heap size (in MiB). |
| data.attributes.executor.estimation.memory | int64 | Recommended total memory allocation (in MiB). Includes both heap and overhead. |
| data.attributes.executor.estimation.overhead | int64 | Recommended JVM overhead (in MiB). Computed as total memory - heap. |
| data.id | string | Resource identifier for the recommendation. Optional in responses. |
| data.type [required] | enum | JSON:API resource type for Spark Pod Autosizing recommendations. Identifies the Recommendation resource returned by SPA. Allowed enum values: recommendation. Default: recommendation. |

{
  "data": {
    "attributes": {
      "driver": {
        "estimation": {
          "cpu": {
            "max": "integer",
            "p75": "integer",
            "p95": "integer"
          },
          "ephemeral_storage": "integer",
          "heap": "integer",
          "memory": "integer",
          "overhead": "integer"
        }
      },
      "executor": {
        "estimation": {
          "cpu": {
            "max": "integer",
            "p75": "integer",
            "p95": "integer"
          },
          "ephemeral_storage": "integer",
          "heap": "integer",
          "memory": "integer",
          "overhead": "integer"
        }
      }
    },
    "id": "string",
    "type": "recommendation"
  }
}

JSON:API document containing a single Recommendation resource. Returned by SPA when the Spark Gateway requests recommendations.
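
To make the response shape more concrete, here is a minimal Python sketch that reads a recommendation document and turns one estimation into pod-style resource strings. Everything beyond the field names is an assumption for illustration: the sample numbers are invented, the profile names (`cost_saving`, `balanced`, `conservative`) and their mapping to `p75`, `p95`, and `max` are one possible reading of the field descriptions, and the Kubernetes-style `m` and `Mi` suffixes are a formatting choice rather than something SPA is documented to emit.

```python
import json

# Hypothetical response body; the numbers are illustrative, not real output.
SAMPLE_BODY = """
{
  "data": {
    "type": "recommendation",
    "id": "example-id",
    "attributes": {
      "driver": {"estimation": {
        "cpu": {"max": 1800, "p75": 900, "p95": 1400},
        "ephemeral_storage": 2048, "heap": 3072,
        "memory": 4096, "overhead": 1024}},
      "executor": {"estimation": {
        "cpu": {"max": 3900, "p75": 2500, "p95": 3400},
        "ephemeral_storage": 10240, "heap": 6144,
        "memory": 8192, "overhead": 2048}}
    }
  }
}
"""

# Assumed mapping from a risk profile name to a CPU statistic.
CPU_FIELD_BY_PROFILE = {
    "cost_saving": "p75",
    "balanced": "p95",
    "conservative": "max",
}


def to_pod_resources(estimation, profile="balanced"):
    """Convert an estimation object into resource quantity strings.

    Fields inside `estimation` are not marked required, so any that are
    missing are simply left out of the result.
    """
    resources = {}
    cpu_stats = estimation.get("cpu") or {}
    millicores = cpu_stats.get(CPU_FIELD_BY_PROFILE[profile])
    if millicores is not None:
        resources["cpu"] = "{}m".format(millicores)  # millicores
    for field in ("memory", "ephemeral_storage"):
        mib = estimation.get(field)
        if mib is not None:
            resources[field] = "{}Mi".format(mib)  # MiB
    return resources


def overhead_is_consistent(estimation):
    """Check the documented relation: overhead = total memory - heap."""
    values = [estimation.get(k) for k in ("memory", "heap", "overhead")]
    if any(v is None for v in values):
        return True  # not enough data to check
    memory, heap, overhead = values
    return memory - heap == overhead


attributes = json.loads(SAMPLE_BODY)["data"]["attributes"]
driver = attributes["driver"]["estimation"]
executor = attributes["executor"]["estimation"]

print(to_pod_resources(driver, "cost_saving"))
print(to_pod_resources(executor, "conservative"))
print(overhead_is_consistent(driver), overhead_is_consistent(executor))
```

With the sample values above, the driver under `cost_saving` resolves to 900m CPU and 4096Mi memory, while the executor under `conservative` resolves to 3900m CPU and 8192Mi memory.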


Bad Request

API error response.

| Field | Type | Description |
|---|---|---|
| errors [required] | [string] | A list of errors. |

{
  "errors": [
    "Bad Request"
  ]
}

Not Authorized

API error response.

| Field | Type | Description |
|---|---|---|
| errors [required] | [string] | A list of errors. |

{
  "errors": [
    "Bad Request"
  ]
}

Too many requests

API error response.

| Field | Type | Description |
|---|---|---|
| errors [required] | [string] | A list of errors. |

{
  "errors": [
    "Bad Request"
  ]
}
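
All three error responses share the same body shape, so a client can handle them with one small parser. The Python sketch below pulls the messages out of an error body and, separately, computes a list of retry delays that a caller might use after a "Too many requests" response. The helper names and the exponential backoff policy are assumptions for illustration; this page does not document a retry policy or any rate-limit headers.

```python
import json


def extract_errors(body):
    """Return the messages in an API error response body, or [] if absent."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return []  # body was not valid JSON
    errors = payload.get("errors") if isinstance(payload, dict) else None
    if not isinstance(errors, list):
        return []
    return [str(message) for message in errors]


def retry_delays(attempts=4, base=1.0, cap=30.0):
    """Exponential backoff delays in seconds (assumed policy, not documented)."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]


print(extract_errors('{"errors": ["Bad Request"]}'))
print(retry_delays())
```

A caller could sleep for each delay in turn between retries and give up once the list is exhausted.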

Code Example

# Path parameters
export shard="CHANGE_ME"
export service="CHANGE_ME"
# Curl command (replace api.datadoghq.com with the API host for your Datadog site)
curl -X GET "https://api.datadoghq.com/api/v2/spa/recommendations/${service}/${shard}" \
  -H "Accept: application/json" \
  -H "DD-API-KEY: ${DD_API_KEY}" \
  -H "DD-APPLICATION-KEY: ${DD_APP_KEY}"