---
title: 'Data Observability: Jobs Monitoring for Spark on Kubernetes'
description: >-
  Set up Data Observability: Jobs Monitoring for Apache Spark applications on
  Kubernetes clusters using the Datadog Agent and admission controller.
breadcrumbs: >-
  Docs > Data Observability Overview > Data Observability: Jobs Monitoring >
  Data Observability: Jobs Monitoring for Spark on Kubernetes
---

# Data Observability: Jobs Monitoring for Spark on Kubernetes

{% callout %}
# Important note for users on the following Datadog site: app.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site).
{% /alert %}

{% /callout %}

[Data Observability: Jobs Monitoring](https://docs.datadoghq.com/data_jobs) gives visibility into the performance and reliability of Apache Spark applications on Kubernetes.

## Setup{% #setup %}

{% alert level="info" %}
Data Observability: Jobs Monitoring requires Datadog Agent version 7.55.0 or later, and Java tracer version 1.38.0 or later.
{% /alert %}

Follow these steps to enable Data Observability: Jobs Monitoring for Spark on Kubernetes.

1. Install the Datadog Agent on your Kubernetes cluster.
1. Inject Spark instrumentation.

### Install the Datadog Agent on your Kubernetes cluster{% #install-the-datadog-agent-on-your-kubernetes-cluster %}

If you have already [installed the Datadog Agent on your Kubernetes cluster](https://docs.datadoghq.com/containers/kubernetes/installation/?tab=operator), ensure that you have enabled the [Datadog Admission Controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller/?tab=operator). You can then go to the next step, Inject Spark instrumentation.
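To confirm that the Admission Controller is active, you can check for its mutating webhook. This is a sketch: the webhook's exact name varies by Agent version and install method.

```shell
# Look for the Datadog Admission Controller's mutating webhook
# (the exact name varies by version; "datadog-webhook" is typical)
kubectl get mutatingwebhookconfigurations | grep -i datadog
```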

You can install the Datadog Agent using the [Datadog Operator](https://docs.datadoghq.com/containers/datadog_operator) or [Helm](https://helm.sh).

{% tab title="Datadog Operator" %}
### Prerequisites{% #prerequisites %}

- Kubernetes cluster version v1.20.X+
- [`Helm`](https://helm.sh)
- The [`kubectl` CLI](https://kubernetes.io/docs/tasks/tools/install-kubectl/)

### Installation{% #installation %}

1. Install the Datadog Operator by running the following commands:

   ```shell
   helm repo add datadog https://helm.datadoghq.com
   helm install my-datadog-operator datadog/datadog-operator
   ```

1. Create a [Kubernetes Secret](https://kubernetes.io/docs/concepts/configuration/secret/) to store your Datadog API key.

   ```shell
   kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY>
   ```

Replace `<DATADOG_API_KEY>` with your [Datadog API key](https://app.datadoghq.com/organization-settings/api-keys).

1. Create a file, `datadog-agent.yaml`, that contains the following configuration:

   ```yaml
   kind: DatadogAgent
   apiVersion: datadoghq.com/v2alpha1
   metadata:
     name: datadog
   spec:
     features:
       apm:
         enabled: true
         hostPortConfig:
           enabled: true
           hostPort: 8126
       admissionController:
         enabled: true
         mutateUnlabelled: false
       # (Optional) Uncomment the next three lines to enable logs collection
       # logCollection:
         # enabled: true
         # containerCollectAll: true
     global:
       site: <DATADOG_SITE>
       credentials:
         apiSecret:
           secretName: datadog-secret
           keyName: api-key
     override:
       nodeAgent:
         image:
           tag: <DATADOG_AGENT_VERSION>
   ```

Replace `<DATADOG_SITE>` with your [Datadog site](https://docs.datadoghq.com/getting_started/site).

Replace `<DATADOG_AGENT_VERSION>` with version `7.55.0` or later.

**Optional**: Uncomment the `logCollection` section to start collecting application logs, which are correlated with Spark job run traces. When enabled, logs are collected from all discovered containers by default. See the [Kubernetes log collection documentation](https://docs.datadoghq.com/containers/kubernetes/log/?tab=datadogoperator#log-collection) for more details on the setup process.

1. Deploy the Datadog Agent with the above configuration file:

   ```shell
   kubectl apply -f /path/to/your/datadog-agent.yaml
   ```

{% /tab %}

{% tab title="Helm" %}

1. Create a [Kubernetes Secret](https://kubernetes.io/docs/concepts/configuration/secret/) to store your Datadog API key.

   ```shell
   kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY>
   ```

Replace `<DATADOG_API_KEY>` with your [Datadog API key](https://app.datadoghq.com/organization-settings/api-keys).

1. Create a file, `datadog-values.yaml`, that contains the following configuration:

   ```yaml
   datadog:
     apiKeyExistingSecret: datadog-secret
     site: <DATADOG_SITE>
     apm:
       portEnabled: true
       port: 8126
     # (Optional) Uncomment the next three lines to enable logs collection
     # logs:
       # enabled: true
       # containerCollectAll: true
   
   agents:
     image:
       tag: <DATADOG_AGENT_VERSION>
   
   clusterAgent:
     admissionController:
       enabled: true
       mutateUnlabelled: false
   ```

Replace `<DATADOG_SITE>` with your [Datadog site](https://docs.datadoghq.com/getting_started/site).

Replace `<DATADOG_AGENT_VERSION>` with version `7.55.0` or later.

**Optional**: Uncomment the `logs` section to start collecting application logs, which are correlated with Spark job run traces. When enabled, logs are collected from all discovered containers by default. See the [Kubernetes log collection documentation](https://docs.datadoghq.com/containers/kubernetes/log/?tab=helm#log-collection) for more details on the setup process.

1. Run the following commands:

   ```shell
   helm repo add datadog https://helm.datadoghq.com
   helm install <RELEASE_NAME> \
    -f datadog-values.yaml \
    --set targetSystem=<TARGET_SYSTEM> \
    datadog/datadog
   ```

   - Replace `<RELEASE_NAME>` with your release name. For example, `datadog-agent`.

   - Replace `<TARGET_SYSTEM>` with the name of your OS. For example, `linux` or `windows`.

{% /tab %}
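Whichever install method you used, you can sanity-check the deployment before moving on. The label selector below is an assumption; labels differ across chart and Operator versions.

```shell
# List Datadog pods across all namespaces
# (label selectors may vary by install method and version)
kubectl get pods -l app.kubernetes.io/name=datadog -A
```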

### Inject Spark instrumentation{% #inject-spark-instrumentation %}

When you run your Spark job, use the following configurations:

{% dl %}

{% dt %}
`spark.kubernetes.{driver,executor}.label.admission.datadoghq.com/enabled` (Required)
{% /dt %}

{% dd %}
`true`
{% /dd %}

{% dt %}
`spark.kubernetes.{driver,executor}.annotation.admission.datadoghq.com/java-lib.version` (Required)
{% /dt %}

{% dd %}
`latest`
{% /dd %}

{% dt %}
`spark.{driver,executor}.extraJavaOptions`
{% /dt %}

{% dd %}

{% dl %}

{% dt %}
`-Ddd.data.jobs.enabled` (Required)
{% /dt %}

{% dd %}
`true`
{% /dd %}

{% dt %}
`-Ddd.service` (Optional)
{% /dt %}

{% dd %}
Your service name. Because this option sets the *job name* in Datadog, it is recommended that you use a human-readable name.
{% /dd %}

{% dt %}
`-Ddd.env` (Optional)
{% /dt %}

{% dd %}
Your environment, such as `prod` or `dev`.
{% /dd %}

{% dt %}
`-Ddd.version` (Optional)
{% /dt %}

{% dd %}
Your version.
{% /dd %}

{% dt %}
`-Ddd.tags` (Optional)
{% /dt %}

{% dd %}
Other tags you wish to add, in the format `<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>`.
{% /dd %}

{% /dl %}

{% /dd %}

{% /dl %}
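Taken together, these options compose into a single `extraJavaOptions` string. The sketch below assembles one from shell variables; the service name, environment, and tags are hypothetical values used only for illustration.

```shell
# Hypothetical values -- substitute your own
DD_SERVICE="daily-aggregation-job"
DD_ENV="prod"
DD_TAGS="team:data-eng,pipeline:daily"

# Compose the Java options passed to both the driver and the executors
JAVA_OPTS="-Ddd.data.jobs.enabled=true -Ddd.service=${DD_SERVICE} -Ddd.env=${DD_ENV} -Ddd.tags=${DD_TAGS}"
echo "${JAVA_OPTS}"
```

The resulting string is what you pass to both `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions`, as in the examples below.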

#### Example: spark-submit{% #example-spark-submit %}

```shell
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://<CLUSTER_ENDPOINT> \
  --conf spark.kubernetes.container.image=895885662937.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0:latest \
  --deploy-mode cluster \
  --conf spark.kubernetes.namespace=<NAMESPACE> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=<SERVICE_ACCOUNT> \
  --conf spark.kubernetes.authenticate.executor.serviceAccountName=<SERVICE_ACCOUNT> \
  --conf spark.kubernetes.driver.label.admission.datadoghq.com/enabled=true \
  --conf spark.kubernetes.executor.label.admission.datadoghq.com/enabled=true \
  --conf spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version=latest \
  --conf spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version=latest \
  --conf spark.driver.extraJavaOptions="-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>" \
  --conf spark.executor.extraJavaOptions="-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>" \
  local:///usr/lib/spark/examples/jars/spark-examples.jar 20
```
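Once the job is running, one way to confirm that injection occurred is to inspect the driver pod's labels. This is a sketch; the pod name placeholder is hypothetical, and the label shown is the one set by the configuration above.

```shell
# The driver pod's labels should include admission.datadoghq.com/enabled=true
kubectl get pod <DRIVER_POD_NAME> -n <NAMESPACE> --show-labels | grep admission.datadoghq.com
```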

#### Example: AWS start-job-run{% #example-aws-start-job-run %}

```shell
aws emr-containers start-job-run \
--virtual-cluster-id <EMR_CLUSTER_ID> \
--name myjob \
--execution-role-arn <EXECUTION_ROLE_ARN> \
--release-label emr-6.10.0-latest \
--job-driver '{
  "sparkSubmitJobDriver": {
    "entryPoint": "s3://BUCKET/spark-examples.jar",
    "sparkSubmitParameters": "--class <MAIN_CLASS> --conf spark.kubernetes.driver.label.admission.datadoghq.com/enabled=true --conf spark.kubernetes.executor.label.admission.datadoghq.com/enabled=true --conf spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version=latest --conf spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version=latest --conf spark.driver.extraJavaOptions=\"-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>\" --conf spark.executor.extraJavaOptions=\"-Ddd.data.jobs.enabled=true -Ddd.service=<JOB_NAME> -Ddd.env=<ENV> -Ddd.version=<VERSION> -Ddd.tags=<KEY_1>:<VALUE_1>,<KEY_2>:<VALUE_2>\""
  }
}'
```

## Validation{% #validation %}

In Datadog, view the [Data Observability: Jobs Monitoring](https://app.datadoghq.com/data-jobs/) page to see a list of all your data processing jobs.

## Advanced Configuration{% #advanced-configuration %}

### Tag spans at runtime{% #tag-spans-at-runtime %}

You can set tags on Spark spans at runtime. These tags are applied *only* to spans that start after the tag is added.

```scala
// Add a tag to all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", "value")
spark.read.parquet(...)
```

To remove a runtime tag:

```scala
// Remove the tag from all subsequent Spark computations
sparkContext.setLocalProperty("spark.datadog.tags.key", null)
```
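If you tag many computations, pairing each set with its removal by hand is error-prone. A small helper can scope a tag to a block of code. This is a sketch: `withDatadogTag` is a hypothetical helper, not part of any library.

```scala
import org.apache.spark.SparkContext

// Hypothetical helper: sets a Datadog tag, runs the computations,
// then clears the tag even if the body throws
def withDatadogTag[T](sc: SparkContext, key: String, value: String)(body: => T): T = {
  sc.setLocalProperty(s"spark.datadog.tags.$key", value)
  try body
  finally sc.setLocalProperty(s"spark.datadog.tags.$key", null)
}
```

With this helper, only the spans started inside the block carry the tag, matching the behavior described above.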

## Further Reading{% #further-reading %}

- [Data Observability: Jobs Monitoring](https://docs.datadoghq.com/data_jobs)
