
Overview

Datadog Software Composition Analysis (SCA) scans your repositories for open-source libraries and detects known security vulnerabilities before you ship to production.

To get started:

  1. Open Code Security settings.
  2. In Activate scanning for your repositories, click Manage Repositories.
  3. Choose where to run SCA scans (Datadog-hosted or CI pipelines).
  4. Follow the setup instructions for your source code provider.

Supported languages and dependency manifests

Datadog SCA scans libraries in the following languages, using dependency manifests (lockfiles and other supported manifest files) to identify vulnerable dependencies.

Language | Package Manager | File
C# | .NET | packages.lock.json, .csproj files
C++ | Conan | conan.lock
Go | mod | go.mod
JVM | Gradle | gradle.lockfile
JVM | Maven | pom.xml
Node.js | npm | package-lock.json
Node.js | pnpm | pnpm-lock.yaml
Node.js | yarn | yarn.lock
PHP | composer | composer.lock
Python | PDM | pdm.lock
Python | pip | requirements.txt, Pipfile.lock
Python | poetry | poetry.lock
Python | uv | uv.lock
Ruby | bundler | Gemfile.lock
Rust | Cargo | Cargo.lock

Note: If both a packages.lock.json and a .csproj file are present, the packages.lock.json file takes precedence, as it provides more precise version resolution.
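
As an illustration of what the scanner reads, here is a minimal go.mod manifest; the module path and dependency versions are hypothetical:

module github.com/org/myapp

go 1.21

require (
    github.com/gin-gonic/gin v1.9.0
    golang.org/x/crypto v0.17.0
)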

Select where to run static SCA scans

By default, scans run automatically on each commit that changes a lockfile in an enabled repository. Default branch results are updated every hour to detect new vulnerabilities in existing packages.

Scan with Datadog-hosted scanning

You can run Datadog Static SCA scans directly on Datadog infrastructure for supported repository types.

To get started, navigate to the Code Security page.

Datadog-hosted SCA scanning is not supported for repositories that contain file names longer than 255 characters. In these cases, scan using CI pipelines.

Scan in CI pipelines

Datadog SCA runs in your CI pipelines using the datadog-ci CLI.

Configure your Datadog API and application keys by adding DD_APP_KEY and DD_API_KEY as secrets. Make sure the application key has the code_analysis_read scope.
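
For example, if your repository is on GitHub, one way to store both keys as repository secrets is with the GitHub CLI (a sketch; assumes gh is installed and authenticated, and that the values are already in your shell):

# Store the Datadog keys as GitHub Actions repository secrets
gh secret set DD_API_KEY --body "$DD_API_KEY"
gh secret set DD_APP_KEY --body "$DD_APP_KEY"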

You must scan your default branch at least once before results appear in Code Security.

Select your source code management provider

Datadog SCA supports all source code management providers, with native support for GitHub, GitLab, and Azure DevOps.

Configure a GitHub App with the GitHub integration tile and set up the source code integration to enable inline code snippets and pull request comments.

When installing a GitHub App, the following permissions are required to enable certain features:

  • Contents: Read, which allows you to see code snippets displayed in Datadog.
  • Pull requests: Read & Write, which allows Datadog to add feedback for violations directly in your pull requests using pull request comments.
  • Checks: Read & Write, which allows you to create checks on SAST violations to block pull requests.

See the GitLab source code setup instructions to connect GitLab to Datadog. Both GitLab.com and Self-Managed instances are supported.

See the Azure source code setup instructions to connect Azure DevOps repositories to Datadog.

Note: Your Azure DevOps integrations must be connected to a Microsoft Entra tenant. Azure DevOps Server is not supported.

If you are using another source code management provider, configure SCA to run in your CI pipelines using the datadog-ci CLI tool and upload the results to Datadog.

Authentication

To upload results to Datadog, you must be authenticated. To ensure you’re authenticated, configure the following environment variables:

Name | Description | Required | Default
DD_API_KEY | Your Datadog API key. This key is created by your Datadog organization and should be stored as a secret. | Yes | (none)
DD_APP_KEY | Your Datadog application key. This key, created by your Datadog organization, should include the code_analysis_read scope and be stored as a secret. | Yes | (none)
DD_SITE | The Datadog site to send information to. | No | datadoghq.com
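
For local testing or a generic CI runner, you can export these variables before invoking datadog-ci; the values below are placeholders:

# Placeholders; in CI, store the keys as secrets instead of hardcoding them
export DD_API_KEY="<your-api-key>"
export DD_APP_KEY="<your-app-key>"
export DD_SITE="datadoghq.com"   # adjust to your Datadog site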

Running options

There are two ways to run SCA scans from within your CI pipelines:

Run Via Pipelines Integration

You can run SCA scans automatically as part of your CI/CD workflows using built-in integrations for popular CI providers.

Datadog Software Composition Analysis CI jobs are only supported on the push event trigger. Other event triggers (for example, pull_request) are not supported and can cause issues with the product.

GitHub Actions

SCA can run as a job in your GitHub Actions workflows. The action below invokes Datadog's recommended SBOM tool, Datadog SBOM Generator, on your codebase and uploads the results to Datadog.

Add the following code snippet in .github/workflows/datadog-sca.yml.

Make sure to replace the dd_site attribute with the Datadog site you are using.

datadog-sca.yml

on: [push]

name: Datadog Software Composition Analysis

jobs:
  software-composition-analysis:
    runs-on: ubuntu-latest
    name: Datadog SBOM Generation and Upload
    steps:
    - name: Checkout
      uses: actions/checkout@v3
    - name: Check imported libraries are secure and compliant
      id: datadog-software-composition-analysis
      uses: DataDog/datadog-sca-github-action@main
      with:
        dd_api_key: ${{ secrets.DD_API_KEY }}
        dd_app_key: ${{ secrets.DD_APP_KEY }}
        dd_site: "datadoghq.com"
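
The push trigger above runs on every branch. To scan only pushes to your default branch (assuming, for illustration, that it is main), you can narrow the trigger; remember that the default branch must be scanned at least once for results to appear:

on:
  push:
    branches:
      - main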

Related GitHub Actions

Datadog Static Code Analysis (SAST) analyzes your first-party code. Static Code Analysis can be set up using the datadog-static-analyzer-github-action GitHub action.

Azure DevOps Pipelines

To add a new pipeline in Azure DevOps, go to Pipelines > New Pipeline, select your repository, and then create/select a pipeline.

Add the following content to your Azure DevOps pipeline YAML file:

datadog-sca.yml

trigger:
  branches:
    include:
      # Optionally specify a specific branch to trigger on when merging
      - "*"

variables:
  - group: "Datadog"

jobs:
  - job: DatadogSoftwareCompositionAnalysis
    displayName: "Datadog Software Composition Analysis"
    steps:
      - script: |
          npm install -g @datadog/datadog-ci
          export DATADOG_OSV_SCANNER_URL="https://github.com/DataDog/datadog-sbom-generator/releases/latest/download/datadog-sbom-generator_linux_amd64.zip"
          mkdir -p /tmp/datadog-sbom-generator
          curl -L -o /tmp/datadog-sbom-generator/datadog-sbom-generator.zip $DATADOG_OSV_SCANNER_URL
          unzip /tmp/datadog-sbom-generator/datadog-sbom-generator.zip -d /tmp/datadog-sbom-generator
          chmod 755 /tmp/datadog-sbom-generator/datadog-sbom-generator
          /tmp/datadog-sbom-generator/datadog-sbom-generator scan --output=/tmp/sbom.json .
          datadog-ci sbom upload /tmp/sbom.json
        env:
          DD_APP_KEY: $(DD_APP_KEY)
          DD_API_KEY: $(DD_API_KEY)
          DD_SITE: datadoghq.com

For all other providers, use the customizable script in the section below to run SCA scans and upload results to Datadog.

Run Via Customizable Script

If you use a different CI provider or want more control, you can run SCA scans using a customizable script. This approach lets you manually install and run the scanner, then upload results to Datadog from any environment.

For non-GitHub repositories, run your first scan on the default branch.
If your branch name is custom (not master, main, default, stable, source, prod, or develop), upload once and set the default branch in Repository Settings.

Prerequisites:

  • unzip
  • Node.js 14 or later

# Set the Datadog site to send information to
export DD_SITE=""

# Install dependencies
npm install -g @datadog/datadog-ci

# Download the latest Datadog SBOM Generator:
# https://github.com/DataDog/datadog-sbom-generator/releases
DATADOG_SBOM_GENERATOR_URL=https://github.com/DataDog/datadog-sbom-generator/releases/latest/download/datadog-sbom-generator_linux_amd64.zip

# Install Datadog SBOM Generator
mkdir -p /datadog-sbom-generator
curl -L -o /datadog-sbom-generator/datadog-sbom-generator.zip $DATADOG_SBOM_GENERATOR_URL
unzip /datadog-sbom-generator/datadog-sbom-generator.zip -d /datadog-sbom-generator
chmod 755 /datadog-sbom-generator/datadog-sbom-generator

# Run Datadog SBOM Generator to scan your dependencies
/datadog-sbom-generator/datadog-sbom-generator scan --output=/tmp/sbom.json /path/to/repository

# Upload results to Datadog
datadog-ci sbom upload /tmp/sbom.json

This script uses the Linux x86_64 build of the datadog-sbom-generator. For other systems, update the download URL; all releases are listed at https://github.com/DataDog/datadog-sbom-generator/releases.
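
If you run the same script on multiple platforms, a small sketch like the following can select the right asset; it assumes the other release assets follow the same datadog-sbom-generator_<os>_<arch>.zip naming:

# Pick the release asset for the current platform (asset naming is assumed)
OS=$(uname -s | tr '[:upper:]' '[:lower:]')   # for example: linux, darwin
ARCH=$(uname -m)
case "$ARCH" in
  x86_64) ARCH="amd64" ;;
  aarch64|arm64) ARCH="arm64" ;;
esac
DATADOG_SBOM_GENERATOR_URL="https://github.com/DataDog/datadog-sbom-generator/releases/latest/download/datadog-sbom-generator_${OS}_${ARCH}.zip"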

Upload third-party SBOM to Datadog

Datadog recommends using the Datadog SBOM generator, but it is also possible to ingest a third-party SBOM.

You can upload SBOMs generated by other tools if they meet these requirements:

  • Valid CycloneDX 1.4, 1.5, or 1.6 JSON schema
  • All components have type library
  • All components have a valid purl attribute
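
For reference, a minimal CycloneDX 1.5 JSON document satisfying these requirements might look like the following; the component name, version, and purl are illustrative:

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "lodash",
      "version": "4.17.21",
      "purl": "pkg:npm/lodash@4.17.21"
    }
  ]
}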

Third-party SBOM files are uploaded to Datadog using the datadog-ci command.

You can find optional arguments and other information in the datadog-ci README.

You can use the following command to upload your third-party SBOM. Ensure the environment variables DD_API_KEY, DD_APP_KEY, and DD_SITE are set to your API key, application key, and Datadog site, respectively.

datadog-ci sbom upload /path/to/third-party-sbom.json

If you already have automatic scanning enabled for a repository, a manual upload replaces any existing result for that commit.

Link results to services and teams

Datadog associates code and library scan results with Datadog services and teams to automatically route findings to the appropriate owners. This enables service-level visibility, ownership-based workflows, and faster remediation.

To determine the service where a vulnerability belongs, Datadog evaluates several mapping mechanisms in the order listed in this section.

Each vulnerability is mapped with one method only: if a mapping mechanism succeeds for a particular finding, Datadog does not attempt the remaining mechanisms for that finding.

Using service definitions that include code locations in the Software Catalog is the only way to explicitly control how static findings are mapped to services. The additional mechanisms described below, such as Error Tracking usage patterns and naming-based inference, are not user-configurable and depend on existing data from other Datadog products. Consequently, these mechanisms might not provide consistent mappings for organizations not using these products.

Service definitions with code locations

Services in the Software Catalog identify their codebase content using the codeLocations field. This field is available in Software Catalog schema version v3 and allows a service to specify:

  • a repository URL:

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git

  • one or more code paths inside that repository:

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**

If you want all the files in a repository to be associated with a service, you can use the glob ** as follows:

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**
    - repositoryURL: https://github.com/org/billing-service.git
      paths:
        - "**"

The schema for this field is described in the Software Catalog entity model.

Datadog goes through all Software Catalog definitions and checks whether the finding's file path matches a codeLocations entry. For a finding to be mapped to a service through codeLocations, it must contain a file path.

Some findings might not contain a file path. In those cases, Datadog cannot evaluate codeLocations for that finding, and this mechanism is skipped.

Services defined with a Software Catalog schema v2.x do not support codeLocations. Existing definitions can be upgraded to the v3 schema in the Software Catalog. After migration is completed, changes might take up to 24 hours to apply to findings. If you are unable to upgrade to v3, Datadog falls back to alternative linking techniques (described below). These rely on less precise heuristics, so accuracy might vary depending on the Code Security product and your use of other Datadog features.

Example (v3 schema)

apiVersion: v3
kind: service
metadata:
  name: billing-service
  owner: billing-team
datadog:
  codeLocations:
    - repositoryURL: https://github.com/org/myrepo.git
      paths:
        - path/to/service/code/**
    - repositoryURL: https://github.com/org/billing-service.git
      paths:
        - "**"

SAST finding

If a vulnerability appeared in github.com/org/myrepo at /src/billing/models/payment.py, the codeLocations for billing-service would match, and Datadog would add billing-service as an owning service. If your service defines an owner (see above), Datadog links that team to the finding too. In this case, the finding would be linked to the billing-team.

SCA finding

If a library was declared in github.com/org/myrepo at /go.mod, then Datadog would not match it to billing-service.

Instead, if it was declared in github.com/org/billing-service at /go.mod, then Datadog would match it to billing-service due to the “**” catch-all glob. Consequently, Datadog would link the finding to the billing-team.

Datadog attempts to map a single finding to as many services as possible. If no matches are found, Datadog continues to the next linking method.

When the Software Catalog cannot determine the service

If the Software Catalog does not provide a match, either because the finding’s file path does not match any codeLocations, or because the service uses the v2.x schema, Datadog evaluates whether Error Tracking can identify the service associated with the code. Datadog uses only the last 30 days of Error Tracking data due to product data-retention limits.

When Error Tracking processes stack traces, the traces often include file paths. For example, if an error occurs in /foo/bar/baz.py, Datadog inspects the directory /foo/bar and checks whether the finding's file path resides under that directory.

If the finding file is under the same directory:

  • Datadog treats this as a strong indication that the vulnerability belongs to the same service.
  • The finding inherits the service and team associated with that error in Error Tracking.

If this mapping succeeds, Datadog stops here.

Service inference from file paths or repository names

When neither of the above strategies can determine the service, Datadog inspects naming patterns in the repository and file paths.

Datadog evaluates whether:

  • The file path contains identifiers matching a known service.
  • The repository name corresponds to a service name.

When using the finding’s file path, Datadog performs a reverse search on each path segment until it finds a matching service or exhausts all options.

For example, if a finding occurs in github.com/org/checkout-service at /foo/bar/baz/main.go, Datadog takes the last path segment, main, and sees if any Software Catalog service uses that name. If there is a match, the finding is attributed to that service. If not, the process continues with baz, then bar, and so on.

When all options have been tried, Datadog checks whether the repository name, checkout-service, matches a Software Catalog service name. If no match is found, Datadog cannot link your finding using the Software Catalog.

This mechanism ensures that findings receive meaningful service attribution when no explicit metadata exists.

If Datadog is able to link your finding to a service using the above strategies, then the team that owns that service (if defined) is associated with that finding automatically.

Regardless of whether Datadog successfully links a finding to a service (and a Datadog team), Datadog uses the CODEOWNERS information from your finding’s repository to link Datadog and GitHub teams to your findings.

You must accurately map your Git provider teams to your Datadog Teams for team attribution to function properly.
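
For example, a CODEOWNERS file like the following (paths and team slugs are hypothetical) lets Datadog attribute findings on files under /src/billing/ to the mapped team:

# Hypothetical CODEOWNERS entries; @org/billing-team must map to a Datadog Team
/src/billing/   @org/billing-team
/infra/         @org/platform-team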

Filter by reachable vulnerabilities

Datadog offers static reachability analysis to help teams assess whether vulnerable code paths in dependencies are referenced within their application code. This capability supports more effective prioritization by identifying vulnerabilities that are statically unreachable and therefore present minimal immediate risk.

This functionality is supported only when using the Datadog SBOM Generator with the --reachability flag enabled or when running scans through Datadog-hosted infrastructure.
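
For example, with the customizable CI script shown earlier, you can enable reachability by adding the flag to the scan step:

# Generate the SBOM with reachability analysis enabled, then upload it
/datadog-sbom-generator/datadog-sbom-generator scan --reachability --output=/tmp/sbom.json /path/to/repository
datadog-ci sbom upload /tmp/sbom.json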

Reachability analysis is available exclusively for Java projects and applies only to a defined set of vetted security advisories. Vulnerabilities not included in this set are excluded from reachability evaluation.


Data Retention

Datadog stores findings in accordance with our Data Retention Periods. Datadog does not store or retain customer source code.
