Sensitive Data Scanner

Overview

Sensitive data, such as credit card numbers, API keys, IP addresses, and personally identifiable information (PII), is often leaked unintentionally, which can expose your organization to security and compliance risks. Sensitive data can be found in your telemetry data, such as application logs, APM spans, RUM events, and events from Event Management. It can also be unintentionally moved to cloud storage resources when engineering teams move their workloads to the cloud. Datadog’s Sensitive Data Scanner can help prevent sensitive data leaks and limit non-compliance risks by discovering, classifying, and optionally redacting sensitive data.

Note: See PCI DSS Compliance for information on setting up a PCI-compliant Datadog organization.

Scan telemetry data

Figure: Five sensitive data issues detected: two with critical priority, one with medium priority, and two with info priority.

Sensitive Data Scanner can scan your data in the cloud or within your environment.

In the Cloud

With Sensitive Data Scanner in the Cloud, you submit logs and events to the Datadog backend, so the data leaves your environment before it gets redacted. The logs and events are scanned and redacted in the Datadog backend during processing, so sensitive data is redacted before events are indexed and shown in the Datadog UI.

The following data can be scanned and redacted:

Logs: All structured and unstructured log content, including log message and attribute values
APM: Span attribute values only
RUM: Event attribute values only
Events: Event attribute values only
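
To make "log message and attribute values" concrete, here is a minimal sketch of walking a structured log and redacting every string value it contains. It is not Datadog's implementation: the `EMAIL` pattern, the `[REDACTED]` placeholder, and the `redact_values` helper are hypothetical names chosen for illustration, and the sketch assumes attribute keys are left untouched.

```python
import re
from typing import Any

# Hypothetical rule for illustration: a deliberately simple email pattern.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PLACEHOLDER = "[REDACTED]"


def redact_values(node: Any) -> Any:
    """Return a copy of node with matches replaced in every string value."""
    if isinstance(node, str):
        return EMAIL.sub(PLACEHOLDER, node)
    if isinstance(node, dict):
        # Assumption: keys are left as-is; only values are scanned.
        return {key: redact_values(value) for key, value in node.items()}
    if isinstance(node, list):
        return [redact_values(item) for item in node]
    return node  # numbers, booleans, and None pass through unchanged


log = {
    "message": "Password reset requested by jane.doe@example.com",
    "usr": {"email": "jane.doe@example.com", "id": 42},
    "tags": ["env:prod", "contact:ops@example.com"],
}
redacted = redact_values(log)
```

In this sketch the free-text message, the nested attribute, and the list entry are all rewritten, while non-string values such as the numeric `id` are preserved.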

Join the Preview!

Role-based sensitive data unmasking in logs is in Preview. To enroll, click Request Access.

To use Sensitive Data Scanner, set up a scanning group to define what data to scan, and then set up scanning rules to determine what sensitive information to match within that data. For scanning rules, you can:

Add predefined scanning rules from Datadog’s Scanning Rule Library. These rules detect common patterns such as email addresses, credit card numbers, API keys, authorization tokens, network and device information, and more.
Create your own rules using regex patterns.

See Set Up Sensitive Data Scanner for Telemetry Data for setup details.
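
Before adding a custom regex rule, it can help to check the pattern locally against strings that should and should not match. The sketch below is one way to do that in Python. The `ScanningRule` class, the rule name, and the `EMP-` employee ID format are all hypothetical and exist only for this example; they are not part of any Datadog API, and Python's `re` syntax may differ in places from the regex syntax the product accepts.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanningRule:
    """Hypothetical stand-in for a custom rule: a name, a regex, a replacement."""
    name: str
    pattern: re.Pattern
    replacement: str

    def matches(self, text: str) -> bool:
        return self.pattern.search(text) is not None

    def redact(self, text: str) -> str:
        return self.pattern.sub(self.replacement, text)


# Assumed format: "EMP-" followed by exactly six digits.
# The \b word boundaries keep the rule from firing inside longer tokens.
employee_id = ScanningRule(
    name="internal-employee-id",
    pattern=re.compile(r"\bEMP-\d{6}\b"),
    replacement="[EMPLOYEE_ID]",
)

should_match = ["user EMP-004217 logged in", "id=EMP-123456;"]
should_not_match = ["EMP-12345", "EMP-1234567", "TEMP-123456x"]

for sample in should_match:
    assert employee_id.matches(sample), sample
for sample in should_not_match:
    assert not employee_id.matches(sample), sample
```

Keeping a short list of negative samples next to the pattern is a cheap guard against overly broad rules, which would otherwise surface as noisy matches after the rule is deployed.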

In your environment

Use Observability Pipelines to collect and process your logs within your environment, and then route the data to downstream integrations. When you set up a pipeline in Observability Pipelines, add the Sensitive Data Scanner processor to redact sensitive data in your logs before they leave your premises. You can add predefined scanning rules from the Rule Library for patterns such as email addresses, credit card numbers, API keys, authorization tokens, IP addresses, and more. You can also create your own rules using regex patterns.

See Set Up Pipelines for more information.

Scan cloud storage

Limited Availability

Scanning support for Amazon S3 buckets and RDS instances is in Limited Availability. To enroll, click Request Access.

Figure: The Summary page's datastore section with three Amazon S3 issues.

If you have Sensitive Data Scanner enabled, you can catalog and classify sensitive data in your Amazon S3 buckets and RDS instances. Note: Sensitive Data Scanner does not redact sensitive data in your cloud storage resources.

Sensitive Data Scanner scans for sensitive data by deploying Agentless scanners in your cloud environments. These scanning instances retrieve a list of all S3 buckets and RDS instances through Remote Configuration, and are instructed to scan text files (such as CSV and JSON files) and tables in every datastore over time.

Sensitive Data Scanner uses its entire rules library to find matches. When a match is found, the scanning instance sends the location of the match to Datadog. Note: Datastores and their files are read only within your environment. None of the scanned sensitive data is sent back to Datadog.
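
The idea of reporting where a match occurred without reporting the matched value can be sketched as follows. This is an illustration under stated assumptions, not the Agentless scanner's actual logic: the two patterns in `RULES`, the `Finding` record, the `scan_csv` helper, and the `exports/users.csv` object key are all hypothetical.

```python
import csv
import io
import re
from typing import List, NamedTuple

# Hypothetical rule set for illustration; the patterns are intentionally simple.
RULES = {
    "email_address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


class Finding(NamedTuple):
    """Where a rule matched. The matched text itself is never stored."""
    object_key: str
    rule: str
    row: int  # 1-based data row, not counting the header
    column: str


def scan_csv(object_key: str, text: str) -> List[Finding]:
    """Scan CSV text and return match locations only."""
    findings = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # skip missing or overflow fields
            for rule_name, pattern in RULES.items():
                if pattern.search(value):
                    findings.append(Finding(object_key, rule_name, row_number, column))
    return findings


sample = (
    "name,contact,last_login_ip\n"
    "Jane,jane@example.com,10.0.0.7\n"
    "Sam,n/a,unknown\n"
)
findings = scan_csv("exports/users.csv", sample)
```

Each `Finding` carries only the object key, rule name, row, and column, so the record that leaves the scanning step identifies the location of sensitive data without containing it.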

Along with displaying sensitive data matches, Sensitive Data Scanner surfaces any security issues detected by Cloud Security affecting the sensitive data stores. You can click any issue to continue triage and remediation within Cloud Security.

See Set up Sensitive Data Scanner for Cloud Storage for setup details.

Investigate sensitive data issues

Figure: The Summary page showing an overview of sensitive data issues broken down by priority.

Use the Summary page to see details of sensitive data issues identified by your scanning rules. These details include:

The specific scanning rule that detected the matches, so that you can determine which rules to modify as needed.
The scanning group in which the issue occurred, so that you can determine the blast radius of any leaks.
The number of events associated with the issue to help you gauge its scope and severity.
A graph of the events associated with the issue to help you pinpoint when an issue started and see how it has progressed.
Related cases created for the issue.

See Investigate Sensitive Data Issues for more information on how to use the Summary page to triage your sensitive data issues.

Review sensitive data trends

Figure: The Sensitive Data Scanner Overview dashboard.

When Sensitive Data Scanner is enabled, an out-of-the-box dashboard summarizing sensitive data issues is automatically installed in your account. To access this dashboard, navigate to Dashboards > Dashboards List and search for “Sensitive Data Scanner Overview”.