Sensitive Data Scanner

Overview

Sensitive data, such as credit card numbers, bank routing numbers, and API keys, is often exposed unintentionally in application logs, APM spans, and RUM events, which can put your organization at financial and privacy risk.

Often, businesses are required to identify, remediate, and prevent the exposure of such sensitive data within their logs due to organizational policies, compliance requirements, industry regulations, and privacy concerns. This is especially true within industries such as banking, financial services, healthcare, and insurance.

Sensitive Data Scanner

Sensitive Data Scanner is a stream-based, pattern-matching service that you can use to identify, tag, and optionally redact or hash sensitive data. Security and compliance teams can implement Sensitive Data Scanner as a new line of defense, helping prevent sensitive data leaks and limiting non-compliance risks.

Sensitive Data Scanner can be found under Organization Settings.

Sensitive Data Scanner in Organization Settings

Setup

  • Define Scanning Groups: A scanning group determines what data to scan. It consists of a query filter and a set of toggles to enable scanning for Logs, APM, RUM, and/or Events. See the Log Search Syntax documentation to learn more about query filters.
  • Define Scanning Rules: A scanning rule determines what sensitive information to match within the data. Within a scanning group, add predefined scanning rules from Datadog’s Scanning Rule Library or create your own rules from scratch to scan using custom regex patterns.

Sensitive Data Scanner supports Perl Compatible RegEx (PCRE), but the following patterns are not supported:

  • Backreferences and capturing sub-expressions
  • Arbitrary zero-width assertions (lookarounds)
  • Subroutine references and recursive patterns
  • Conditional patterns
  • Backtracking control verbs
  • The \C “single-byte” directive (which breaks UTF-8 sequences)
  • The \R newline match
  • The \K start of match reset directive
  • Callouts and embedded code
  • Atomic grouping and possessive quantifiers
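
As a rough illustration, the Python sketch below checks a candidate pattern against sample strings before it is entered as a custom rule. The pattern uses only literals, a character class, and a bounded quantifier, so it avoids every construct in the list above. Python's re module is not PCRE and serves here only as a local stand-in; the "acme_" key format, the sample strings, and the find_matches helper are hypothetical and not part of Sensitive Data Scanner.

```python
import re

# Hypothetical key format for illustration: "acme_" followed by 32 alphanumerics.
# Only literals, a character class, and a bounded quantifier are used, so the
# pattern needs no lookarounds, backreferences, or possessive quantifiers.
API_KEY_PATTERN = re.compile(r"acme_[A-Za-z0-9]{32}")

# Hypothetical sample log lines: one clean, one containing a fake key.
SAMPLES = [
    "login ok user=42",
    "auth failed key=acme_" + "a1B2c3D4" * 4,
]


def find_matches(pattern, samples):
    """Map each sample string to the list of substrings the pattern matches."""
    return {sample: pattern.findall(sample) for sample in samples}


if __name__ == "__main__":
    for sample, hits in find_matches(API_KEY_PATTERN, SAMPLES).items():
        print(sample, "->", hits)
```

A local check like this only confirms that the pattern matches what you expect. Use the rule editor's own sample-data test to confirm that the scanner accepts the pattern.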

Note:

  • Any rules that you add or update only affect data coming into Datadog after the rule was defined.
  • Sensitive Data Scanner does not affect any rules you define on the Datadog Agent directly.
  • To turn off Sensitive Data Scanner entirely, set the toggle to off for every Scanning Group and Scanning Rule.

Custom Scanning Rules

  • Define pattern: Specify the regex pattern to be used for matching against events. Test with sample data to verify that your regex pattern is valid.
  • Define scope: Specify whether you want to scan the entire event or just specific attributes. You can also choose to exclude specific attributes from the scan.
  • Add tags: Specify the tags you want to associate with events where the values match the specified regex pattern. Datadog recommends using sensitive_data and sensitive_data_category tags. These tags can then be used in searches, dashboards, and monitors.
  • Process matching values: Optionally, specify whether you want to redact, partially redact, or hash matching values. When redacting, specify the placeholder text that replaces the matching values. When partially redacting, specify the position (start or end) and length (number of characters) to redact within matching values. Redaction, partial redaction, and hashing are all irreversible actions.
  • Name the rule: Provide a human-readable name for the rule.
A Sensitive Data Scanner custom rule
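
The three processing options described above can be sketched in Python to show how they differ. This is a conceptual illustration only, not how Sensitive Data Scanner is implemented: the 16-digit pattern, the function names, the "*" mask character, and the choice of SHA-256 for hashing are all assumptions made for the example.

```python
import hashlib
import re

# Hypothetical pattern: 16 consecutive digits standing in for a card number.
CARD_PATTERN = re.compile(r"[0-9]{16}")


def redact(text, pattern, placeholder="[REDACTED]"):
    """Replace every match with fixed placeholder text."""
    return pattern.sub(placeholder, text)


def partially_redact(text, pattern, position="start", length=12, mask="*"):
    """Mask `length` characters at the start or end of every match."""
    def _mask(match):
        value = match.group(0)
        n = min(length, len(value))
        if position == "start":
            return mask * n + value[n:]
        return value[:len(value) - n] + mask * n
    return pattern.sub(_mask, text)


def hash_matches(text, pattern):
    """Replace every match with a hex digest (SHA-256 is an assumption)."""
    def _hash(match):
        return hashlib.sha256(match.group(0).encode("utf-8")).hexdigest()
    return pattern.sub(_hash, text)


if __name__ == "__main__":
    event = "card=4111111111111111 ok"
    print(redact(event, CARD_PATTERN))
    print(partially_redact(event, CARD_PATTERN))
    print(partially_redact(event, CARD_PATTERN, position="end", length=4))
    print(hash_matches(event, CARD_PATTERN))
```

In all three cases the original value is gone from the processed event, which is why these actions are irreversible.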

Out-of-the-box Scanning Rules

The Scanning Rule Library contains an ever-growing collection of predefined rules maintained by Datadog for detecting common patterns such as email addresses, credit card numbers, API keys, and authorization tokens.

Scanning Rule Library

Permissions

By default, users with the Datadog Admin role can view and define scanning rules. To give other users access, grant read or write permissions for Data Scanner under Compliance. See the Custom RBAC documentation for details on Roles and Permissions.

Permissions for Sensitive Data Scanner

Using tags with query-based RBAC

Control who can access events containing sensitive data. Use tags added by Sensitive Data Scanner to build queries with RBAC and restrict access to specific individuals or teams until the data ages out after the retention period.

Out-of-the-box dashboard

When Sensitive Data Scanner is enabled, an out-of-the-box dashboard summarizing sensitive data findings is automatically installed in your account.

Sensitive Data Scanner Overview dashboard

To access this dashboard, go to Dashboards > Dashboards List and search for Sensitive Data Scanner Overview.
