Data Security

Overview

Datadog tracing libraries collect data from an instrumented application and send it to Datadog as traces, which may contain sensitive data such as personally identifiable information (PII). If you are ingesting sensitive data as traces into Datadog, you can add remediations at ingestion with Sensitive Data Scanner. You can also configure the Datadog Agent or the tracing library to remediate sensitive data at collection, before traces are sent to Datadog. Datadog’s tools and policies comply with PCI v4.0. For more information, see PCI DSS Compliance.

If the configurations described here do not cover your compliance requirements, reach out to the Datadog support team.

Personal information in trace data

Datadog’s APM tracing libraries collect relevant observability data about your applications. Because these libraries collect hundreds of unique attributes in trace data, this page describes categories of data, with a focus on attributes that may contain personal information about your employees and end-users.

The table below describes the personal data categories collected by the automatic instrumentation provided by the tracing libraries, with some common examples listed.

Category	Description
Name	The full name of an internal user (your employee) or end-user.
Email	The email address of an internal user (your employee) or end-user.
Client IP	The IP address of your end-user associated with an incoming request or the external IP address of an outgoing request.
Credit card numbers	A Primary Account Number (PAN) used for financial transactions.
Database statements	The literal, sequence of literals, or bind variables used in an executed database statement.
Geographic location	Longitude and latitude coordinates that can be used to identify an individual or household.
URI parameters	The parameter values in the variable part of the URI path or the URI query.
URI userinfo	The userinfo subcomponent of the URI that may contain the user name.
Login ID	Can include an account/user ID, name, or email address.

The table below describes the default behavior of each language tracing library with regard to whether a data category is collected and whether it is obfuscated by default.

Note: Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: Database statements are not collected by default and must be enabled. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: URI parameters are not collected by default and must be enabled. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: Name and email are not collected by default and must be enabled. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: Client IP, geographic location, and URI parameters are not collected by default and must be enabled. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: Client IPs are not collected by default and must be enabled. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
URI parameters
URI userinfo
Login ID

Note: Client IPs are not collected by default and must be enabled. Database statements are obfuscated by the Datadog Agent. Credit card numbers are obfuscated by the Datadog Agent by default.

Category	Collected	Obfuscated
Name
Email
Client IP
Credit card numbers
Database statements
Geographic location
Client URI path
Client URI query string
Server URI path
Server URI query string
HTTP body
HTTP cookies
HTTP headers
Login ID

Note: Credit card numbers are obfuscated by the Datadog Agent by default.

If you use Datadog App and API Protection (AAP), the tracing libraries collect HTTP request data to help you understand the nature of a security trace. Datadog AAP automatically redacts certain data, and you can configure your own detection rules. Learn more about these defaults and configuration options in the Datadog AAP data privacy documentation.

Agent

Resource names

Datadog spans include a resource name attribute that may contain sensitive data. The Datadog Agent implements obfuscation of resource names for several known cases:

SQL numeric literals and bind variables are obfuscated: For example, the following query SELECT data FROM table WHERE key=123 LIMIT 10 is obfuscated to SELECT data FROM table WHERE key = ? LIMIT ? before setting the resource name for the query span.
SQL literal strings are identified using standard ANSI SQL quotes: This means strings should be surrounded by single quotes ('). Some SQL variants optionally support double quotes (") for strings, but most treat double-quoted text as identifiers. The Datadog obfuscator treats double-quoted text as identifiers rather than strings and does not obfuscate it.
Redis queries are quantized by selecting only command tokens: For example, the following query MULTI\nSET k1 v1\nSET k2 v2 is quantized to MULTI SET SET.
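
The two transformations above can be approximated with a short Python sketch. It is an illustration only, not the Agent's implementation: the function names obfuscate_sql_literals and quantize_redis are hypothetical, and the regular expressions cover only the simple cases described here. Unlike the Agent, this sketch does not normalize whitespace around operators.

```python
import re


def obfuscate_sql_literals(query: str) -> str:
    """Replace single-quoted strings and numeric literals with '?'.

    Double-quoted text is left alone, mirroring the rule that it is
    treated as an identifier rather than a string.
    """
    # Single-quoted strings first, so digits inside them are not handled twice.
    query = re.sub(r"'(?:[^']|'')*'", "?", query)
    # Standalone numeric literals (digits inside identifiers such as k1 are kept).
    query = re.sub(r"\b[0-9]+(?:\.[0-9]+)?\b", "?", query)
    return query


def quantize_redis(query: str) -> str:
    """Keep only the command token of each line of a Redis query."""
    commands = []
    for line in query.splitlines():
        tokens = line.split()
        if tokens:
            commands.append(tokens[0].upper())
    return " ".join(commands)


print(obfuscate_sql_literals("SELECT data FROM table WHERE key=123 LIMIT 10"))
print(quantize_redis("MULTI\nSET k1 v1\nSET k2 v2"))
```

Replacing strings before numbers is a deliberate ordering choice in this sketch: it keeps a quoted value such as '555-1234' from being partially rewritten by the numeric rule.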

Trace obfuscation

The Datadog Agent also obfuscates sensitive trace data that is not within the resource name. You can configure the obfuscation rules using environment variables or the datadog.yaml configuration file.

The following metadata can be obfuscated:

MongoDB queries
Elasticsearch request bodies
Redis commands
Memcached commands
HTTP URLs
Stack traces
Credit card numbers

Note: Obfuscation can have a performance impact on your system, or could redact important information that is not sensitive. Consider what obfuscation you need for your setup, and customize your configuration appropriately.

Note: You can use automatic scrubbing for multiple types of services at the same time. Configure each in the obfuscation section of your datadog.yaml file.

MongoDB queries within a span of type mongodb are obfuscated by default.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    mongodb:
      ## Configures obfuscation rules for spans of type "mongodb". Enabled by default.
      enabled: true
      keep_values:
        - document_id
        - template_id
      obfuscate_sql_values:
        - val1

This can also be disabled with the environment variable DD_APM_OBFUSCATION_MONGODB_ENABLED=false.

keep_values or environment variable DD_APM_OBFUSCATION_MONGODB_KEEP_VALUES - defines a set of keys to exclude from Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
obfuscate_sql_values or environment variable DD_APM_OBFUSCATION_MONGODB_OBFUSCATE_SQL_VALUES - defines a set of keys whose values are passed through SQL obfuscation by the Datadog Agent. If not set, no keys are obfuscated this way.
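
To make the effect of keep_values concrete, here is a minimal Python sketch of the idea: every value in a query document is replaced with "?" unless its key is on the keep list. The function name obfuscate_document and the sample document are hypothetical, and the Agent's own obfuscator may differ in detail.

```python
def obfuscate_document(doc, keep_values=frozenset()):
    """Recursively replace values with '?' except under kept keys."""
    if isinstance(doc, dict):
        return {
            key: value if key in keep_values
            else obfuscate_document(value, keep_values)
            for key, value in doc.items()
        }
    if isinstance(doc, list):
        return [obfuscate_document(item, keep_values) for item in doc]
    # Any scalar (string, number, boolean, null) is hidden.
    return "?"


query = {
    "document_id": 42,
    "filter": {"email": "user@example.com", "template_id": "t-7"},
}
print(obfuscate_document(query, keep_values={"document_id", "template_id"}))
```

With an empty keep list, every value in the document is replaced, which matches the "If not set, all keys are obfuscated" behavior described for keep_values.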

Elasticsearch request bodies within a span of type elasticsearch are obfuscated by default.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    elasticsearch:
      ## Configures obfuscation rules for spans of type "elasticsearch". Enabled by default.
      enabled: true
      keep_values:
        - client_id
        - product_id
      obfuscate_sql_values:
        - val1

This can also be disabled with the environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_ENABLED=false.

keep_values or environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_KEEP_VALUES - defines a set of keys to exclude from Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
obfuscate_sql_values or environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_OBFUSCATE_SQL_VALUES - defines a set of keys whose values are passed through SQL obfuscation by the Datadog Agent. If not set, no keys are obfuscated this way.

Redis commands within a span of type redis are obfuscated by default.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    ## Configures obfuscation rules for spans of type "redis". Enabled by default.
    redis:
      enabled: true
      remove_all_args: true

This can also be disabled with the environment variable DD_APM_OBFUSCATION_REDIS_ENABLED=false.

remove_all_args or environment variable DD_APM_OBFUSCATION_REDIS_REMOVE_ALL_ARGS - replaces all arguments of a redis command with a single “?” if true. Disabled by default.
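
As a rough illustration of remove_all_args, the following Python sketch keeps the command name and collapses every argument into a single "?". The function name remove_all_redis_args is hypothetical, and the sketch assumes every command is handled the same way, which may not hold for all commands in the Agent.

```python
def remove_all_redis_args(command: str) -> str:
    """Keep the command name and replace all arguments with one '?'."""
    tokens = command.split()
    if not tokens:
        return ""
    if len(tokens) == 1:
        # A command without arguments has nothing to hide.
        return tokens[0]
    return f"{tokens[0]} ?"


print(remove_all_redis_args("SET k1 v1"))
print(remove_all_redis_args("MULTI"))
```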

Memcached commands within a span of type memcached are obfuscated by default.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    memcached:
      ## Configures obfuscation rules for spans of type "memcached". Enabled by default.
      enabled: true

This can also be disabled with the environment variable DD_APM_OBFUSCATION_MEMCACHED_ENABLED=false.

HTTP URLs within a span of type http or web are not obfuscated by default.

Note: Passwords within the Userinfo of a URL are not collected by Datadog.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    http:
      ## Enables obfuscation of query strings in URLs. Disabled by default.
      remove_query_string: true
      remove_paths_with_digits: true

remove_query_string or environment variable DD_APM_OBFUSCATION_HTTP_REMOVE_QUERY_STRING: If true, obfuscates query strings in URLs (http.url).
remove_paths_with_digits or environment variable DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS: If true, path segments in URLs (http.url) containing one or more digits are replaced by “?”.
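
The two URL options can be pictured with the Python sketch below. It is an approximation under stated assumptions, not the Agent's code: the function name obfuscate_url is hypothetical, and the choice to leave a bare trailing "?" where the query string was removed is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def obfuscate_url(url, remove_query_string=False, remove_paths_with_digits=False):
    """Apply the two http obfuscation options to a URL string."""
    parts = urlsplit(url)
    path = parts.path
    if remove_paths_with_digits:
        # Replace any path segment that contains at least one digit.
        path = "/".join(
            "?" if any(ch.isdigit() for ch in segment) else segment
            for segment in path.split("/")
        )
    result = f"{parts.scheme}://{parts.netloc}{path}"
    if parts.query:
        # Either drop the query content or keep it unchanged.
        result += "?" if remove_query_string else f"?{parts.query}"
    return result


print(obfuscate_url(
    "https://example.com/users/123/orders?token=abc",
    remove_query_string=True,
    remove_paths_with_digits=True,
))
```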

Stack trace removal is disabled by default. Set the remove_stack_traces parameter to true to remove stack traces and replace them with ?.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    ## Enables removing stack traces to replace them with "?". Disabled by default.
    remove_stack_traces: true # default false

This can also be enabled with the environment variable DD_APM_OBFUSCATION_REMOVE_STACK_TRACES=true.

The credit card obfuscator scans all span metadata for numbers that appear to be credit card numbers. Any values that match are replaced with ?. This check affects all span types and is enabled by default. Because this initial scan is based on patterns, it can sometimes cause false positives by redacting other long numbers. To improve accuracy, you can enable the credit_cards.luhn option described below.

Note: Scanning looks for values that are exactly credit card numbers (allowing for internal whitespace). If a metavalue has additional string data, this obfuscator determines that value is not a credit card number. For example:

A metavalue of 4111 1111 1111 1111 is redacted to ?.
A metavalue of CC-4111 1111 1111 1111 is not redacted.

apm_config:
  enabled: true

  ## (...)

  obfuscation:
    credit_cards:
      ## Enable obfuscation of suspected credit card values in meta fields.
      ## Enabled by default.
      enabled: true
      ## Enables a Luhn checksum check to reduce false positives.
      ## This option increases CPU usage.
      luhn: false
      ## A list of known safe keys to skip obfuscation.
      keep_values:
        - some_key_to_keep

credit_cards.enabled: Set to false to disable this obfuscator.
- Environment Variable: DD_APM_OBFUSCATION_CREDIT_CARDS_ENABLED
credit_cards.luhn: Set to true to enable a Luhn checksum check that validates numbers to eliminate false positives. This increases CPU usage and the performance cost of this check.
- Environment Variable: DD_APM_OBFUSCATION_CREDIT_CARDS_LUHN
credit_cards.keep_values: Set to a list of known safe keys to skip credit card obfuscation.
- Environment Variable: DD_APM_OBFUSCATION_CREDIT_CARDS_KEEP_VALUES
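
To show why the Luhn option reduces false positives, here is a small Python sketch of a pattern check followed by an optional Luhn checksum. The names CARD_SHAPE, luhn_valid, and obfuscate_credit_card are hypothetical, and the 12 to 19 digit bound is an assumption of this sketch rather than the Agent's actual matching rule.

```python
import re

# Assumed shape: 12 to 19 digits, optionally separated by whitespace.
CARD_SHAPE = re.compile(r"[0-9](?:\s*[0-9]){11,18}")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            # Double every second digit from the right; fold two-digit results.
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def obfuscate_credit_card(value: str, luhn: bool = False) -> str:
    """Replace the value with '?' when it looks like a card number."""
    if not CARD_SHAPE.fullmatch(value):
        return value  # extra text such as a "CC-" prefix means no match
    if luhn and not luhn_valid(re.sub(r"\s", "", value)):
        return value  # long number, but not a plausible card number
    return "?"


print(obfuscate_credit_card("4111 1111 1111 1111", luhn=True))
print(obfuscate_credit_card("CC-4111 1111 1111 1111"))
```

In this sketch a value such as 1234567890123456 is redacted by the pattern check alone but kept once the Luhn check is enabled, because its checksum does not validate.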

Replace tags

To scrub sensitive data from your span’s tags, use the replace_tags setting in your datadog.yaml configuration file or the DD_APM_REPLACE_TAGS environment variable. The value of the setting or environment variable is a list of one or more groups of parameters that specify how to replace sensitive data in your tags. These parameters are:

name: The key of the tag to replace. To match all tags, use *. To match the resource, use resource.name.
pattern: The regexp pattern to match against.
repl: The replacement string.

For example:

apm_config:
  replace_tags:
    # Replace all characters starting at the `token/` string in the tag "http.url" with "?"
    - name: "http.url"
      pattern: "token/(.*)"
      repl: "?"
    # Remove trailing "/" character in resource names
    - name: "resource.name"
      pattern: "(.*)\/$"
      repl: "$1"
    # Replace all the occurrences of "foo" in any tag with "bar"
    - name: "*"
      pattern: "foo"
      repl: "bar"
    # Remove all "error.stack" tag's value
    - name: "error.stack"
      pattern: "(?s).*"
    # Replace series of numbers in error messages
    - name: "error.message"
      pattern: "[0-9]{10}"
      repl: "[REDACTED]"
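
The rules in this example can be tried out offline with the Python sketch below, which applies each name, pattern, and repl group to a dictionary of tags. It is a simplified model, not the Agent's logic: the function name apply_replace_tags is hypothetical, the resource name is represented as an ordinary "resource.name" key, a missing repl is assumed to mean an empty replacement, and "$1" references are translated to Python's group syntax.

```python
import re


def apply_replace_tags(tags, rules):
    """Apply replace_tags-style rules to a dict of tag values."""
    result = dict(tags)
    for rule in rules:
        pattern = re.compile(rule["pattern"])
        # Translate "$1"-style group references to Python's "\g<1>" form.
        repl = re.sub(r"\$(\d+)", r"\\g<\1>", rule.get("repl", ""))
        for key in result:
            if rule["name"] in ("*", key):
                result[key] = pattern.sub(repl, result[key])
    return result


rules = [
    {"name": "http.url", "pattern": r"token/(.*)", "repl": "?"},
    {"name": "resource.name", "pattern": r"(.*)/$", "repl": "$1"},
    {"name": "*", "pattern": "foo", "repl": "bar"},
    {"name": "error.stack", "pattern": r"(?s).*"},
    {"name": "error.message", "pattern": r"[0-9]{10}", "repl": "[REDACTED]"},
]
tags = {
    "http.url": "https://example.com/token/abc123",
    "resource.name": "GET /users/",
    "error.stack": "line 1\nline 2",
    "error.message": "order 1234567890 failed for foo",
}
print(apply_replace_tags(tags, rules))
```

Testing patterns this way before deploying them helps catch rules that match more than intended, such as the wildcard rule rewriting "foo" inside an unrelated tag.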

DD_APM_REPLACE_TAGS=[
      {
        "name": "http.url",
        "pattern": "token/(.*)",
        "repl": "?"
      },
      {
        "name": "resource.name",
        "pattern": "(.*)\/$",
        "repl": "$1"
      },
      {
        "name": "*",
        "pattern": "foo",
        "repl": "bar"
      },
      {
        "name": "error.stack",
        "pattern": "(?s).*"
      },
      {
        "name": "error.message",
        "pattern": "[0-9]{10}",
        "repl": "[REDACTED]"
      }
]

Set the DD_APM_REPLACE_TAGS environment variable:

For Datadog Operator, in override.nodeAgent.env in your datadog-agent.yaml
For Helm, in agents.containers.traceAgent.env in your datadog-values.yaml
For manual configuration, in the trace-agent container section of your manifest

- name: DD_APM_REPLACE_TAGS
  value: '[
            {
              "name": "http.url",
              "pattern": "token/(.*)",
              "repl": "?"
            },
            {
              "name": "resource.name",
              "pattern": "(.*)\/$",
              "repl": "$1"
            },
            {
              "name": "*",
              "pattern": "foo",
              "repl": "bar"
            },
            {
              "name": "error.stack",
              "pattern": "(?s).*"
            },
            {
              "name": "error.message",
              "pattern": "[0-9]{10}",
              "repl": "[REDACTED]"
            }
          ]'

Examples

Datadog Operator:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    nodeAgent:
      env:
        - name: DD_APM_REPLACE_TAGS
          value: '[
                   {
                     "name": "http.url",
                  # (...)
                  ]'

Helm:

agents:
  containers:
    traceAgent:
      env:
        - name: DD_APM_REPLACE_TAGS
          value: '[
                   {
                     "name": "http.url",
                  # (...)
                  ]'

- DD_APM_REPLACE_TAGS=[{"name":"http.url","pattern":"token/(.*)","repl":"?"},{"name":"resource.name","pattern":"(.*)\/$","repl":"$1"},{"name":"*","pattern":"foo","repl":"bar"},{"name":"error.stack","pattern":"(?s).*"},{"name":"error.message","pattern":"[0-9]{10}","repl":"[REDACTED]"}]

Ignore resources

For an in-depth overview of the options to avoid tracing specific resources, see Ignoring Unwanted Resources.

If your services include simulated traffic such as health checks, you may want to exclude these traces from being collected so the metrics for your services match production traffic.

The Agent can be configured to exclude a specific resource from traces sent by the Agent to Datadog. To prevent the submission of specific resources, use the ignore_resources setting in the datadog.yaml file. Then create a list of one or more regular expressions, specifying which resources the Agent filters out based on their resource name.

If you are running in a containerized environment, set DD_APM_IGNORE_RESOURCES on the container with the Datadog Agent instead. See the Docker APM Agent environment variables for details.

## @param ignore_resources - list of strings - optional
## A list of regular expressions can be provided to exclude certain traces based on their resource name.
## All entries must be surrounded by double quotes and separated by commas.
ignore_resources: ["(GET|POST) /healthcheck","API::NotesController#index"]
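
Before deploying a list of expressions, it can help to check which resource names they would match. The Python sketch below does this for the two example patterns. The function name is_ignored is hypothetical, and the sketch assumes an unanchored search, so a pattern matches anywhere in the resource name unless it uses ^ or $.

```python
import re

IGNORE_RESOURCES = [
    r"(GET|POST) /healthcheck",
    r"API::NotesController#index",
]


def is_ignored(resource_name, patterns=IGNORE_RESOURCES):
    """Return True if any expression matches the resource name."""
    return any(re.search(pattern, resource_name) for pattern in patterns)


for name in ["GET /healthcheck", "GET /users", "API::NotesController#index"]:
    print(name, is_ignored(name))
```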

Library

HTTP

Datadog is standardizing span tag semantics across tracing libraries. Information from HTTP requests is added as span tags with the http. prefix. The libraries have the following configuration options to control sensitive data collected in HTTP spans.

Redact query strings

The http.url tag is assigned the full URL value, including the query string. The query string could contain sensitive data, so by default Datadog parses it and redacts suspicious-looking values. This redaction process is configurable. To modify the regular expression used for redaction, set the DD_TRACE_OBFUSCATION_QUERY_STRING_REGEXP environment variable to a valid regex of your choice. Valid regex is platform-specific. When the regex finds a suspicious key-value pair, it replaces it with <redacted>.

If you do not want to collect the query string, set the DD_HTTP_SERVER_TAG_QUERY_STRING environment variable to false. The default value is true.
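
The redaction step can be pictured with the following Python sketch. The regular expression here is a deliberately small, hypothetical example and not the default used by any tracing library. Only the <redacted> replacement text comes from the description above.

```python
import re

# Hypothetical pattern: a few sensitive-looking keys and their values.
SUSPICIOUS_PAIR = re.compile(r"(?i)(?:password|token|api_key|secret)=[^&]+")


def redact_query_string(query_string: str) -> str:
    """Replace each suspicious key-value pair with <redacted>."""
    return SUSPICIOUS_PAIR.sub("<redacted>", query_string)


print(redact_query_string("user=bob&token=abc123&page=2"))
```

Because valid regex syntax is platform-specific, a pattern tested in one language should be re-validated in the regex engine of the library that will use it.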

Collect headers

To collect trace header tags, set the DD_TRACE_HEADER_TAGS environment variable with a map of case-insensitive header keys to tag names. The library applies matching header values as tags on root spans. The setting also accepts entries without a specified tag name, for example:

DD_TRACE_HEADER_TAGS=CASE-insensitive-Header:my-tag-name,User-ID:userId,My-Header-And-Tag-Name
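
The format of this value can be read as a comma-separated list of header:tag entries. The Python sketch below parses it into a dictionary to make that structure explicit. The function name parse_header_tags is hypothetical, header keys are lowercased to reflect case-insensitive matching, and entries without a tag name map to None because this sketch does not assume how a library derives the default tag name.

```python
def parse_header_tags(value: str) -> dict:
    """Parse a DD_TRACE_HEADER_TAGS-style string into {header: tag_name}."""
    mapping = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        header, separator, tag_name = entry.partition(":")
        # Entries without ":" carry no explicit tag name.
        mapping[header.strip().lower()] = tag_name.strip() if separator else None
    return mapping


print(parse_header_tags(
    "CASE-insensitive-Header:my-tag-name,User-ID:userId,My-Header-And-Tag-Name"
))
```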

Processing

Some tracing libraries provide an interface for processing spans to manually modify or remove sensitive data collected in traces.

Telemetry collection

Datadog may gather environmental and diagnostic information about your tracing libraries for processing; this may include information about the host running an application, operating system, programming language and runtime, APM integrations used, and application dependencies. Additionally, Datadog may collect information such as diagnostic logs, crash dumps with obfuscated stack traces, and various system performance metrics.

You can disable this telemetry collection using either of these settings:

apm_config:
  telemetry:
    enabled: false

export DD_APM_TELEMETRY_ENABLED=false
