Overview
Datadog tracing libraries collect data from an instrumented application. That data is sent to Datadog as traces and it may contain sensitive data such as personally identifiable information (PII). If you are ingesting sensitive data as traces into Datadog, remediations can be added at ingestion with Sensitive Data Scanner. You can also configure the Datadog Agent or the tracing library to remediate sensitive data at collection before traces are sent to Datadog.
If the configurations described here do not cover your compliance requirements, reach out to the Datadog support team.
Datadog’s APM tracing libraries collect relevant observability data about your applications. Because these libraries collect hundreds of unique attributes in trace data, this page describes categories of data, with a focus on attributes that may contain personal information about your employees and end-users.
The table below describes the personal data categories collected by the automatic instrumentation provided by the tracing libraries, with some common examples listed.
Category | Description |
---|---|
Name | The full name of an internal user (your employee) or end-user. |
Email | The email address of an internal user (your employee) or end-user. |
Client IP | The IP address of your end-user associated with an incoming request or the external IP address of an outgoing request. |
Database statements | The literal, sequence of literals, or bind variables used in an executed database statement. |
Geographic location | Longitude and latitude coordinates that can be used to identify an individual or household. |
URI parameters | The parameter values in the variable part of the URI path or the URI query. |
URI userinfo | The userinfo subcomponent of the URI that may contain the user name. |
Login ID | Can include an account/user ID, name, or email address. |
The table below describes the default behavior of each language tracing library with regard to whether a data category is collected and whether it is obfuscated by default.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: Database statements are not collected by default and must be enabled.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: URI parameters are not collected by default and must be enabled.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: Name and email are not collected by default and must be enabled.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: Client IP, geographic location, and URI parameters are not collected by default and must be enabled.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: Client IPs are not collected by default and must be enabled.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
URI parameters | | |
URI userinfo | | |
Login ID | | |
Note: Client IPs are not collected by default and must be enabled. Database statements are obfuscated by the Datadog Agent.
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
Client URI path | | |
Client URI query string | | |
Server URI path | | |
Server URI query string | | |
HTTP body | | |
HTTP cookies | | |
HTTP headers | | |
Login ID | | |
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
Client URI path | | |
Client URI query string | | |
Server URI path | | |
Server URI query string | | |
HTTP body | | |
HTTP cookies | | |
HTTP headers | | |
Login ID | | |
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
Client URI path | | |
Client URI query string | | |
Server URI path | | |
Server URI query string | | |
HTTP body | | |
HTTP cookies | | |
HTTP headers | | |
Login ID | | |
Category | Collected | Obfuscated |
---|
Name | | |
Email | | |
Client IP | | |
Database statements | | |
Geographic location | | |
Client URI path | | |
Client URI query string | | |
Server URI path | | |
Server URI query string | | |
HTTP body | | |
HTTP cookies | | |
HTTP headers | | |
Login ID | | |
If you use Datadog Application Security Management (ASM), the tracing libraries collect HTTP request data to help you understand the nature of a security trace. Datadog ASM automatically redacts certain data, and you can configure your own detection rules. Learn more about these defaults and configuration options in the Datadog ASM data privacy documentation.
Agent
Resource names
Datadog spans include a resource name attribute that may contain sensitive data. The Datadog Agent implements obfuscation of resource names for several known cases:
- SQL numeric literals and bind variables are obfuscated: For example, the following query
  SELECT data FROM table WHERE key=123 LIMIT 10
  is obfuscated to
  SELECT data FROM table WHERE key = ? LIMIT ?
  before setting the resource name for the query span.
- SQL literal strings are identified using standard ANSI SQL quotes: This means strings should be surrounded in single quotes ('). Some SQL variants optionally support double-quotes (") for strings, but most treat double-quoted things as identifiers. The Datadog obfuscator treats these as identifiers rather than strings and does not obfuscate them.
- Redis queries are quantized by selecting only command tokens: For example, the following query
  MULTI\nSET k1 v1\nSET k2 v2
  is quantized to MULTI SET SET.
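The rules above can be illustrated with a rough Python sketch. It is not the Agent's obfuscator: the function names and the regular expression are assumptions made for illustration, and unlike the Agent output shown above, the sketch does not normalize spacing around operators.

```python
import re

# Hypothetical pattern: single-quoted ANSI SQL strings or standalone numeric literals.
# Double-quoted text is deliberately left alone, since it is treated as an identifier.
_SQL_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def obfuscate_sql_literals(query: str) -> str:
    """Replace single-quoted strings and numeric literals with '?'."""
    return _SQL_LITERAL.sub("?", query)


def quantize_redis(query: str) -> str:
    """Keep only the command token (first word) of each line of a Redis query."""
    commands = [line.split()[0] for line in query.splitlines() if line.strip()]
    return " ".join(commands)


if __name__ == "__main__":
    print(obfuscate_sql_literals("SELECT data FROM table WHERE key=123 LIMIT 10"))
    print(obfuscate_sql_literals("SELECT * FROM t WHERE \"name\" = 'bob'"))
    print(quantize_redis("MULTI\nSET k1 v1\nSET k2 v2"))
```

In the second query, the double-quoted "name" is left untouched while the single-quoted 'bob' is replaced, which mirrors the identifier-versus-string distinction described above.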
Trace obfuscation
The Datadog Agent also obfuscates sensitive trace data that is not within the resource name. You can configure the obfuscation rules using environment variables or the datadog.yaml configuration file.
The following metadata can be obfuscated:
- MongoDB queries
- ElasticSearch request bodies
- Redis commands
- MemCached commands
- HTTP URLs
- Stack traces
Note: Obfuscation can have a performance impact on your system, or could redact important information that is not sensitive. Consider what obfuscation you need for your setup, and customize your configuration appropriately.
Note: You can use automatic scrubbing for multiple types of services at the same time. Configure each in the obfuscation section of your datadog.yaml file.
MongoDB queries within a span of type mongodb are obfuscated by default.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    mongodb:
      ## Configures obfuscation rules for spans of type "mongodb". Enabled by default.
      enabled: true
      keep_values:
        - document_id
        - template_id
      obfuscate_sql_values:
        - val1
This can also be disabled with the environment variable DD_APM_OBFUSCATION_MONGODB_ENABLED=false.
- keep_values or environment variable DD_APM_OBFUSCATION_MONGODB_KEEP_VALUES - defines a set of keys to exclude from Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
- obfuscate_sql_values or environment variable DD_APM_OBFUSCATION_MONGODB_OBFUSCATE_SQL_VALUES - defines a set of keys to include in Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
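To make the effect of keep_values concrete, here is a minimal Python sketch of key-based scrubbing. It is only an illustration: the function name is hypothetical, and the Agent's actual handling of nested documents and its placeholder format may differ.

```python
from typing import Any, Iterable


def obfuscate_document(doc: Any, keep_values: Iterable[str] = ()) -> Any:
    """Replace every leaf value with '?' unless it sits under a kept key."""
    keep = set(keep_values)

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            # Values under a kept key are preserved as-is; others are scrubbed.
            return {k: (v if k in keep else walk(v)) for k, v in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return "?"

    return walk(doc)


if __name__ == "__main__":
    query = {
        "document_id": 42,
        "email": "a@example.com",
        "filter": {"template_id": "t1", "age": 30},
    }
    print(obfuscate_document(query, keep_values=["document_id", "template_id"]))
```

With an empty keep_values, every value in the sketch is replaced, which corresponds to the "If not set, all keys are obfuscated" behavior described above.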
ElasticSearch request bodies within a span of type elasticsearch are obfuscated by default.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    elasticsearch:
      ## Configures obfuscation rules for spans of type "elasticsearch". Enabled by default.
      enabled: true
      keep_values:
        - client_id
        - product_id
      obfuscate_sql_values:
        - val1
This can also be disabled with the environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_ENABLED=false.
- keep_values or environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_KEEP_VALUES - defines a set of keys to exclude from Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
- obfuscate_sql_values or environment variable DD_APM_OBFUSCATION_ELASTICSEARCH_OBFUSCATE_SQL_VALUES - defines a set of keys to include in Datadog Agent trace obfuscation. If not set, all keys are obfuscated.
Redis commands within a span of type redis are obfuscated by default.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    ## Configures obfuscation rules for spans of type "redis". Enabled by default.
    redis:
      enabled: true
      remove_all_args: true
This can also be disabled with the environment variable DD_APM_OBFUSCATION_REDIS_ENABLED=false.
- remove_all_args or environment variable DD_APM_OBFUSCATION_REDIS_REMOVE_ALL_ARGS - replaces all arguments of a redis command with a single “?” if true. Disabled by default.
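As a small illustration of what remove_all_args describes, the following Python sketch collapses the arguments of a single Redis command into one "?". The function name is hypothetical, and leaving a command without arguments unchanged is an assumption of this sketch, not documented Agent behavior.

```python
def remove_all_args(command: str) -> str:
    """Keep the command name and collapse all of its arguments into one '?'."""
    parts = command.split(maxsplit=1)
    if len(parts) < 2:
        # Empty input or a command without arguments: nothing to collapse.
        return command.strip()
    return f"{parts[0]} ?"


if __name__ == "__main__":
    print(remove_all_args("SET k1 v1"))
    print(remove_all_args("PING"))
```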
MemCached commands within a span of type memcached are obfuscated by default.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    memcached:
      ## Configures obfuscation rules for spans of type "memcached". Enabled by default.
      enabled: true
This can also be disabled with the environment variable DD_APM_OBFUSCATION_MEMCACHED_ENABLED=false.
HTTP URLs within a span of type http or web are not obfuscated by default.
Note: Passwords within the Userinfo of a URL are not collected by Datadog.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    http:
      ## Enables obfuscation of query strings in URLs. Disabled by default.
      remove_query_string: true
      remove_paths_with_digits: true
- remove_query_string or environment variable DD_APM_OBFUSCATION_HTTP_REMOVE_QUERY_STRING: If true, obfuscates query strings in URLs (http.url).
- remove_paths_with_digits or environment variable DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS: If true, path segments in URLs (http.url) containing only digits are replaced by “?”.
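The two HTTP options can be pictured with the following Python sketch. It is an approximation under stated assumptions: the function name is hypothetical, and the sketch simply drops the query string, whereas the exact form of the Agent's obfuscated output is not shown on this page.

```python
from urllib.parse import urlsplit, urlunsplit


def obfuscate_url(
    url: str,
    remove_query_string: bool = False,
    remove_paths_with_digits: bool = False,
) -> str:
    """Scrub the query string and/or digit-only path segments of a URL."""
    parts = urlsplit(url)
    path, query = parts.path, parts.query
    if remove_paths_with_digits:
        # Replace path segments made up only of ASCII digits with "?".
        path = "/".join(
            "?" if seg.isascii() and seg.isdigit() else seg
            for seg in path.split("/")
        )
    if remove_query_string:
        query = ""  # assumption: the sketch drops the query string entirely
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    print(obfuscate_url(
        "https://example.com/users/123/orders?token=abc",
        remove_query_string=True,
        remove_paths_with_digits=True,
    ))
```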
Stack trace removal is disabled by default. Set the remove_stack_traces parameter to true to remove stack traces and replace them with ?.
apm_config:
  enabled: true
  ## (...)
  obfuscation:
    ## Enables removing stack traces to replace them with "?". Disabled by default.
    remove_stack_traces: true # default false
This can also be enabled with the environment variable DD_APM_OBFUSCATION_REMOVE_STACK_TRACES=true.
To scrub sensitive data from your span’s tags, use the replace_tags setting in your datadog.yaml configuration file or the DD_APM_REPLACE_TAGS environment variable. The value of the setting or environment variable is a list of one or more groups of parameters that specify how to replace sensitive data in your tags. These parameters are:
- name: The key of the tag to replace. To match all tags, use *. To match the resource, use resource.name.
- pattern: The regexp pattern to match against.
- repl: The replacement string.
For example:
apm_config:
  replace_tags:
    # Replace all characters starting at the `token/` string in the tag "http.url" with "?"
    - name: "http.url"
      pattern: "token/(.*)"
      repl: "?"
    # Remove trailing "/" character in resource names
    - name: "resource.name"
      pattern: "(.*)\/$"
      repl: "$1"
    # Replace all the occurrences of "foo" in any tag with "bar"
    - name: "*"
      pattern: "foo"
      repl: "bar"
    # Remove all "error.stack" tag's value
    - name: "error.stack"
      pattern: "(?s).*"
    # Replace series of numbers in error messages
    - name: "error.message"
      pattern: "[0-9]{10}"
      repl: "[REDACTED]"
DD_APM_REPLACE_TAGS=[
  {
    "name": "http.url",
    "pattern": "token/(.*)",
    "repl": "?"
  },
  {
    "name": "resource.name",
    "pattern": "(.*)\/$",
    "repl": "$1"
  },
  {
    "name": "*",
    "pattern": "foo",
    "repl": "bar"
  },
  {
    "name": "error.stack",
    "pattern": "(?s).*"
  },
  {
    "name": "error.message",
    "pattern": "[0-9]{10}",
    "repl": "[REDACTED]"
  }
]
Set the DD_APM_REPLACE_TAGS environment variable:
- For Datadog Operator, in override.nodeAgent.env in your datadog-agent.yaml
- For Helm, in agents.containers.traceAgent.env in your datadog-values.yaml
- For manual configuration, in the trace-agent container section of your manifest
- name: DD_APM_REPLACE_TAGS
  value: '[
            {
              "name": "http.url",
              "pattern": "token/(.*)",
              "repl": "?"
            },
            {
              "name": "resource.name",
              "pattern": "(.*)\/$",
              "repl": "$1"
            },
            {
              "name": "*",
              "pattern": "foo",
              "repl": "bar"
            },
            {
              "name": "error.stack",
              "pattern": "(?s).*"
            },
            {
              "name": "error.message",
              "pattern": "[0-9]{10}",
              "repl": "[REDACTED]"
            }
          ]'
Examples
Datadog Operator:
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    nodeAgent:
      env:
        - name: DD_APM_REPLACE_TAGS
          value: '[
                    {
                      "name": "http.url",
                      # (...)
                  ]'
Helm:
agents:
  containers:
    traceAgent:
      env:
        - name: DD_APM_REPLACE_TAGS
          value: '[
                    {
                      "name": "http.url",
                      # (...)
                  ]'
- DD_APM_REPLACE_TAGS=[{"name":"http.url","pattern":"token/(.*)","repl":"?"},{"name":"resource.name","pattern":"(.*)\/$","repl":"$1"},{"name":"*","pattern":"foo","repl":"bar"},{"name":"error.stack","pattern":"(?s).*"},{"name":"error.message","pattern":"[0-9]{10}","repl":"[REDACTED]"}]
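To see how the name, pattern, and repl parameters of replace_tags interact, the following Python sketch applies the example rules to a dictionary of tags. It is an illustration only, not the Agent's implementation: the function names are hypothetical, the resource is modeled as a resource.name entry in the same dictionary, and Python's re module stands in for the Agent's regexp engine, so the "$1" group reference is translated to Python's syntax.

```python
import re
from typing import Dict, List


def _to_python_repl(repl: str) -> str:
    """Translate "$1"-style group references into Python's "\\g<1>" form."""
    escaped = repl.replace("\\", "\\\\")
    return re.sub(r"\$(\d+)", r"\\g<\1>", escaped)


def apply_replace_tags(tags: Dict[str, str], rules: List[dict]) -> Dict[str, str]:
    """Apply each rule, in order, to the tags whose key matches rule['name']."""
    result = dict(tags)
    for rule in rules:
        pattern = re.compile(rule["pattern"])
        # A rule without "repl" removes the matched text.
        repl = _to_python_repl(rule.get("repl", ""))
        for key in result:
            if rule["name"] in ("*", key):
                result[key] = pattern.sub(repl, result[key])
    return result


RULES = [
    {"name": "http.url", "pattern": r"token/(.*)", "repl": "?"},
    {"name": "resource.name", "pattern": r"(.*)/$", "repl": "$1"},
    {"name": "*", "pattern": r"foo", "repl": "bar"},
    {"name": "error.stack", "pattern": r"(?s).*"},
    {"name": "error.message", "pattern": r"[0-9]{10}", "repl": "[REDACTED]"},
]

if __name__ == "__main__":
    tags = {
        "http.url": "https://example.com/token/abc123",
        "resource.name": "GET /users/",
        "error.stack": "line1\nline2",
        "error.message": "user 1234567890 failed",
    }
    for key, value in apply_replace_tags(tags, RULES).items():
        print(f"{key}: {value!r}")
```

Testing a pattern this way before deploying it can help catch rules that match more, or less, than intended, though regexp syntax can differ between engines.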
Ignore resources
For an in-depth overview of the options to avoid tracing specific resources, see Ignoring Unwanted Resources.
If your services include simulated traffic such as health checks, you may want to exclude these traces from being collected so the metrics for your services match production traffic.
The Agent can be configured to exclude a specific resource from traces sent by the Agent to Datadog. To prevent the submission of specific resources, use the ignore_resources setting in the datadog.yaml file. Then create a list of one or more regular expressions, specifying which resources the Agent filters out based on their resource name.
If you are running in a containerized environment, set DD_APM_IGNORE_RESOURCES on the container with the Datadog Agent instead. See the Docker APM Agent environment variables for details.
## @param ignore_resources - list of strings - optional
## A list of regular expressions can be provided to exclude certain traces based on their resource name.
## All entries must be surrounded by double quotes and separated by commas.
## ignore_resources: ["(GET|POST) /healthcheck","API::NotesController#index"]
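The filtering that ignore_resources describes can be sketched in Python as follows. This is a simplified illustration, not the Agent's code: the function name is hypothetical, and the sketch assumes an unanchored match (a pattern may match anywhere in the resource name), which should be verified against the Agent's behavior.

```python
import re
from typing import Iterable, List

# The example patterns from the configuration above.
IGNORE_RESOURCES = ["(GET|POST) /healthcheck", "API::NotesController#index"]


def filter_resources(resource_names: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the resource names that match none of the ignore patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [
        name
        for name in resource_names
        if not any(rx.search(name) for rx in compiled)
    ]


if __name__ == "__main__":
    seen = ["GET /healthcheck", "GET /users", "POST /healthcheck", "API::NotesController#index"]
    print(filter_resources(seen, IGNORE_RESOURCES))
```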
Library
HTTP
Datadog is standardizing span tag semantics across tracing libraries. Information from HTTP requests is added as span tags whose names start with the http. prefix. The libraries have the following configuration options to control sensitive data collected in HTTP spans.
Redact query strings
The http.url tag is assigned the full URL value, including the query string. The query string could contain sensitive data, so by default Datadog parses it and redacts suspicious-looking values. This redaction process is configurable. To modify the regular expression used for redaction, set the DD_TRACE_OBFUSCATION_QUERY_STRING_REGEXP environment variable to a valid regex of your choice. Valid regex is platform-specific. When the regex finds a suspicious key-value pair, it replaces it with <redacted>.
If you do not want to collect the query string, set the DD_HTTP_SERVER_TAG_QUERY_STRING environment variable to false. The default value is true.
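The redaction step can be pictured with this Python sketch. The pattern below is a hypothetical, deliberately small example and not the libraries' default regex, and the function name is an assumption; the sketch only shows the idea of replacing each matching key-value pair with <redacted>.

```python
import re

# Hypothetical pattern for illustration; the default regex is more extensive.
SUSPICIOUS = re.compile(r"(?i)\b(?:password|passwd|token|secret|api_key)=[^&]*")


def redact_query_string(query: str, pattern: "re.Pattern[str]" = SUSPICIOUS) -> str:
    """Replace every key-value pair matched by the pattern with <redacted>."""
    return pattern.sub("<redacted>", query)


if __name__ == "__main__":
    print(redact_query_string("user=bob&token=abc123&page=2"))
```

Because valid regex syntax is platform-specific, a pattern that works in this Python sketch may need adjusting for the regex engine of the tracing library in use.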
To collect trace header tags, set the DD_TRACE_HEADER_TAGS environment variable with a map of case-insensitive header keys to tag names. The library applies matching header values as tags on root spans. The setting also accepts entries without a specified tag name, for example:
DD_TRACE_HEADER_TAGS=CASE-insensitive-Header:my-tag-name,User-ID:userId,My-Header-And-Tag-Name
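The format of the value above can be read as a comma-separated list of header:tag entries. The following Python sketch parses it into a mapping; the function name is hypothetical, and representing an entry without a tag name as None (meaning "let the library choose the tag name") is an assumption of this sketch rather than documented library behavior.

```python
from typing import Dict, Optional


def parse_header_tags(value: str) -> Dict[str, Optional[str]]:
    """Parse a DD_TRACE_HEADER_TAGS-style string into {header: tag or None}."""
    mapping: Dict[str, Optional[str]] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        header, sep, tag = entry.partition(":")
        # Header keys are case-insensitive, so normalize them to lowercase.
        key = header.strip().lower()
        mapping[key] = tag.strip() if sep and tag.strip() else None
    return mapping


if __name__ == "__main__":
    raw = "CASE-insensitive-Header:my-tag-name,User-ID:userId,My-Header-And-Tag-Name"
    print(parse_header_tags(raw))
```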
Processing
Some tracing libraries provide an interface for processing spans to manually modify or remove sensitive data collected in traces.
Telemetry collection
Datadog may gather environmental and diagnostic information about your tracing libraries for processing; this may include information about the host running an application, operating system, programming language and runtime, APM integrations used, and application dependencies. Additionally, Datadog may collect information such as diagnostic logs, crash dumps with obfuscated stack traces, and various system performance metrics.
You can disable this telemetry collection using either of these settings:
apm_config:
  telemetry:
    enabled: false
export DD_APM_TELEMETRY_ENABLED=false
PCI DSS compliance for APM
PCI compliance for APM is only available for Datadog organizations in the US1 site.
To set up PCI-compliant Application Performance Monitoring, you must meet the following requirements:
- Audit Trail must be enabled and remain enabled for PCI DSS compliance. If you haven’t already enabled Audit Trail, it is automatically enabled once the org is configured as PCI-compliant (after following the steps below).
- Your Datadog organization is in the US1 site.
- All spans are sent to the PCI endpoints using HTTPS only. If you are using the Agent to send spans, you should enforce HTTPS transport.
- All your spans endpoints need to be changed to the PCI endpoints for spans.
- You may request access to the PCI Attestation of Compliance and Customer Responsibility Matrix on Datadog’s Trust Center - note that these documents are only applicable once you have finished all the onboarding steps and have been manually configured to be compliant by Datadog support.
To begin onboarding:
- Contact Datadog support or your Customer Success Manager to request to begin the PCI onboarding process while ensuring the necessary PCI requirements are met.
- After Datadog support or Customer Success confirms that the org is PCI DSS compliant, configure the respective configuration file to send spans to the dedicated PCI compliant endpoint:
  https://trace-pci.agent.datadoghq.com for Agent and non-Agent traffic
- For example, add the following lines to the Agent configuration file:
  apm_config:
    apm_dd_url: https://trace-pci.agent.datadoghq.com
- All spans that are sent to the PCI compliant endpoint(s) automatically have a set of Sensitive Data Scanner PCI rules applied to scrub any cardholder data. These dedicated PCI rules must be enabled for PCI DSS compliance and are included at no additional charge.
To finish onboarding and be moved to compliant:
- Inform Datadog support or your Customer Success Manager that you have moved over all your endpoints to the PCI compliant endpoint(s).
- Once confirmed by Datadog, your span configuration and Application Performance Monitoring are considered PCI-compliant.
If you have any questions about how your now PCI-compliant Application Performance Monitoring satisfies the applicable requirements under PCI DSS, contact your account manager.
See PCI DSS Compliance for more information. To enable PCI compliance for logs, see PCI DSS compliance for Log Management.
PCI compliance for APM is not available for Datadog sites other than US1.