Best Practices for Creating Custom Rules
Overview
Sensitive Data Scanner uses scanning rules to identify, tag, and optionally redact sensitive data in your logs, APM events, and RUM events. Use out-of-the-box scanning rules or create custom rules using regular expression (regex) patterns. This guide covers best practices for creating custom rules with regex patterns.
Use precise regex patterns
Define regex patterns that are as precise as possible because generic patterns result in more false positives. To refine your regex pattern, add test data in the sample data tester when creating a custom rule. For more information, see step 2 in Add a custom scanning rule.
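As a rough illustration of why precision matters, the sketch below compares a generic pattern with a more precise one against a few sample strings. The EMP- employee ID format and the sample lines are made up for this example, and Python's re module is used only for illustration; the regex syntax supported by Sensitive Data Scanner may differ.

```python
import re

# Hypothetical sample log lines; the "EMP-" employee ID format is an
# assumption invented for this example.
samples = [
    "user EMP-204518 logged in",
    "request took 204518 microseconds",
    "order 7781234 shipped",
]

generic = re.compile(r"\d{6}")          # any run of six digits
precise = re.compile(r"\bEMP-\d{6}\b")  # required prefix plus exactly six digits

# Collect the sample lines that each pattern flags.
generic_hits = [s for s in samples if generic.search(s)]
precise_hits = [s for s in samples if precise.search(s)]

print(len(generic_hits), "lines flagged by the generic pattern")
print(len(precise_hits), "line flagged by the precise pattern")
```

The generic pattern flags all three lines, including a duration and an order number, while the precise pattern flags only the line that contains the employee ID.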
Refine regex pattern matching
Provide a list of keywords to the keyword dictionary to refine regex pattern matching. The dictionary checks for the matching pattern within a defined proximity of those keywords. For example, if you are scanning for passwords, you can add keywords like password, token, secret, and credential. You can also specify that these keywords be within a certain number of characters of a match. By default, keywords must be within 30 characters before a matched value. See step 2 in Add a custom scanning rule for more information.
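The keyword proximity check described above can be sketched roughly as follows. This is not how Sensitive Data Scanner is implemented; the function name matches_near_keywords, the case-insensitive comparison, the way distance is measured, and the 12-character secret format are all assumptions for illustration.

```python
import re

def matches_near_keywords(text, pattern, keywords, proximity=30):
    """Return regex matches preceded by a keyword within `proximity` characters."""
    hits = []
    for m in re.finditer(pattern, text):
        # Look only at the window of text immediately before the match.
        window = text[max(0, m.start() - proximity):m.start()].lower()
        if any(k.lower() in window for k in keywords):
            hits.append(m.group())
    return hits

keywords = ["password", "token", "secret", "credential"]
pattern = r"\b[A-Za-z0-9]{12}\b"  # hypothetical 12-character secret format

# A keyword appears shortly before the value, so the match is kept.
near = matches_near_keywords("login ok, password=Xy7pQ2mK9sLd", pattern, keywords)
# The same kind of value with no keyword nearby is ignored.
far = matches_near_keywords("build id Ab3dEf6hIj9k finished", pattern, keywords)
```

In this sketch, the loose pattern on its own would match both values; the keyword window is what filters out the build ID.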
To make matches more precise, you can also do one of the following:
- Scan the entire event but exclude certain attributes from getting scanned. For example, if you are scanning for personally identifiable information (PII) like names, you might want to exclude attributes such as resource_name and namespace.
- Scan for specific attributes to narrow the scope of the data that is scanned. For example, if you are scanning for names, you can choose specific attributes such as first_name and last_name.
See step 3 in Add a custom scanning rule for more information.
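The two scoping options can be compared with a minimal sketch. It assumes a flat event with string attributes (real events may be nested), and the scan_event helper, the sample event, and the deliberately loose name pattern are hypothetical.

```python
import re

def scan_event(event, pattern, include=None, exclude=()):
    """Return the names of attributes whose string values match `pattern`.

    include: if given, scan only these attributes.
    exclude: attributes to skip when scanning the whole event.
    """
    names = include if include is not None else event.keys()
    return [
        name for name in names
        if name in event
        and name not in exclude
        and isinstance(event[name], str)
        and re.search(pattern, event[name])
    ]

# Hypothetical flat event used only for this illustration.
event = {
    "first_name": "Alice",
    "last_name": "Nguyen",
    "resource_name": "Checkout",
    "namespace": "Payments",
    "status": "ok",
}
name_pattern = r"^[A-Z][a-z]+$"  # deliberately loose "capitalized word" pattern

everything = scan_event(event, name_pattern)
excluded = scan_event(event, name_pattern, exclude=("resource_name", "namespace"))
included = scan_event(event, name_pattern, include=("first_name", "last_name"))
```

Scanning everything flags resource_name and namespace as if they were names. Either excluding those two attributes or including only first_name and last_name leaves just the intended matches.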
Use out-of-the-box rules
Whenever possible, use Datadog’s out-of-the-box library rules. These rules are predefined rules that detect common patterns such as email addresses, credit card numbers, API keys, authorization tokens, network and device information, and more. Each rule has recommended keywords for the keyword dictionary to refine matching accuracy. You can also add your own keywords.
If there is a rule that you want to use and that you think other users would also benefit from, contact support.