Overview
The Datadog Agent allows you to securely manage secrets by integrating with any external secrets management solution (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, a custom solution…). Instead of hardcoding sensitive values like API keys or passwords in plaintext within configuration files, the Agent can retrieve them dynamically at runtime.
How it works
To reference a secret in your configuration, use the ENC[<secret_id>] notation. This tells the Agent to resolve the value using your configured secret retrieval executable. The secret is fetched and loaded into memory but is never written to disk or sent to the Datadog backend.
For example, the following configuration shows two secrets defined with ENC[]:
instances:
  - server: db_prod
    user: "ENC[db_prod_user]"
    password: "ENC[db_prod_password]"
The secret handle must make up the full value of the YAML field and is always resolved as a string. This means configurations like password: "db-ENC[prod_password]" are not recognized as secrets.
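The full-value rule can be illustrated with a minimal Go sketch. The helper name secretHandle is hypothetical and this is not the Agent's own code; it only assumes, as stated above, that a value counts as a secret when the ENC[...] notation spans the whole string.

```go
package main

import (
	"fmt"
	"strings"
)

// secretHandle reports whether value is, in its entirety, an ENC[...] handle
// and returns the text between the brackets. Hypothetical helper for
// illustration only; not the Agent's implementation.
func secretHandle(value string) (string, bool) {
	const prefix, suffix = "ENC[", "]"
	if len(value) < len(prefix)+len(suffix) ||
		!strings.HasPrefix(value, prefix) || !strings.HasSuffix(value, suffix) {
		return "", false
	}
	return value[len(prefix) : len(value)-len(suffix)], true
}

func main() {
	for _, v := range []string{
		"ENC[db_prod_password]", // whole value is a handle
		"db-ENC[prod_password]", // handle embedded in a longer string
	} {
		handle, ok := secretHandle(v)
		fmt.Printf("%q -> handle=%q secret=%t\n", v, handle, ok)
	}
}
```

The second value is left untouched because the handle is only part of the field.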
You can use any characters inside the ENC[] brackets as long as the YAML is valid. If your secret ID includes special characters or is a JSON string, make sure to properly escape it. For example:
"ENC[{\"env\": \"prod\", \"check\": \"postgres\", \"id\": \"user_password\"}]"
It’s also possible to use Autodiscovery variables in secret handles. The Agent resolves these variables before resolving the secret. For example:
instances:
  - server: "%%host%%"
    user: ENC[db_prod_user_%%host%%]
    password: ENC[db_prod_password_%%host%%]
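The ordering described above (variables first, then secrets) can be sketched in Go. The helper expandTemplateVars and the example host value are assumptions made for illustration; the Agent's real Autodiscovery resolution is more involved.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTemplateVars replaces each %%name%% placeholder in value with the
// matching entry of vars. Hypothetical helper; placeholders without an entry
// in vars are left as-is.
func expandTemplateVars(value string, vars map[string]string) string {
	for name, replacement := range vars {
		value = strings.ReplaceAll(value, "%%"+name+"%%", replacement)
	}
	return value
}

func main() {
	vars := map[string]string{"host": "10.0.0.5"} // example value, not from the source
	for _, field := range []string{
		"ENC[db_prod_user_%%host%%]",
		"ENC[db_prod_password_%%host%%]",
	} {
		// Step 1: expand variables. Step 2 (not shown) would pass the
		// expanded handle to the secret retrieval executable.
		fmt.Println(expandTemplateVars(field, vars))
	}
}
```

The executable therefore receives one handle per discovered host, so the secret store needs an entry for each expanded name.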
However, you cannot use the ENC[] syntax in secret_* settings such as secret_backend_command.
Agent security requirements
The Agent runs the provided executable as a sub-process. The execution patterns differ on Linux and Windows.
On Linux, the executable must:
- Belong to the same user running the Agent (dd-agent by default, or root inside a container).
- Have no rights for group or other.
- Have at least exec rights for the owner.
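The two mode-related Linux rules can be expressed as a small Go check. This is a sketch under the assumption that the rules map directly onto permission bits; checkMode is a hypothetical name, ownership is not verified here, and this is not the Agent's own validation code.

```go
package main

import (
	"fmt"
	"os"
)

// checkMode applies the mode rules listed above to a file mode: no bits set
// for group or other, and the execute bit set for the owner.
// Hypothetical helper; file ownership is deliberately not checked.
func checkMode(mode os.FileMode) error {
	perm := mode.Perm()
	if perm&0o077 != 0 {
		return fmt.Errorf("mode %04o grants rights to group or other", perm)
	}
	if perm&0o100 == 0 {
		return fmt.Errorf("mode %04o lacks exec rights for the owner", perm)
	}
	return nil
}

func main() {
	for _, m := range []os.FileMode{0o700, 0o500, 0o750, 0o600} {
		fmt.Printf("%04o: %v\n", m, checkMode(m))
	}
}
```

Modes such as 700 and 500 pass, while 750 (group rights) and 600 (no exec for the owner) are rejected.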
On Windows, the executable must:
- Have read/exec rights for ddagentuser (the user used to run the Agent).
- Have no rights for any user or group except the Administrators group, the built-in Local System account, or the Agent user context (ddagentuser by default).
- Be a valid Win32 application so the Agent can execute it (for example, a PowerShell or Python script does not work).
Note: The executable shares the same environment variables as the Agent.
Providing a secret retrieval executable
To retrieve secrets, the Agent uses an external executable that you provide. The executable is run when new secrets are discovered, and the resolved values are cached for the lifecycle of the Agent. If you need to update or rotate a secret, you must restart the Agent to reload it.
The Agent sends this executable a JSON payload over standard input containing a list of secret handles. The executable fetches each secret and returns them in a JSON format through standard output.
Here’s what the Agent sends to your executable on STDIN:
{
"version": "1.0",
"secrets": ["secret1", "secret2"]
}
- version (string): the format version.
- secrets (list of strings): each string is a handle for a secret to fetch.
The executable should respond through STDOUT:
{
"secret1": {"value": "decrypted_value", "error": null},
"secret2": {"value": null, "error": "could not fetch the secret"}
}
- value (string): the secret value to be used in the configurations. This can be null in the case of an error.
- error (string): an error message or null.
If a secret fails to resolve (either by returning a non-zero exit code or a non-null error), the related configuration is ignored by the Agent.
Never output sensitive information on stderr. If the binary exits with a status code other than 0, the Agent logs the standard error output of the executable to ease troubleshooting.
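The request and response shapes described above can be modelled in Go as follows. This is a minimal sketch: the type names, the resolve function, and the placeholder demoFetch lookup are all hypothetical. It shows one way to report a per-secret failure in the error field instead of failing the whole batch.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// request mirrors the JSON document the Agent writes to STDIN.
type request struct {
	Version string   `json:"version"`
	Secrets []string `json:"secrets"`
}

// result mirrors one entry of the JSON object expected on STDOUT.
// Pointers are used so that an absent value or error is encoded as null.
type result struct {
	Value *string `json:"value"`
	Error *string `json:"error"`
}

// resolve decodes a request and calls fetch once per handle. A failed lookup
// is reported in that handle's "error" field instead of aborting the batch.
func resolve(input []byte, fetch func(handle string) (string, error)) ([]byte, error) {
	var req request
	if err := json.Unmarshal(input, &req); err != nil {
		return nil, fmt.Errorf("invalid request: %w", err)
	}
	out := make(map[string]result, len(req.Secrets))
	for _, handle := range req.Secrets {
		value, err := fetch(handle)
		if err != nil {
			msg := err.Error()
			out[handle] = result{Error: &msg}
			continue
		}
		out[handle] = result{Value: &value}
	}
	return json.Marshal(out)
}

// demoFetch is a placeholder lookup that knows a single hard-coded handle.
func demoFetch(handle string) (string, error) {
	if handle == "secret1" {
		return "decrypted_value", nil
	}
	return "", errors.New("could not fetch the secret")
}

func main() {
	// A real executable would read this document from os.Stdin.
	input := []byte(`{"version": "1.0", "secrets": ["secret1", "secret2"]}`)
	output, err := resolve(input, demoFetch)
	if err != nil {
		fmt.Fprintln(os.Stderr, err) // never write secret values to stderr
		os.Exit(1)
	}
	fmt.Println(string(output))
}
```

Keeping the lookup behind a function parameter makes it straightforward to swap the placeholder for a call to a real secrets store.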
Options for retrieving secrets
Option 1: Using the built-in script for Kubernetes and Docker
For containerized environments, the Datadog Agent’s container images include a built-in script, /readsecret_multiple_providers.sh, starting with version v7.32.0. This script supports reading secrets from:
- Files: using ENC[file@/path/to/file]
- Kubernetes Secrets: using ENC[k8s_secret@namespace/secret-name/key]
To use this executable with the Helm chart, set it as the following:
datadog:
  [...]
  secretBackend:
    command: "/readsecret_multiple_providers.sh"
To use this executable, set the environment variable DD_SECRET_BACKEND_COMMAND as follows:
DD_SECRET_BACKEND_COMMAND=/readsecret_multiple_providers.sh
Example: Reading from mounted files
Kubernetes supports exposing Secrets as files inside a pod that the Agent can read to resolve secrets.
In Kubernetes, you can mount a Secret as a volume like this:
  containers:
    - name: agent
      #(...)
      volumeMounts:
        - name: secret-volume
          mountPath: /etc/secret-volume
  #(...)
  volumes:
    - name: secret-volume
      secret:
        secretName: test-secret
You can then reference the secret like this:
password: ENC[file@/etc/secret-volume/password]
Notes:
- The Secret must exist in the same namespace as the pod it is being mounted in.
- The script is able to access all subfolders, including the sensitive /var/run/secrets/kubernetes.io/serviceaccount/token. As such, Datadog recommends using a dedicated folder instead of /var/run/secrets.
Docker swarm secrets are mounted in the /run/secrets folder. For example, the Docker secret db_prod_password is located in /run/secrets/db_prod_password in the Agent container. This would be referenced in the configuration with ENC[file@/run/secrets/db_prod_password].
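To make the file@ handle format concrete, here is a hedged Go sketch of how such a handle could be resolved. It is not the built-in script; the function name readFileSecret is hypothetical, and trimming a single trailing newline is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readFileSecret resolves a "file@/path/to/file" handle by returning the
// content of the file. Removing one trailing newline is an assumption of
// this sketch, not documented behavior of the built-in script.
func readFileSecret(handle string) (string, error) {
	const prefix = "file@"
	if !strings.HasPrefix(handle, prefix) {
		return "", errors.New("handle does not use the file@ provider")
	}
	data, err := os.ReadFile(strings.TrimPrefix(handle, prefix))
	if err != nil {
		return "", err
	}
	return strings.TrimSuffix(string(data), "\n"), nil
}

func main() {
	// A temporary directory stands in for a mount such as /etc/secret-volume.
	dir, err := os.MkdirTemp("", "secret-volume")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "password")
	if err := os.WriteFile(path, []byte("example-value\n"), 0o400); err != nil {
		panic(err)
	}

	value, err := readFileSecret("file@" + path)
	fmt.Printf("value=%q err=%v\n", value, err)
}
```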
Example: Reading a Kubernetes Secret Across Namespaces
If you want the Agent to read a Secret from a different namespace, use the k8s_secret@ prefix. For example:
password: ENC[k8s_secret@database/database-secret/password]
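The three parts of a k8s_secret@ handle (namespace, Secret name, key) can be pulled apart as in the Go sketch below. The type k8sSecretRef and the function parseK8sHandle are hypothetical names used for illustration; the built-in script may validate handles differently.

```go
package main

import (
	"fmt"
	"strings"
)

// k8sSecretRef names one key of a Kubernetes Secret.
type k8sSecretRef struct {
	Namespace, Name, Key string
}

// parseK8sHandle splits "k8s_secret@namespace/secret-name/key" into its
// three parts and rejects handles that do not have exactly that shape.
func parseK8sHandle(handle string) (k8sSecretRef, error) {
	const prefix = "k8s_secret@"
	if !strings.HasPrefix(handle, prefix) {
		return k8sSecretRef{}, fmt.Errorf("handle %q does not start with %q", handle, prefix)
	}
	parts := strings.Split(strings.TrimPrefix(handle, prefix), "/")
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return k8sSecretRef{}, fmt.Errorf("handle %q is not namespace/secret-name/key", handle)
	}
	return k8sSecretRef{Namespace: parts[0], Name: parts[1], Key: parts[2]}, nil
}

func main() {
	ref, err := parseK8sHandle("k8s_secret@database/database-secret/password")
	fmt.Printf("%+v err=%v\n", ref, err)
}
```

The namespace and Secret name are the values that the RBAC Role for the Agent's Service Account has to cover.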
In this case, you must manually configure RBAC to allow the Agent’s Service Account to read the Secret:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: datadog-secret-reader
  namespace: database
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["database-secret"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: datadog-read-secrets
  namespace: database
subjects:
  - kind: ServiceAccount
    name: datadog-agent
    apiGroup: ""
    namespace: default
roleRef:
  kind: Role
  name: datadog-secret-reader
  apiGroup: ""
This Role gives access to the Secret: database-secret in the Namespace: database. The RoleBinding links this permission to the ServiceAccount: datadog-agent in the Namespace: default. You must add these resources to your cluster manually, adjusting the names and namespaces to match your deployment.
Option 2: Using a prebuilt executable
If you’re using a standard secrets provider such as AWS Secrets Manager or AWS SSM, you can use the prebuilt datadog-secret-backend executable.
Here’s an example showing how to set it up:
Create your secret in AWS Secrets Manager. The secret’s ARN in AWS is the secret handle. For example:
arn:aws:secretsmanager:us-east-2:111122223333:secret:AgentAPIKey
Grant your EC2 instance IAM permissions to read the secret:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue"
],
"Resource": [
"arn:aws:secretsmanager:us-east-2:111122223333:secret:AgentAPIKey"
]
}
]
}
Download the latest release of datadog-secret-backend to your EC2 instance and create its configuration file, datadog-secret-backend.yaml, next to the binary:
backends:
  staging-aws:
    backend_type: aws.secrets
This configures one backend of type aws.secrets under the name staging-aws.
Set the correct access rights for the binary as described in Agent security requirements:
$> chown dd-agent:dd-agent datadog-secret-backend
$> chmod 500 datadog-secret-backend
Configure the Agent to use the binary to resolve secrets, and use the AWS secret (here as the api_key):
api_key: ENC[staging-aws:arn:aws:secretsmanager:us-east-2:111122223333:secret:AgentAPIKey]
secret_backend_command: /path/to/datadog-secret-backend
The staging-aws: prefix matches the key defined in your backend configuration.
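The relationship between the prefix and the rest of the handle can be sketched in Go. This assumes the handle is split on its first colon, which the example suggests because the ARN itself contains colons; splitBackendHandle is a hypothetical helper, not a function of datadog-secret-backend.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBackendHandle separates "backend-name:secret-id" at the first colon.
// Splitting on the first colon is an assumption of this sketch; everything
// after it (including further colons) is treated as the secret ID.
func splitBackendHandle(handle string) (backend, secretID string, ok bool) {
	i := strings.Index(handle, ":")
	if i <= 0 || i == len(handle)-1 {
		return "", "", false
	}
	return handle[:i], handle[i+1:], true
}

func main() {
	handle := "staging-aws:arn:aws:secretsmanager:us-east-2:111122223333:secret:AgentAPIKey"
	backend, secretID, ok := splitBackendHandle(handle)
	fmt.Printf("backend=%q secretID=%q ok=%t\n", backend, secretID, ok)
}
```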
Restart the Agent.
You can see which secrets the Agent has resolved by running the datadog-agent secret command locally on your EC2 instance.
Option 3: Creating your own custom executable
You can also build your own secret retrieval executable using any language. The only requirement is that it follows the input/output format described previously.
Here is a Go example that returns dummy secrets:
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// secretsPayload mirrors the JSON document the Agent writes to STDIN.
type secretsPayload struct {
	Secrets []string `json:"secrets"`
	Version string   `json:"version"`
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintf(os.Stderr, "could not read from stdin: %s", err)
		os.Exit(1)
	}

	secrets := secretsPayload{}
	if err := json.Unmarshal(data, &secrets); err != nil {
		fmt.Fprintf(os.Stderr, "could not parse the payload: %s", err)
		os.Exit(1)
	}

	// Return a dummy value for every requested handle.
	res := map[string]map[string]string{}
	for _, handle := range secrets.Secrets {
		res[handle] = map[string]string{
			"value": "decrypted_" + handle,
		}
	}

	output, err := json.Marshal(res)
	if err != nil {
		fmt.Fprintf(os.Stderr, "could not serialize res: %s", err)
		os.Exit(1)
	}
	fmt.Print(string(output))
}
This transforms your configuration:
instances:
  - server: db_prod
    user: ENC[db_prod_user]
    password: ENC[db_prod_password]
Into the following in memory:
instances:
  - server: db_prod
    user: decrypted_db_prod_user
    password: decrypted_db_prod_password
Refreshing API/APP keys at runtime
Starting in Agent version v7.67, you can configure the Agent to refresh its API and APP keys at regular intervals without requiring a restart. This relies on the API key and APP key being pulled as secrets.
To enable this, set secret_refresh_interval (in seconds) in your datadog.yaml file:
api_key: ENC[<secret_handle>]
secret_backend_command: /path/to/your/executable
secret_refresh_interval: 3600 # refresh every hour
By default, the Agent randomly spreads its first refresh within the specified secret_refresh_interval window. This means that it resolves the API key at startup, refreshes it at a random point within the first interval, and then refreshes it every interval after that. This avoids having a fleet of Agents refresh their API/APP keys at the same time.
You can disable this behavior by setting:
secret_refresh_scatter: false
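The scatter behavior can be pictured with a short Go sketch. It is an illustration of the idea only, not the Agent's scheduling code: firstRefreshDelay is a hypothetical function, and the uniform distribution and the fall-back to a full interval when scatter is disabled are assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// firstRefreshDelay returns how long to wait before the first refresh.
// With scatter enabled the delay is drawn from [0, interval) using the
// supplied random source; otherwise the full interval is used.
// Illustration only; the distribution is an assumption of this sketch.
func firstRefreshDelay(interval time.Duration, scatter bool, randInt63n func(int64) int64) time.Duration {
	if !scatter || interval <= 0 {
		return interval
	}
	return time.Duration(randInt63n(int64(interval)))
}

func main() {
	interval := 3600 * time.Second // matches secret_refresh_interval: 3600
	fmt.Println("scattered first refresh in:", firstRefreshDelay(interval, true, rand.Int63n))
	fmt.Println("unscattered first refresh in:", firstRefreshDelay(interval, false, rand.Int63n))
}
```

Because each Agent draws its own delay, a fleet started at the same moment spreads its first refreshes across the interval instead of hitting the secret store at once.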
To refresh manually, use:
datadog-agent secret refresh
Troubleshooting
Listing detected secrets
The secret command in the Agent CLI shows any errors related to your setup, for example if the rights on the executable are incorrect. It also lists all handles found and where they are located.
On Linux, the command outputs file mode, owner and group for the executable. On Windows, ACL rights are listed.
Example on Linux:
$> datadog-agent secret
=== Checking executable rights ===
Executable path: /path/to/you/executable
Check Rights: OK, the executable has the correct rights
Rights Detail:
file mode: 100700
Owner username: dd-agent
Group name: dd-agent
=== Secrets stats ===
Number of secrets decrypted: 3
Secrets handle decrypted:
- api_key: from datadog.yaml
- db_prod_user: from postgres.yaml
- db_prod_password: from postgres.yaml
Example on Windows (from an Administrator PowerShell):
PS C:\> & "$env:ProgramFiles\Datadog\Datadog Agent\bin\agent.exe" secret
=== Checking executable rights ===
Executable path: C:\path\to\you\executable.exe
Check Rights: OK, the executable has the correct rights
Rights Detail:
Acl list:
stdout:
Path : Microsoft.PowerShell.Core\FileSystem::C:\path\to\you\executable.exe
Owner : BUILTIN\Administrators
Group : WIN-ITODMBAT8RG\None
Access : NT AUTHORITY\SYSTEM Allow FullControl
BUILTIN\Administrators Allow FullControl
WIN-ITODMBAT8RG\ddagentuser Allow ReadAndExecute, Synchronize
Audit :
Sddl : O:BAG:S-1-5-21-2685101404-2783901971-939297808-513D:PAI(A;;FA;;;SY)(A;;FA;;;BA)(A;;0x1200
a9;;;S-1-5-21-2685101404-2783901971-939297808-1001)
=== Secrets stats ===
Number of secrets decrypted: 3
Secrets handle decrypted:
- api_key: from datadog.yaml
- db_prod_user: from sqlserver.yaml
- db_prod_password: from sqlserver.yaml
Seeing configurations after secrets were injected
To quickly see how the check’s configurations are resolved, you can use the configcheck
command:
sudo -u dd-agent -- datadog-agent configcheck
=== a check ===
Source: File Configuration Provider
Instance 1:
host: <decrypted_host>
port: <decrypted_port>
password: <obfuscated_password>
~
===
=== another check ===
Source: File Configuration Provider
Instance 1:
host: <decrypted_host2>
port: <decrypted_port2>
password: <obfuscated_password2>
~
===
Note: The Agent needs to be restarted to pick up changes on configuration files.
Debugging your secret_backend_command
To test or debug outside of the Agent, you can mimic how the Agent runs it:
Linux
sudo -u dd-agent bash -c "echo '{\"version\": \"1.0\", \"secrets\": [\"secret1\", \"secret2\"]}' | /path/to/the/secret_backend_command"
The dd-agent user is created when you install the Datadog Agent.
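The same test can be scripted in Go with the standard os/exec package. This sketch only reproduces the stdin/stdout exchange; it does not switch users, so run it as dd-agent (for example through sudo -u dd-agent) to match the Agent's context. The function name runBackend is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// runBackend starts the executable at path, writes payload to its standard
// input, and returns what it wrote to standard output and standard error.
// The returned error is non-nil if the process could not be started or
// exited with a non-zero status code.
func runBackend(path string, payload []byte) (stdout, stderr string, err error) {
	var outBuf, errBuf bytes.Buffer
	cmd := exec.Command(path)
	cmd.Stdin = bytes.NewReader(payload)
	cmd.Stdout = &outBuf
	cmd.Stderr = &errBuf
	err = cmd.Run()
	return outBuf.String(), errBuf.String(), err
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: tester /path/to/the/secret_backend_command")
		return
	}
	payload := []byte(`{"version": "1.0", "secrets": ["secret1", "secret2"]}`)
	stdout, stderr, err := runBackend(os.Args[1], payload)
	fmt.Printf("stdout: %s\nstderr: %s\nerror: %v\n", stdout, stderr, err)
}
```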
Windows
If you encounter one of the following errors, then something is missing in your setup. See the Windows instructions.
If any group or user other than the required ones has rights on the executable, an error similar to the following is logged:
error while decrypting secrets in an instance: Invalid executable 'C:\decrypt.exe': other users/groups than LOCAL_SYSTEM, Administrators or ddagentuser have rights on it
If ddagentuser doesn’t have read and execute rights on the file, an error similar to the following is logged:
error while decrypting secrets in an instance: could not query ACLs for C:\decrypt.exe
Your executable needs to be a valid Win32 application. If not, the following error is logged:
error while running 'C:\decrypt.py': fork/exec C:\decrypt.py: %1 is not a valid Win32 application.
Datadog has a PowerShell script to help you set the correct permissions on your executable. Example of how to use it:
.\Set-SecretPermissions.ps1 -SecretBinaryPath C:\secrets\decrypt_secrets.exe
ddagentuser SID: S-1-5-21-3139760116-144564943-2741514060-1076
=== Checking executable permissions ===
Executable path: C:\secrets\decrypt_secrets.exe
Executable permissions: OK, the executable has the correct permissions
Permissions Detail:
stdout:
Path : Microsoft.PowerShell.Core\FileSystem::C:\secrets\decrypt_secrets.exe
Owner : BUILTIN\Administrators
Group : BUILTIN\Administrators
Access : NT AUTHORITY\SYSTEM Allow FullControl
BUILTIN\Administrators Allow FullControl
DESKTOP-V03BB2P\ddagentuser Allow ReadAndExecute, Synchronize
Audit :
Sddl : O:BAG:BAD:PAI(A;;FA;;;SY)(A;;FA;;;BA)(A;;0x1200a9;;;S-1-5-21-3139760116-144564943-2741514
060-1076)
stderr:
=== Secrets stats ===
Number of secrets resolved: 0
Secrets handle resolved:
Testing your executable
Your executable is executed by the Agent when fetching your secrets. The Datadog Agent runs as the ddagentuser user. This user has no specific rights, but it is part of the Performance Monitor Users group. The password for this user is randomly generated at install time and is never saved anywhere.
This means that your executable might work with your default user or development user, but not when it’s run by the Agent, since ddagentuser has more restricted rights.
To test your executable in the same conditions as the Agent, update the password of ddagentuser on your dev box. This way, you can authenticate as ddagentuser and run your executable in the same context the Agent would.
To do so, follow these steps:
- Remove ddagentuser from the Local Policies/User Rights Assignment/Deny Log on locally list in the Local Security Policy.
- Set a new password for ddagentuser (since the one generated at install time is never saved anywhere). In PowerShell, run:
$user = [ADSI]"WinNT://./ddagentuser";
$user.SetPassword("a_new_password")
- Update the password to be used by the DatadogAgent service in the Service Control Manager. In PowerShell, run:
sc.exe config DatadogAgent password= "a_new_password"
You can now log in as ddagentuser to test your executable. Datadog has a PowerShell script to help you test your executable as another user. It switches user contexts and mimics how the Agent runs your executable.
Example of how to use it:
.\secrets_tester.ps1 -user ddagentuser -password a_new_password -executable C:\path\to\your\executable.exe -payload '{"version": "1.0", "secrets": ["secret_ID_1", "secret_ID_2"]}'
Creating new Process with C:\path\to\your\executable.exe
Waiting a second for the process to be up and running
Writing the payload to Stdin
Waiting a second so the process can fetch the secrets
stdout:
{"secret_ID_1":{"value":"secret1"},"secret_ID_2":{"value":"secret2"}}
stderr: None
exit code:
0
Agent refusing to start
The first thing the Agent does on startup is load datadog.yaml and decrypt any secrets in it. This is done before setting up the logging. This means that on platforms like Windows, errors occurring when loading datadog.yaml aren’t written to the logs, but to stderr. This can occur when the executable given to the Agent for secrets returns an error.
If you have secrets in datadog.yaml and the Agent refuses to start:
- Try to start the Agent manually to be able to see stderr.
- Remove the secrets from datadog.yaml and test with secrets in a check configuration file first.
Testing Kubernetes Permissions
When reading Secrets directly from Kubernetes you can double check your permissions with the kubectl auth
command. The general form of this is:
kubectl auth can-i get secret/<SECRET_NAME> -n <SECRET_NAMESPACE> --as system:serviceaccount:<AGENT_NAMESPACE>:<AGENT_SERVICE_ACCOUNT>
Consider the previous Kubernetes Secrets example, where the Secret: database-secret exists in the Namespace: database, and the ServiceAccount: datadog-agent exists in the Namespace: default.
In this case, use the following command:
kubectl auth can-i get secret/database-secret -n database --as system:serviceaccount:default:datadog-agent
This command returns whether the permissions are valid for the Agent to view this Secret.