Note: Agent versions 6.0 and above only support Kubernetes versions higher than 1.7.6. For prior versions of Kubernetes, use Agent 5.x.
There are two installation processes available to gather metrics, traces and logs from your Kubernetes Clusters:
Installing the Agent directly on the host, rather than in a pod as part of a Deployment or a DaemonSet, does not improve observability of the lifecycle of your Kubernetes cluster.
It can, however, provide visibility into the startup and health of the Kubernetes ecosystem itself. It also lets you monitor applications that do not belong to the Kubernetes ecosystem.
To discover all data collected automatically from the Kubernetes integration, refer to the dedicated Kubernetes Integration Documentation.
This documentation is for Agent v6 only. If you are still using Agent v5, follow this installation process.
Take advantage of DaemonSets to automatically deploy the Datadog Agent on all your nodes (or on specific nodes by using nodeSelectors).
If DaemonSets are not an option for your Kubernetes cluster, install the Datadog Agent as a sidecar container on each Kubernetes node.
If your Kubernetes cluster has RBAC enabled, see the documentation on how to configure RBAC permissions with your Datadog-Kubernetes integration.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: datadog-agent
spec:
  template:
    metadata:
      labels:
        app: datadog-agent
      name: datadog-agent
    spec:
      serviceAccountName: datadog-agent
      containers:
      - image: datadog/agent:latest
        imagePullPolicy: Always
        name: datadog-agent
        ports:
          - containerPort: 8125
            name: dogstatsdport
            protocol: UDP
          - containerPort: 8126
            name: traceport
            protocol: TCP
        env:
          - name: DD_API_KEY
            value: <YOUR_API_KEY>
          - name: DD_COLLECT_KUBERNETES_EVENTS
            value: "true"
          - name: DD_LEADER_ELECTION
            value: "true"
          - name: KUBERNETES
            value: "yes"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "250m"
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
          - name: procdir
            mountPath: /host/proc
            readOnly: true
          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            readOnly: true
        livenessProbe:
          exec:
            command:
            - ./probe.sh
          initialDelaySeconds: 15
          periodSeconds: 5
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket
        - hostPath:
            path: /proc
          name: procdir
        - hostPath:
            path: /sys/fs/cgroup
          name: cgroups
kubectl create -f datadog-agent.yaml
Note: This manifest enables autodiscovery’s auto configuration feature. To learn how to configure autodiscovery, please refer to its documentation.
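To restrict the DaemonSet to specific nodes, as mentioned earlier, a nodeSelector can be added to the pod template of the manifest. The fragment below is a minimal sketch: the label key and value (monitoring: "enabled") are hypothetical and must match a label that actually exists on your nodes.

```yaml
spec:
  template:
    spec:
      # Hypothetical label: only nodes carrying monitoring=enabled
      # will run a datadog-agent pod.
      nodeSelector:
        monitoring: "enabled"
```

Nodes without the label are skipped by the DaemonSet, so the DESIRED count reported by kubectl get daemonset reflects only the matching nodes.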
To enable log collection with your DaemonSet:
Set the DD_LOGS_ENABLED variable to true in your env section:
(...)
  env:
    (...)
    - name: DD_LOGS_ENABLED
      value: "true"
(...)
Mount the pointerdir volume in volumeMounts:
(...)
  volumeMounts:
    (...)
    - name: pointerdir
      mountPath: /opt/datadog-agent/run
(...)
volumes:
  (...)
  - hostPath:
      path: /opt/datadog-agent/run
    name: pointerdir
(...)
Learn more about this in the Docker log collection documentation.
When using the Kubernetes integration and deploying Agents in a Kubernetes cluster, a set of rights is required for the Agent to integrate seamlessly.
You need to allow the Agent to perform a few actions:
Access to the ConfigMap named datadogtoken, to update and query the most up-to-date version token corresponding to the latest event stored in etcd.
Access to Events, to pull the events from the API server, format them, and submit them.
Access to the Endpoint. The Endpoint used by the Agent for the Leader election feature is named
Access to the componentstatuses resource, in order to submit service checks for the status of the Control Plane's components.
You can find the templates in manifests/rbac here. This will create the Service Account in the default namespace, a Cluster Role with the above rights and the Cluster Role Binding.
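As a rough illustration of what those templates create, the sketch below defines a Service Account, a Cluster Role, and a Cluster Role Binding. The verbs listed for each resource are assumptions inferred from the actions described above, the rbac.authorization.k8s.io/v1 API version is also an assumption (older clusters may need a beta version), and the Endpoint rule is deliberately not restricted to a resource name. The templates in manifests/rbac remain the reference.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: datadog-agent
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datadog-agent
rules:
# Pull events and read control plane component statuses (verbs assumed).
- apiGroups: [""]
  resources: ["events", "componentstatuses"]
  verbs: ["get", "list", "watch"]
# Read and update the version token stored in the datadogtoken ConfigMap.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["datadogtoken"]
  verbs: ["get", "update"]
# Leader election relies on an Endpoint (verbs assumed).
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: datadog-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog-agent
subjects:
- kind: ServiceAccount
  name: datadog-agent
  namespace: default
```

The Service Account name matches the serviceAccountName used in the DaemonSet manifest, which is what ties these rights to the Agent pods.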
You can use ConfigMaps to configure or enable integrations. To do so, create a ConfigMap containing the configuration of the integration(s), then reference this ConfigMap among the volumes of your Agent's manifest.
For example, the following case customizes the name, url, and tags fields of the http check. To enable other integrations, specify the correct YAML file name and make sure it is properly formatted.
kind: ConfigMap
apiVersion: v1
metadata:
  name: dd-agent-config
  namespace: default
data:
  http-config: |-
    init_config:
    instances:
    - name: My service
      url: my.service:port/healthz
      tags:
        - service:critical
---
And in the manifest of your Agent (Daemonset/Deployment) add the following:
[...]
        volumeMounts:
        [...]
          - name: dd-agent-config
            mountPath: /conf.d
      volumes:
      [...]
        - name: dd-agent-config
          configMap:
            name: dd-agent-config
            items:
            - key: http-config
              path: http_check.yaml
[...]
To enable Log collection add the following lines in your
(...)
data:
  http-config: |-
  (...)
    logs:
      - type: docker
        service: docker
        source: kubernetes
Learn more about this in the Docker log collection documentation.
It is also possible to enable integrations via annotations in the manifest of your application. This is done with Autodiscovery. For more details, see the Autodiscovery section.
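As a sketch of the annotation approach, the pod below declares the same http check as the ConfigMap example. The annotation key format (ad.datadoghq.com/<container name>.check_names, .init_configs, .instances) and the %%host%% template variable are assumptions based on Autodiscovery conventions and should be checked against the Autodiscovery documentation. The pod name, image, and port are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-service
  annotations:
    # "my-service" in each key must match the container name below.
    ad.datadoghq.com/my-service.check_names: '["http_check"]'
    ad.datadoghq.com/my-service.init_configs: '[{}]'
    ad.datadoghq.com/my-service.instances: '[{"name": "My service", "url": "http://%%host%%:8080/healthz", "tags": ["service:critical"]}]'
spec:
  containers:
    - name: my-service
      # Hypothetical image and port, for illustration only.
      image: my-service:latest
      ports:
        - containerPort: 8080
```

With this approach the check configuration travels with the application manifest instead of the Agent's ConfigMap, so it follows the pod wherever it is scheduled.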
Install the latest version of the Datadog Agent from the Datadog Agent integration page.
Enable the kubelet check, and optionally the docker check if your Kubernetes cluster uses the Docker runtime:
mv /etc/datadog-agent/conf.d/kubelet.d/conf.yaml.example /etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Edit the datadog.yaml file to activate Autodiscovery features for the kubelet (discovery through annotations):
config_providers:
  - name: kubelet
    polling: true
listeners:
  - name: kubelet
You may now start/restart the Agent to enable the new configuration settings.
Optionally if you’re using the docker runtime on your cluster you might want to activate the docker check as well:
mv /etc/datadog-agent/conf.d/docker.d/conf.yaml.example /etc/datadog-agent/conf.d/docker.d/conf.yaml.default
For the docker check to run properly, you need to add the dd-agent user to the docker group using adduser dd-agent docker
To verify the Datadog Agent is running in your environment as a daemonset, execute:
kubectl get daemonset
If the Agent is deployed you will see output similar to the text below, where desired and current are equal to the number of nodes running in your cluster.
NAME            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
datadog-agent   2         2         2         2            2           <none>          16h
Run the Agent's status subcommand and look for kubelet under the Checks section:
Checks
======
    kubelet
    -----------
      - instance #0 [OK]
      - Collected 39 metrics, 0 events & 7 service checks
Like Agent 5, Agent 6 can collect events from the Kubernetes API server.
First, set the collect_kubernetes_events variable to true in datadog.yaml. This can be done via the environment variable DD_COLLECT_KUBERNETES_EVENTS, which is resolved at start time.
You will need to give the Agent some rights to activate this feature. See the RBAC section.
A ConfigMap can be used to store the
event.tokenKey and the
event.tokenTimestamp. It has to be deployed in the
default namespace and be named
One can simply run kubectl create configmap datadogtoken --from-literal="event.tokenKey"="0". You can also use the example in manifests/datadog_configmap.yaml.
When the ConfigMap is used, if the Agent in charge of collecting the events (via the Leader election) dies, the next elected leader uses the ConfigMap to identify the last events pulled. This avoids duplicating collected events and puts less stress on the API server.
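For reference, a declarative manifest roughly equivalent to the kubectl create configmap command above could look like the following sketch. It only sets event.tokenKey, as that command does. The actual content of manifests/datadog_configmap.yaml may differ.

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  # Name and namespace as used by the kubectl command above.
  name: datadogtoken
  namespace: default
data:
  # Initial token value; the leader Agent updates it as events are pulled.
  event.tokenKey: "0"
```

Keeping this manifest alongside the Agent's other manifests makes it possible to create the ConfigMap with the same kubectl create -f workflow used for the DaemonSet.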
Datadog Agent 6 supports a built-in leader election option for the Kubernetes event collector and the Kubernetes cluster-related checks (i.e. the Control Plane service check).
This feature relies on Endpoints. You can enable it by setting the DD_LEADER_ELECTION environment variable to true. The Datadog Agents nevertheless need a set of actions to be allowed before they are deployed.
See the RBAC section for more details and keep in mind that these RBAC entities will need to be created before the option is set.
Agents coordinate by performing a leader election among members of the Datadog DaemonSet through Kubernetes to ensure only one leader Agent instance is gathering events at a given time.
This functionality is disabled by default. Enabling event collection activates it, which avoids collecting duplicate events and reduces stress on the API server.
The leaderLeaseDuration is the duration for which a leader stays elected. It should be > 30 seconds and is 60 seconds by default. The longer it is, the less frequently your Agents hit the apiserver with requests, but it also means that if the leader dies (and under certain conditions), events can be missed until the lease expires and a new leader takes over.
It can be configured with the environment variable