
Kubernetes

Note: Agent versions 6.0 and above only support Kubernetes versions higher than 1.7.6. For prior versions of Kubernetes, use Agent 5.x.

There are two installation processes available to gather metrics, traces, and logs from your Kubernetes clusters: running the Agent in a container (see Container installation below), or installing it directly on the host (see Host installation below).

Installing the Agent on the host, as opposed to in a pod as part of a Deployment or a DaemonSet, means the Agent's own lifecycle is not tied to that of your Kubernetes cluster. This can give visibility into the Kubernetes ecosystem from its startup and into its overall health. Similarly, you are not restricted to monitoring applications that belong to the Kubernetes ecosystem.

To discover all data collected automatically from the Kubernetes integration, refer to the dedicated Kubernetes Integration Documentation.

This documentation is for Agent v6 only. If you are still using Agent v5, follow this installation process.

Container installation

Setup

Take advantage of DaemonSets to automatically deploy the Datadog Agent on all your nodes (or on specific nodes by using nodeSelectors).

If DaemonSets are not an option for your Kubernetes cluster, install the Datadog Agent as a sidecar container on each Kubernetes node.

If your Kubernetes cluster has RBAC enabled, see the documentation on how to configure RBAC permissions with your Datadog-Kubernetes integration.

  • Create the following datadog-agent.yaml manifest:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: datadog-agent
spec:
  template:
    metadata:
      labels:
        app: datadog-agent
      name: datadog-agent
    spec:
      serviceAccountName: datadog-agent
      containers:
      - image: datadog/agent:latest
        imagePullPolicy: Always
        name: datadog-agent
        ports:
          - containerPort: 8125
            name: dogstatsdport
            protocol: UDP
          - containerPort: 8126
            name: traceport
            protocol: TCP
        env:
          - name: DD_API_KEY
            value: <YOUR_API_KEY>
          - name: DD_COLLECT_KUBERNETES_EVENTS
            value: "true"
          - name: DD_LEADER_ELECTION
            value: "true"
          - name: KUBERNETES
            value: "yes"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "250m"
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
          - name: procdir
            mountPath: /host/proc
            readOnly: true
          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            readOnly: true
        livenessProbe:
          exec:
            command:
            - ./probe.sh
          initialDelaySeconds: 15
          periodSeconds: 5
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket
        - hostPath:
            path: /proc
          name: procdir
        - hostPath:
            path: /sys/fs/cgroup
          name: cgroups

Replace <YOUR_API_KEY> with your API key, or use Kubernetes secrets to set your API key as an environment variable, as shown below.
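For instance, a minimal sketch using a Kubernetes secret (the secret name datadog-secret and key api-key below are illustrative, not prescribed by the manifest above):

kubectl create secret generic datadog-secret --from-literal=api-key=<YOUR_API_KEY>

Then reference it from the env section of the manifest instead of a literal value:

        env:
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: datadog-secret
                key: api-key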

Consult our Docker integration to discover all configuration options.

  • Deploy the DaemonSet with the command: kubectl create -f datadog-agent.yaml

Note: This manifest enables Autodiscovery's auto-configuration feature. To learn how to configure Autodiscovery, refer to its documentation.

Log collection setup

To enable Log collection with your DaemonSet:

  1. Set the DD_LOGS_ENABLED variable to true in your env section:

    (...)
      env:
        (...)
        - name: DD_LOGS_ENABLED
          value: "true"
    (...)
    
  2. Mount the pointerdir volume in volumeMounts:

    (...)
      volumeMounts:
        (...)
        - name: pointerdir
          mountPath: /opt/datadog-agent/run
    (...)
    volumes:
      (...)
      - hostPath:
          path: /opt/datadog-agent/run
        name: pointerdir
    (...)

Learn more about this in the Docker log collection documentation.

RBAC

When using the Kubernetes integration and deploying Agents in a Kubernetes cluster, the Agent requires a set of rights to integrate seamlessly.

You need to allow the Agent to perform a few actions:

  • get and update of the ConfigMap named datadogtoken, to query and update the most up-to-date version token corresponding to the latest event stored in etcd.
  • list and watch of the Events to pull the events from the API Server, format and submit them.
  • get, update and create for the Endpoint. The Endpoint used by the Agent for the Leader election feature is named datadog-leader-election.
  • list the componentstatuses resource, in order to submit service checks for the Control Plane components' status.

You can find the templates in manifests/rbac here. They create the Service Account in the default namespace, a Cluster Role with the above rights, and the Cluster Role Binding. A sketch of these objects is shown below.
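As an illustration, a minimal sketch of these manifests could look like the following (object names and API versions are assumptions; prefer the official templates in manifests/rbac):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: datadog-agent
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: datadog-agent
rules:
- apiGroups: [""]
  resources: ["events", "componentstatuses"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["datadogtoken"]
  verbs: ["get", "update"]
- apiGroups: [""]
  resources: ["endpoints"]
  resourceNames: ["datadog-leader-election"]
  verbs: ["get", "update"]
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["create"]  # create cannot be restricted by resourceNames
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: datadog-agent
subjects:
- kind: ServiceAccount
  name: datadog-agent
  namespace: default
roleRef:
  kind: ClusterRole
  name: datadog-agent
  apiGroup: rbac.authorization.k8s.io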

Custom integrations

ConfigMap

It is possible to leverage ConfigMaps to configure or enable integrations. To do so, create a ConfigMap with the integration's configuration, then reference this ConfigMap among the volumes of your Agent's manifest.

For example, in the following case we customize the name, url, and tags fields of the HTTP check. To enable other integrations, specify the correct YAML file name and make sure it is properly formatted.

kind: ConfigMap
apiVersion: v1
metadata:
  name: dd-agent-config
  namespace: default
data:
  http-config: |-
    init_config:
    instances:
    - name: My service
      url: my.service:port/healthz
      tags:
        - service:critical
---

Then, in the manifest of your Agent (DaemonSet/Deployment), add the following:

[...]
        volumeMounts:
        [...]
          - name: dd-agent-config
            mountPath: /conf.d
      volumes:
      [...]
        - name: dd-agent-config
          configMap:
            name: dd-agent-config
            items:
            - key: http-config
              path: http_check.yaml
[...]
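
Create the ConfigMap before (re)deploying the Agent pods, for example (assuming the ConfigMap manifest above is saved as dd-agent-config.yaml; the file name is illustrative):

kubectl create -f dd-agent-config.yaml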

To enable Log collection, add the following lines to your http-config:

(...)
data:
  http-config: |-
  (...)
    logs:
      - type: docker
        service: docker
        source: kubernetes

Learn more about this in the Docker log collection documentation.

Annotations

It is also possible to enable integrations via annotations in the manifest of your application. This can be done with Autodiscovery; for more details, see the Autodiscovery section. An illustrative example is shown below.
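As a sketch, an HTTP check similar to the ConfigMap example above could be enabled through pod annotations as follows (the pod and container name my-service, the image, and the check parameters are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: my-service
  annotations:
    ad.datadoghq.com/my-service.check_names: '["http_check"]'
    ad.datadoghq.com/my-service.init_configs: '[{}]'
    ad.datadoghq.com/my-service.instances: '[{"name": "My service", "url": "http://%%host%%/healthz", "tags": ["service:critical"]}]'
spec:
  containers:
  - name: my-service
    image: my-service:latest

The annotation keys follow the ad.datadoghq.com/<container name>.* convention, and the three lists (check names, init configs, instances) must have the same length.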

Host installation

Installation

Install the latest version of the Datadog Agent from the Datadog Agent integration page

Configuration

Enable the kubelet check, and optionally the Docker check if your Kubernetes cluster uses the Docker runtime:

mv /etc/datadog-agent/conf.d/kubelet.d/conf.yaml.example /etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default

Edit the datadog.yaml file to activate the Autodiscovery features for the kubelet (discovery through annotations):

config_providers:
  - name: kubelet
    polling: true
listeners:
  - name: kubelet

Start or restart the Agent to apply the new configuration settings.
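For example, on a systemd-based Linux host (assuming the standard service name datadog-agent; the exact command may vary by platform):

sudo systemctl restart datadog-agent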

Docker check

Optionally, if you're using the Docker runtime on your cluster, you can activate the Docker check as well:

mv /etc/datadog-agent/conf.d/docker.d/conf.yaml.example /etc/datadog-agent/conf.d/docker.d/conf.yaml.default

For the Docker check to run properly, add the dd-agent user to the docker group with adduser dd-agent docker.

Validation

Container Running

To verify the Datadog Agent is running in your environment as a DaemonSet, execute:

kubectl get daemonset

If the Agent is deployed, you will see output similar to the text below, where DESIRED and CURRENT are equal to the number of nodes running in your cluster.

NAME            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
datadog-agent   2         2         2         2            2           <none>          16h

Agent check running

Run the Agent’s status subcommand and look for kubelet under the Checks section:

Checks
======

    kubelet
    -----------
      - instance #0 [OK]
      - Collected 39 metrics, 0 events & 7 service checks
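
On a host installation, the status subcommand can typically be invoked as follows (assuming a standard Linux install; the exact invocation may vary by platform):

sudo datadog-agent status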

Event collection

Similarly to Agent 5, Agent 6 can collect events from the Kubernetes API server. First and foremost, set the collect_kubernetes_events variable to true in datadog.yaml; this can also be achieved via the DD_COLLECT_KUBERNETES_EVENTS environment variable, which is resolved at start time. You need to give the Agent some rights to activate this feature; see the RBAC section.

A ConfigMap can be used to store the event.tokenKey and the event.tokenTimestamp. It has to be deployed in the default namespace and be named datadogtoken. You can simply run kubectl create configmap datadogtoken --from-literal="event.tokenKey"="0". You can also use the example in manifests/datadog_configmap.yaml.
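The kubectl command above produces a ConfigMap equivalent to this manifest:

apiVersion: v1
kind: ConfigMap
metadata:
  name: datadogtoken
  namespace: default
data:
  event.tokenKey: "0"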

When the ConfigMap is used, if the Agent in charge of collecting the events (via the Leader election) dies, the next elected leader uses the ConfigMap to identify the last events pulled. This avoids collecting duplicate events and puts less stress on the API server.

Leader Election

The Datadog Agent 6 supports a built-in leader election option for the Kubernetes event collector and the Kubernetes cluster-related checks (e.g. the Control Plane service check).

This feature relies on Endpoints. You can enable it by setting the DD_LEADER_ELECTION environment variable to true. The Datadog Agents nevertheless need a set of actions to be allowed prior to deployment; see the RBAC section for more details, and keep in mind that these RBAC entities need to be created before the option is set.

Agents coordinate by performing a leader election among members of the Datadog DaemonSet through Kubernetes, to ensure only one leader Agent instance is gathering events at a given time.

This functionality is disabled by default. Enabling event collection activates it, to avoid collecting duplicate events and putting unnecessary stress on the API server.

The leaderLeaseDuration is the duration for which a leader stays elected. It should be greater than 30 seconds and is 60 seconds by default. The longer it is, the less frequently your Agents hit the API server with requests, but it also means that if the leader dies (and under certain conditions), events can be missed until the lease expires and a new leader takes over. It can be configured with the environment variable DD_LEADER_LEASE_DURATION.
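For example, both options can be set through the DaemonSet env section (the lease value below simply restates the 60-second default):

        env:
          - name: DD_LEADER_ELECTION
            value: "true"
          - name: DD_LEADER_LEASE_DURATION
            value: "60"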

Further Reading