Datadog Live Containers enables real-time visibility into all containers across your environment.
Taking inspiration from bedrock tools like htop, ctop, and kubectl, live containers give you complete coverage of your container infrastructure in a continuously updated table with resource metrics at two-second resolution, faceted search, and streaming container logs.
Coupled with integrations for Docker, Kubernetes, ECS, and other container technologies, plus built-in tagging of dynamic components, the live container view provides a detailed overview of your containers’ health, resource consumption, logs, and deployment in real time:
If you’re using Kubernetes, enable Kubernetes Resources for Live Containers to gain multi-dimensional visibility into all Kubernetes workloads across your clusters. Inspired by the
kubectl tool, this feature gives you complete coverage of your Kubernetes infrastructure in a continuously updated table with curated resource metrics, faceted search, per-workload detailed view, and visualized maps.
Follow the Docker or Kubernetes Agent installation instructions. Enable the Process Agent to populate your Live Containers view. Container metrics are available without additional configuration after installation.
Kubernetes Resources for Live Containers requires installation of the Process Agent and the Datadog Cluster Agent.

To enable Kubernetes Resources for Live Containers, follow the Helm instructions and add the following changes to your Helm values file:
```yaml
datadog:
  # ...
  processAgent:
    enabled: true
  # ...
  orchestratorExplorer:
    enabled: true
# ...
clusterAgent:
  enabled: true
  image:
    repository: datadog/cluster-agent
    tag: latest
    pullPolicy: Always
# ...
agents:
  image:
    repository: datadog/agent
    tag: latest
    pullPolicy: Always
```
In cases where the Agent is not able to automatically detect the Kubernetes cluster name, you must set it in the Helm values file:

```yaml
datadog:
  # ...
  clusterName: <PLACEHOLDER>
```
Note: The cluster name must be 40 characters or fewer.

On Google GKE, AWS EKS, and Azure AKS, this is unnecessary, unless the Agent and the Cluster Agent don't have access to the cloud metadata APIs, or the cluster name is longer than 40 characters.
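Alternatively, in a manifest-based install, the cluster name can be supplied to the Agent containers through the `DD_CLUSTER_NAME` environment variable. A sketch of the relevant container spec fragment (adjust to your own manifest):

```yaml
# Agent container spec fragment (hypothetical manifest)
env:
  - name: DD_CLUSTER_NAME
    value: my-cluster   # must be 40 characters or fewer
```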
It is possible to include and/or exclude containers from real-time collection:

- Exclude containers by passing the environment variable `DD_CONTAINER_EXCLUDE`, or by adding `container_exclude:` in your `datadog.yaml` main configuration file.
- Include containers by passing the environment variable `DD_CONTAINER_INCLUDE`, or by adding `container_include:` in your `datadog.yaml` main configuration file.
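On a containerized Agent, the same filters can be passed as environment variables. A sketch of the Agent container spec, using the filter values from the example in this section:

```yaml
# Agent container spec fragment: exclude Debian images, keep frontend* containers
env:
  - name: DD_CONTAINER_EXCLUDE
    value: "image:debian"
  - name: DD_CONTAINER_INCLUDE
    value: "name:frontend.*"
```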
Both arguments take an image name as value; regular expressions are also supported.
For example, to exclude all Debian images except containers whose name starts with `frontend`, add these two configuration lines in your `datadog.yaml` file:

```yaml
container_exclude: ["image:debian"]
container_include: ["name:frontend.*"]
```
Note: For Agent 5, instead of including the above in the `datadog.conf` main configuration file, explicitly add a `datadog.yaml` file to `/etc/datadog-agent/`, as the Process Agent requires all configuration options there. This configuration only excludes containers from real-time collection, not from Autodiscovery.
Navigate to the Containers page to open the Containers view.
Containers are, by their nature, extremely high cardinality objects. Datadog’s flexible string search matches substrings in the container name, ID, or image fields.
If you've enabled Kubernetes Resources, strings such as the service name, as well as Kubernetes labels, are searchable in a Kubernetes Resources view.
To combine multiple string searches into a complex query, you can use any of the following Boolean operators:
| Operator | Description | Example |
|----------|-------------|---------|
| `AND` | Intersection: both terms are in the selected events (if nothing is added, `AND` is taken by default). | `java AND elasticsearch` |
| `OR` | Union: either term is contained in the selected events. | `java OR python` |
| `NOT` / `!` | Exclusion: the following term is NOT in the event. You may use the word `NOT` or the `!` character. | `java NOT elasticsearch`; equivalent: `java !elasticsearch` |
Use parentheses to group operators together. For example, `(NOT (elasticsearch OR kafka) java) OR python`.
The screenshot below displays a system that has been filtered down to a Kubernetes cluster of nine nodes. RSS and CPU utilization on containers are reported relative to the provisioned limits on the containers, when they exist. Here, it is apparent that the containers in this cluster are over-provisioned. You could use tighter limits and bin packing to achieve better utilization of resources.
Container environments are dynamic and can be hard to follow. The following screenshot displays a view that has been pivoted by `host` and, to reduce system noise, filtered to `kube_namespace:default`. You can see which services are running where, and how saturated key metrics are:
You could pivot by `ecs_task_version` to understand changes to resource utilization between ECS task updates.
For Kubernetes resources, select Datadog tags such as `pod_phase` to filter by. You can also use the container facets on the left to filter a specific Kubernetes resource. Group pods by Datadog tags to get an aggregated view, which allows you to find information more quickly.
Containers are tagged with all existing host-level tags, as well as with metadata associated with individual containers.
All containers are tagged by `image_name`. Integrations with popular orchestrators, such as ECS and Kubernetes, provide further container-level tags. Additionally, each container is decorated with a Docker, ECS, or Kubernetes icon so you can tell which are being orchestrated at a glance.
ECS containers are tagged by:
Kubernetes containers are tagged by:
If you have configured Unified Service Tagging, `version` is also picked up automatically.
Having these tags available lets you tie together APM, logs, metrics, and live container data.
Use the scatter plot analytic to compare two metrics and better understand the performance of your containers.

To access the scatter plot analytic, click the Show Summary graph button on the Containers page and select the Scatter Plot tab:
By default, the graph groups by the `short_image` tag key. The size of each dot represents the number of containers in that group, and clicking on a dot displays the individual containers and hosts that contribute to the group.
The query at the top of the page controls the scatter plot analytic:
While you are actively working with the Containers page, metrics are collected at two-second resolution. This is important for highly volatile metrics such as CPU. In the background, metrics are collected at 10-second resolution for historical context.
If you have enabled Kubernetes Resources for Live Containers, toggle between the Pods, Deployments, ReplicaSets, and Services views in the View dropdown menu in the top left corner of the page. Each of these views includes a data table to help you better organize your data by field such as status, name, and Kubernetes labels, and a detailed Cluster Map to give you a bigger picture of your pods and Kubernetes clusters.
A Kubernetes Cluster Map displays all of your resources together on one screen with customized groups and filters, and lets you choose which metric fills the color of the pods.
Drill down into resources from Cluster Maps by clicking on any circle or group to populate a detailed panel.
Click on any row in the table or on any object in a Cluster Map to view information about a specific resource in a side panel. This panel is useful for troubleshooting and finding information about a selected container or resource, such as:
For example, filter network data by tags such as `ip_type`, or use the Group by filter in this view to group network data by tags.
Kubernetes Resources views have a few additional tabs:
For a detailed dashboard of a resource, click View Dashboard in the top-right corner of the panel.
View streaming logs for any container, as you would with `docker logs -f` or `kubectl logs -f`, in Datadog. Click any container in the table to inspect it. Click the Logs tab to see real-time data from Live Tail or indexed logs for any time in the past.
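Streaming container logs requires log collection to be enabled in the Agent. In a Kubernetes DaemonSet deployment, this is typically done with the following configuration (a sketch; the pointer-directory path may differ in your setup):

```yaml
# Agent DaemonSet fragment: enable log collection from all containers
env:
  - name: DD_LOGS_ENABLED
    value: "true"
  - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
    value: "true"
volumeMounts:
  - name: pointerdir
    mountPath: /opt/datadog-agent/run   # stores a pointer to the last log line collected
volumes:
  - hostPath:
      path: /opt/datadog-agent/run
    name: pointerdir
```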
With Live Tail, all container logs are streamed. Pausing the stream lets you read logs that are being written quickly; unpause to continue streaming.
Streaming logs can be searched with simple string matching. For more details about Live Tail, see the Live Tail documentation.
Note: Streaming logs are not persisted, and entering a new search or refreshing the page clears the stream.
You can see logs that you have chosen to index and persist by selecting a corresponding timeframe. Indexing allows you to filter your logs using tags and facets. For example, to search for logs with an `Error` status, type `status:error` into the search box. Autocompletion can help you locate the particular tag that you want. Key attributes about your logs are already stored in tags, which enables you to search, filter, and aggregate as needed.
Note: The `health` value is the container's readiness probe, not its liveness probe.
Additional helpful documentation, links, and articles: