---
title: Container Troubleshooting
description: Troubleshooting container-related issues
---

# Container Troubleshooting

This page provides troubleshooting information for container monitoring.

There are three methods of deploying the Agent:

1. As a [**container in a runtime**](https://docs.datadoghq.com/containers/docker.md?tab=standard)

1. In a **cloud environment**, such as [Amazon ECS](https://docs.datadoghq.com/containers/amazon_ecs.md?tab=awscli), [Fargate in an Amazon ECS environment](https://docs.datadoghq.com/integrations/ecs_fargate.md?tab=webui#), or [Amazon EKS](https://docs.datadoghq.com/integrations/eks_fargate.md)

1. In a [Kubernetes environment](https://docs.datadoghq.com/containers/kubernetes/installation.md?tab=operator)

These different methods present unique deployment challenges. Use this page as a starting point to resolve issues. If you continue to have trouble, reach out to [Datadog support](https://docs.datadoghq.com/help/) for further assistance.

For details on Agent release updates or changes, refer to Datadog's [release notes](https://app.datadoghq.com/release-notes).

## General issues{% #general-issues %}

### Environment variables are not being set, and tags are not injected{% #environment-variables-are-not-being-set-and-tags-are-not-injected %}

A useful way to inject [environment variables](https://docs.datadoghq.com/agent/guide/environment-variables.md#overview) or to configure a DogStatsD library is to implement the [Admission Controller](https://docs.datadoghq.com/containers/cluster_agent/admission_controller.md?tab=operator) feature on the Cluster Agent. **Note**: The Cluster Agent must be deployed and running *before* the application is deployed.

### Metrics are not appearing on the Datadog Web Platform{% #metrics-are-not-appearing-on-the-datadog-web-platform %}

Verify that the following are true:

- The metrics endpoint is exposed and is open for the Agent to reach.

- There are no proxies or firewalls that might impede the Agent from accessing the endpoint.

- The Agent has [Autodiscovery](https://docs.datadoghq.com/getting_started/containers/autodiscovery.md?tab=adannotationsv2agent736) enabled.

### Logs are not collected{% #logs-are-not-collected %}

There are two [environment variables](https://docs.datadoghq.com/agent/guide/environment-variables.md#overview) that can affect whether logs are collected and from which containers:

- Set `DD_LOGS_ENABLED` to `true` to collect logs.
- Additionally, set `DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL` to `true` to collect all logs from all containers.

To exclude logs (and other features) from collection, see the [Container Discovery Management guide](https://docs.datadoghq.com/agent/guide/autodiscovery-management.md?tab=containerizedagent).

### Cannot connect to the Kubelet{% #cannot-connect-to-the-kubelet %}

The most common error that prevents connection to the Kubelet API is a failed verification of the Kubelet TLS certificate.

TLS verification is enabled by default and may prevent the Agent from connecting to the Kubelet API over HTTPS. You can disable TLS verification by using dedicated parameters or by setting the `DD_KUBELET_TLS_VERIFY` variable for all containers in the Agent manifest:

- Set `DD_KUBELET_TLS_VERIFY` to `false`.
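
As a minimal sketch of the environment variable approach, the helper below uses `kubectl set env`, which by default updates every container in the pod template. It assumes the Agent runs as a DaemonSet; the function name, the DaemonSet name, and the namespace are hypothetical placeholders, not values from this page. The `KUBECTL` variable exists only so the command can be swapped out (for example, for `echo`) when previewing what would run.

```shell
# Assumption: the node Agent is deployed as a DaemonSet.
# KUBECTL can be overridden (for example, KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# disable_kubelet_tls_verify <DAEMONSET_NAME> <NAMESPACE>
# Hypothetical helper: sets DD_KUBELET_TLS_VERIFY=false on all containers
# of the given DaemonSet.
disable_kubelet_tls_verify() {
  daemonset="$1"
  namespace="$2"
  $KUBECTL set env "daemonset/${daemonset}" -n "${namespace}" DD_KUBELET_TLS_VERIFY=false
}

# Example call (names are placeholders):
#   disable_kubelet_tls_verify datadog-agent datadog
```

If the Agent is managed by Helm or the Operator, the dedicated parameters mentioned above are likely the better place for this setting, because a direct change to the DaemonSet may be overwritten on the next reconcile or upgrade.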

### HPA metrics are not appearing or are not aligning with the expected value{% #hpa-metrics-are-not-appearing-or-are-not-aligning-with-the-expected-value %}

First, ensure that the Cluster Agent is deployed and able to send data to the node Agent.

Then, review the query used to scale the external metrics in the Metrics Summary. Only valid queries autoscale. If there are multiple queries, **all** queries are ignored if **any** one of them is invalid.

When reaching out for further assistance for HPA metrics, provide the following to [Datadog support](https://docs.datadoghq.com/help/):

- A `describe` output of the HPA manifest:
  ```
  $ kubectl describe hpa > hpa.log
  ```
- A `describe` output of the DatadogMetric custom resource:
  ```
  $ kubectl describe DatadogMetric > DatadogMetric.log
  ```
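
The two commands above act on the current namespace. As a sketch, the hypothetical helper below gathers both outputs for an explicit namespace into one directory, which may be convenient when the HPA lives outside the default namespace. The function name and arguments are assumptions made for illustration.

```shell
# KUBECTL can be overridden (for example, KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# collect_hpa_debug <NAMESPACE> [OUTPUT_DIR]
# Hypothetical helper: writes hpa.log and DatadogMetric.log for one namespace.
collect_hpa_debug() {
  namespace="$1"
  outdir="${2:-.}"
  mkdir -p "$outdir"
  $KUBECTL describe hpa -n "$namespace" > "$outdir/hpa.log"
  $KUBECTL describe DatadogMetric -n "$namespace" > "$outdir/DatadogMetric.log"
}

# Example call (namespace and directory are placeholders):
#   collect_hpa_debug default ./hpa-debug
```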

## Runtime{% #runtime %}

For logs, make sure that the Agent deployment command sets both `DD_LOGS_ENABLED` and `DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL` to `true`.
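
One way to confirm this on a running Agent container is to inspect its environment. The sketch below defines a hypothetical `check_logs_env` function that reads `KEY=VALUE` lines on standard input and reports on the two variables. It only recognizes the literal value `true`, which is a simplification made for this example.

```shell
# check_logs_env: reads KEY=VALUE lines on stdin and reports whether both
# log-collection variables are set to the literal value "true".
# Returns 0 only if both are set.
check_logs_env() {
  env_lines=$(cat)
  status=0
  for var in DD_LOGS_ENABLED DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL; do
    if printf '%s\n' "$env_lines" | grep -qx "${var}=true"; then
      echo "ok: ${var}=true"
    else
      echo "missing or not true: ${var}"
      status=1
    fi
  done
  return "$status"
}

# Example against a running Agent container (the ID is a placeholder):
#   docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' <AGENT_CONTAINER_ID> | check_logs_env
```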

## Cloud{% #cloud %}

Ensure that your IAM policy is updated.

### Logs are not collected in Fargate{% #logs-are-not-collected-in-fargate %}

- [ECS](https://docs.datadoghq.com/integrations/ecs_fargate.md?tab=webui#log-collection): Ensure that the log router is attached to the container from which you would like to collect logs.

- [EKS](https://docs.datadoghq.com/integrations/eks_fargate.md#log-collection): There are two common ways for the Agent to collect logs in an EKS Fargate environment: log forwarding with CloudWatch Logs, and log forwarding through [Amazon Data Firehose](https://docs.datadoghq.com/logs/guide/aws-eks-fargate-logs-with-kinesis-data-firehose.md#overview). Collecting logs through Amazon Data Firehose requires a working Amazon Data Firehose delivery stream, as well as some command line tools.

## Kubernetes{% #kubernetes %}

### Container not deploying or collecting metrics{% #container-not-deploying-or-collecting-metrics %}

First, ensure your API key is valid.

Then, in your node Agent Pod, run the `agent status` command and review the results.
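
Both checks can be sketched as small shell helpers. The first calls the Datadog API key validation endpoint (`/api/v1/validate`), and the second runs `agent status` inside the node Agent Pod. The function names are hypothetical. `DD_SITE` is assumed to hold your Datadog site and defaults here to `datadoghq.com`. The `CURL` and `KUBECTL` variables exist only so the commands can be previewed with `echo`.

```shell
# CURL and KUBECTL can be overridden (for example, with echo) to preview commands.
CURL="${CURL:-curl}"
KUBECTL="${KUBECTL:-kubectl}"

# validate_api_key <API_KEY>
# Hypothetical helper: asks the Datadog API whether the key is valid.
# Assumption: DD_SITE holds your Datadog site (for example, datadoghq.eu).
validate_api_key() {
  api_key="$1"
  site="${DD_SITE:-datadoghq.com}"
  $CURL --silent --header "DD-API-KEY: ${api_key}" "https://api.${site}/api/v1/validate"
}

# agent_status <AGENT_POD_NAME> <NAMESPACE>
# Hypothetical helper: runs `agent status` inside the node Agent Pod.
agent_status() {
  pod="$1"
  namespace="$2"
  $KUBECTL exec "$pod" -n "$namespace" -- agent status
}

# Example calls (values are placeholders):
#   validate_api_key "$DD_API_KEY"
#   agent_status <AGENT_POD_NAME> datadog
```

If the Pod runs more than one container, you may also need to pass `-c <CONTAINER_NAME>` to `kubectl exec`.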

### Not getting `kube_apiserver`, `kube_controller_manager`, or `etcd` metrics{% #not-getting-kubeapi_server-kube_controller_manager-or-etcd-metrics %}

On managed services such as Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE), the user cannot access the control plane components. As a result, it is not possible to run the `kube_apiserver`, `kube_controller_manager`, `kube_scheduler`, or `etcd` checks in these environments.

## ECS Fargate{% #ecs-fargate %}

### Windows Agent times out while starting service{% #windows-agent-times-out-while-starting-service %}

```text
[ENTRYPOINT][ERROR] Could not start the service: The service did not respond to the start or control request in a timely fashion.
. Error: [1053 (0x41d)]
```

To avoid this error, make sure you've set a **CPU units** reservation of at least `512` for the Datadog Agent.

## Troubleshooting data requested by Datadog support{% #troubleshooting-data-requested-by-datadog-support %}

After you open a support ticket, you may be asked for the following types of information:

### Agent Flare{% #agent-flare %}

You can use the [`flare`](https://docs.datadoghq.com/agent/troubleshooting/send_a_flare.md) command to send troubleshooting information to Datadog support.

**Node Agent flare**

```
$ kubectl exec -it <AGENT_POD_NAME> -- agent flare <CASE_ID>
```

**Cluster Agent flare**

```
$ kubectl exec -it <CLUSTER_AGENT_POD_NAME> -- datadog-cluster-agent flare <CASE_ID>
```

### Describe Pod output{% #describe-pod-output %}

This gives the support team insight into how the node Agent or Cluster Agent was deployed, what the most recent events were for the pod, and whether certain properties (such as custom tags) were injected and applied to host metrics. The `> <FILENAME>.yaml` part of the command writes the output to a file that can be sent to Datadog support as an attachment:

```
$ kubectl describe pod <POD_NAME> > <FILENAME>.yaml
```

### Manifest/deployment{% #manifestdeployment %}

This is the file used to deploy the Agent in your environment. It informs Datadog of the tags configured, whether logs were enabled, and if certain containers are defined to be ignored.

If you deployed the Agent in a runtime environment, send Datadog support the command line used to deploy the Agent.

The three most common deployment methods are: Helm chart, DaemonSet, and Operator.
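
If the original file is not at hand, the deployed configuration can usually be exported from the cluster. The sketch below is one possible approach, not an official procedure: it uses `helm get values` for a Helm release, and `kubectl get ... -o yaml` for a DaemonSet or a `DatadogAgent` resource managed by the Operator. The function name and its arguments are hypothetical.

```shell
# KUBECTL and HELM can be overridden (for example, with echo) to preview commands.
KUBECTL="${KUBECTL:-kubectl}"
HELM="${HELM:-helm}"

# export_agent_config <helm|daemonset|operator> <NAME> <NAMESPACE>
# Hypothetical helper: prints the deployed Agent configuration to stdout.
export_agent_config() {
  method="$1"
  name="$2"
  namespace="$3"
  case "$method" in
    helm)      $HELM get values "$name" -n "$namespace" ;;
    daemonset) $KUBECTL get daemonset "$name" -n "$namespace" -o yaml ;;
    operator)  $KUBECTL get datadogagent "$name" -n "$namespace" -o yaml ;;
    *)
      echo "unknown method: $method (expected helm, daemonset, or operator)" >&2
      return 2
      ;;
  esac
}

# Example call (release name and namespace are placeholders):
#   export_agent_config helm datadog datadog > agent-values.yaml
```

Review the exported file for secrets such as API keys before attaching it to a ticket.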

### cURL output{% #curl-output %}

If you are experiencing missing or inaccurate metrics, Datadog support may ask for the cURL output of the node Agent trying to reach the metric endpoint. Run the command from inside the Agent container; the output shows support whether the Agent has access to the metrics. **Note**: This is not possible in Fargate or managed services:

```
$ kubectl exec -it <AGENT_POD_NAME> -- curl -k -v "<METRIC_ENDPOINT>"
```

```
$ docker exec -it <AGENT_CONTAINER_ID> curl -k -v "<METRIC_ENDPOINT>"
```

## Further Reading{% #further-reading %}

- [Duplicate hosts with Kubernetes on AWS (EC2 or EKS)](https://docs.datadoghq.com/containers/troubleshooting/duplicate_hosts.md)
