APM Troubleshooting


If you experience unexpected behavior with Datadog APM, this guide covers a few common issues you can investigate to resolve the problem quickly. If you need further assistance, reach out to Datadog support.

Confirm APM setup and Agent status

During startup, Datadog tracing libraries at or after the versions listed below emit logs containing a JSON object that reflects the applied configuration, along with any errors encountered. In languages where this is possible, the logs also report whether the Agent can be reached. If your tracer version includes these startup logs, start your troubleshooting there.

Java: 0.59 (once available)
Go: 1.26.0 (once available)
Python: 0.41 (once available)
Ruby: 0.38 (once available)
C++: 1.1.6 (once available)
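
As a rough illustration of working with these startup logs, the sketch below pulls the JSON configuration object out of a single log line so that individual settings can be compared against what the application was meant to use. The log prefix and the keys `agent_url` and `debug` in the sample are assumptions made up for the example; the exact line format and field names depend on the tracing library and version.

```python
import json
from typing import Optional


def extract_startup_config(log_line: str) -> Optional[dict]:
    """Return the JSON object embedded in a log line, or None if there is none."""
    start = log_line.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores any trailing text.
        obj, _ = json.JSONDecoder().raw_decode(log_line[start:])
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


# Hypothetical log line: the prefix and keys are illustrative only.
sample = 'TRACER CONFIGURATION - {"agent_url": "http://localhost:8126", "debug": false}'
config = extract_startup_config(sample)
if config is not None:
    for key, value in sorted(config.items()):
        print(f"{key} = {value}")
```

Printing the settings one per line makes it easier to spot a value that differs from the one set in the application or container.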

Tracer debug logs

To capture full details on the Datadog tracer, enable debug mode on your tracer by using the DD_TRACE_DEBUG environment variable. You might enable it for your own investigation or because Datadog support recommended it for triage purposes. However, do not leave debug mode enabled permanently, because of the logging overhead it introduces.

These logs can surface instrumentation errors or integration-specific errors. For details on enabling and capturing these debug logs, see the debug mode troubleshooting page.
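
Because debug mode should not stay on permanently, it can be useful to have a process report when DD_TRACE_DEBUG is set. The following is a minimal sketch, not part of any Datadog library: the helper name `tracer_debug_enabled` is hypothetical, and the set of values treated as "enabled" is an assumption, since each tracing library may parse the variable differently.

```python
import os
from typing import Mapping

# Assumption: these values mean "enabled"; check your tracer's documentation.
TRUTHY_VALUES = {"1", "true"}


def tracer_debug_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if DD_TRACE_DEBUG looks enabled in the given environment."""
    value = env.get("DD_TRACE_DEBUG", "")
    return value.strip().lower() in TRUTHY_VALUES


if tracer_debug_enabled():
    print("DD_TRACE_DEBUG is on; remember to disable it after the investigation.")
```

A check like this could run at application startup or in a deployment script to flag debug mode that was left enabled after an investigation.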

APM rate limits

If you see error messages about rate limits or max events per second in the Datadog Agent logs, you can change these limits by following these instructions. If you have questions, consult the Datadog support team before you change the limits.

Troubleshooting data requested by Datadog Support

When you open a support ticket, our support team may ask for some combination of the following types of information:

  1. How are you confirming the issue? For example, provide links to a trace or screenshots, and tell us what you expect to see.

    This allows us to confirm errors and attempt to reproduce your issues within our testing environments.

  2. Tracer Startup Logs

    Startup logs are a great way to spot misconfiguration of the tracer, or the inability of the tracer to communicate with the Datadog Agent. By comparing the configuration that the tracer sees to the one set within the application or container, we can identify settings that are not being applied properly.

  3. Tracer Debug Logs

    Tracer debug logs go one step deeper than startup logs. They help us identify whether integrations are instrumenting properly, which we often cannot check until traffic flows through the application. Debug logs can be extremely useful for viewing the contents of spans created by the tracer, and can surface an error if there is a connection issue when attempting to send spans to the Agent. Tracer debug logs are typically the most informative and reliable tool for confirming nuanced behavior of the tracer.

  4. An Agent flare (snapshot of logs and configs) that captures a representative log sample from a time period when traces are sent to your Agent while in debug mode

    Agent flares allow us to see what is happening within the Datadog Agent, for example whether traces are being rejected or are malformed within the Agent. A flare does not help if traces are not reaching the Agent, but it does help us identify the source of an issue or any metric discrepancies.

  5. A description of your environment

    Knowing how your application is deployed helps us identify likely causes of tracer-Agent communication problems or misconfigurations. For difficult issues, we may ask to see a Kubernetes manifest or an ECS task definition, for example.

  6. Any automatic or custom instrumentation, along with any configurations

    Custom instrumentation can be a very powerful tool, but also can have unintentional side effects on your trace visualizations within Datadog, so we ask about this to rule it out as a suspect. Additionally, asking for your automatic instrumentation and configuration allows us to confirm if this matches what we are seeing in both tracer startup and debug logs.

  7. Versions of languages, frameworks, the Datadog Agent, and Tracing Library being used

    Knowing what versions are being used allows us to ensure integrations are supported, to check for known issues, or to recommend a tracer or language version upgrade if it will address the problem.

  8. Confirm Agent configurations, including if APM is enabled

    While APM is enabled by default for Agent 6+, containerized environments require an additional configuration step for non-local traffic, and a missing step is often why traces are not being received.

    You can check this by running the command netstat -van | grep 8126 on your Agent host. If you don’t see an entry, check that your Agent is running and review the configuration in your datadog.yaml file. Instructions can be found on the Agent Configuration page.
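
Where netstat is not available, a similar check can be sketched in Python. Note that this is a different technique from the netstat command above: rather than listing sockets, it attempts a TCP connection to port 8126, so it can also be run from the application host or container to test reachability. The helper name `port_is_open` and the default host `127.0.0.1` are assumptions for the example.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 8126, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is accepting connections on port 8126.")
    else:
        print("Nothing reachable on port 8126; check the Agent and datadog.yaml.")
```

A successful connection only shows that some process accepts connections on that port; it does not confirm that the process is the Datadog Agent or that APM is enabled.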

Further Reading