This tutorial walks you through the steps for enabling tracing on a sample Java application installed in a cluster on AWS Elastic Kubernetes Service (EKS). In this scenario, the Datadog Agent is also installed in the cluster.
For other scenarios, including on a host, in a container, on other cloud infrastructure, and on applications written in other languages, see the other Enabling Tracing tutorials.
The repository contains a multi-service Java application pre-configured to run inside a Kubernetes cluster. The sample app is a basic notes app with a REST API to add and change data. The docker-compose YAML files to make the containers for the Kubernetes pods are located in the docker directory. This tutorial uses the service-docker-compose-k8s.yaml file, which builds containers for the application.
In each of the notes and calendar directories, there are two sets of Dockerfiles for building the applications, either with Maven or with Gradle. This tutorial uses the Maven build, but if you are more familiar with Gradle, you can use it instead with the corresponding changes to build commands.
Kubernetes configuration files for the notes app, the calendar app, and the Datadog Agent are in the kubernetes directory.
Getting the sample application running involves building the images from the docker folder, uploading them to a registry, and creating Kubernetes resources from the kubernetes folder.
If you don’t already have an EKS cluster that you want to re-use, create one by running the following command, replacing <CLUSTER_NAME> with the name you want to use:
eksctl create cluster --name <CLUSTER_NAME>
This creates an EKS cluster with a managed nodegroup where you can deploy pods. Read the eksctl documentation on creating clusters for more information on troubleshooting and configuration. If you’re using a cluster created another way (for example by the AWS web console), ensure that the cluster is connected to your local kubeconfig file as described in the eksctl documentation.
Creating the cluster can take 15 to 20 minutes. Continue with the other steps while you wait for it to finish.
To communicate with the sample applications, ensure that the cluster’s security rules are configured with ports 30080 and 30090 open.
Open the AWS Console and navigate to your deployed cluster within the EKS service.
On the cluster console, select the networking tab, and click your cluster security group.
In your security group settings, edit the inbound rules. Add a rule that allows custom TCP traffic on the port range 30060 to 30100 from the source 0.0.0.0/0. This range covers ports 30080 and 30090, which the sample applications use.
From the /kubernetes directory, run the following command to deploy the notes app:
kubectl create -f notes-app.yaml
To exercise the app, you need to find its external IP address to call its REST API. First, find the notes-app-deploy pod in the list output by the following command, and note its node:
kubectl get pods -o wide
Then find that node name in the output from the following command, and note the external IP value:
kubectl get nodes -o wide
In the examples shown, the notes-app is running on node ip-192-189-63-129.ec2.internal, which has an external IP of 34.230.7.210.
Open another terminal and send API requests to exercise the app. The notes application is a REST API that stores data in an in-memory H2 database running in the same container. Send it a few commands:
curl '<EXTERNAL_IP>:30080/notes'
[]
curl -X POST '<EXTERNAL_IP>:30080/notes?desc=hello'
{"id":1,"description":"hello"}
curl '<EXTERNAL_IP>:30080/notes?id=1'
{"id":1,"description":"hello"}
curl '<EXTERNAL_IP>:30080/notes'
[{"id":1,"description":"hello"}]
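If you prefer to exercise the API from Java rather than curl, the same four calls can be sketched with the JDK's built-in HTTP client (Java 11 or later). This is a minimal sketch, not part of the sample repository: the NotesClient class and its method names are hypothetical, and it assumes the app is reachable on node port 30080 as in the curl commands above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the sample repository.
class NotesClient {
    private final String notesUrl;

    NotesClient(String externalIp, int nodePort) {
        this.notesUrl = "http://" + externalIp + ":" + nodePort + "/notes";
    }

    // Same as: curl '<EXTERNAL_IP>:30080/notes'
    HttpRequest listNotes() {
        return HttpRequest.newBuilder(URI.create(notesUrl)).GET().build();
    }

    // Same as: curl -X POST '<EXTERNAL_IP>:30080/notes?desc=hello'
    HttpRequest addNote(String description) {
        String query = "?desc=" + URLEncoder.encode(description, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(notesUrl + query))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Same as: curl '<EXTERNAL_IP>:30080/notes?id=1'
    HttpRequest getNote(int id) {
        return HttpRequest.newBuilder(URI.create(notesUrl + "?id=" + id)).GET().build();
    }

    // Sends the request and returns the response body as text.
    static String send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java NotesClient.java <EXTERNAL_IP>");
            return;
        }
        NotesClient client = new NotesClient(args[0], 30080);
        System.out.println(send(client.listNotes()));
        System.out.println(send(client.addNote("hello")));
        System.out.println(send(client.getNote(1)));
        System.out.println(send(client.listNotes()));
    }
}
```

Run it with the external IP of the node as the only argument. The request-building methods are kept separate from send so that you can inspect the URLs without a running cluster.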
After you’ve seen the application running, stop it so that you can enable tracing on it:
kubectl delete -f notes-app.yaml
Now that you have a working Java application, configure it to enable tracing.
Add the Java tracing package to your project. Because the Agent runs in the EKS cluster, there is nothing to install on your machine. You only need to ensure that the Dockerfiles are configured properly. Open the notes/dockerfile.notes.maven file and uncomment the line that downloads dd-java-agent:
RUN curl -Lo dd-java-agent.jar 'https://dtdg.co/latest-java-tracer'
Within the same notes/dockerfile.notes.maven file, comment out the ENTRYPOINT line for running without tracing. Then uncomment the ENTRYPOINT line, which runs the application with tracing enabled:
This automatically instruments the application with Datadog services.
Note: The flags on these sample commands, particularly the sample rate, are not necessarily appropriate for environments outside this tutorial. For information about what to use in your real environment, read Tracing configuration.
Unified Service Tags identify traced services across different versions and deployment environments so that they can be correlated within Datadog, and so you can use them to search and filter. The three environment variables used for Unified Service Tagging are DD_SERVICE, DD_ENV, and DD_VERSION. For applications deployed with Kubernetes, these environment variables can be added within the deployment YAML file, specifically for the deployment object, pod spec, and pod container template.
For this tutorial, the kubernetes/notes-app.yaml file already has these environment variables defined for the notes application for the deployment object, the pod spec, and the pod container template, for example:
Next, deploy the Agent to EKS to collect the trace data from your instrumented application.
Open kubernetes/datadog-values.yaml to see the minimum required configuration for the Agent and APM on EKS. This configuration file is used by the command you run next.
From the /kubernetes directory, run the following command, inserting your API key and cluster name:
For more secure deployments that do not expose the API Key, read this guide on using secrets. Also, if you use a Datadog site other than us1, replace datadoghq.com with your site.
Using the same steps as before, deploy the notes app with kubectl create -f notes-app.yaml and find the external IP address for the node it runs on.
Run some curl commands to exercise the app:
curl '<EXTERNAL_IP>:30080/notes'
[]
curl -X POST '<EXTERNAL_IP>:30080/notes?desc=hello'
{"id":1,"description":"hello"}
curl '<EXTERNAL_IP>:30080/notes?id=1'
{"id":1,"description":"hello"}
curl '<EXTERNAL_IP>:30080/notes'
[{"id":1,"description":"hello"}]
Wait a few moments, and go to APM > Traces in Datadog, where you can see a list of traces corresponding to your API calls:
In the list, h2 is the embedded in-memory database for this tutorial, and notes is the Spring Boot application. The traces list shows all the spans, when they started, what resource was tracked with each span, and how long it took.
If you don’t see traces after several minutes, clear any filter in the Traces Search field (sometimes it filters on an environment variable such as ENV that you aren’t using).
On the Traces page, click on a POST /notes trace to see a flame graph that shows how long each span took and what other spans occurred before a span completed. The bar at the top of the graph is the span you selected on the previous screen (in this case, the initial entry point into the notes application).
The width of a bar indicates how long it took to complete. A bar at a lower depth represents a span that completes during the lifetime of a bar at a higher depth.
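To make the relationship between width, depth, and nesting concrete, here is a small text-only sketch of a flame graph. It is illustrative only: the FlameGraphSketch class, the span names other than POST /notes, and all timings are made up, and depth here counts rows from the top of the graph (0 is the entry-point span).

```java
import java.util.List;

// Illustrative model of a flame graph; names and timings are hypothetical.
class FlameGraphSketch {
    static final class Span {
        final String name;
        final long startMs;
        final long durationMs;
        final Span parent;

        Span(String name, long startMs, long durationMs, Span parent) {
            this.name = name;
            this.startMs = startMs;
            this.durationMs = durationMs;
            this.parent = parent;
        }

        long endMs() {
            return startMs + durationMs;
        }

        // Row in the graph: 0 for the entry point, one more for each level of nesting.
        int depth() {
            return parent == null ? 0 : parent.depth() + 1;
        }

        // A child bar starts and ends inside the lifetime of the bar above it.
        boolean completesWithinParent() {
            return parent == null
                    || (startMs >= parent.startMs && endMs() <= parent.endMs());
        }
    }

    // Draws one row per span: the left offset is the start time, the bar width is the duration.
    static String render(List<Span> spans, long msPerColumn) {
        StringBuilder out = new StringBuilder();
        for (Span span : spans) {
            out.append(" ".repeat((int) (span.startMs / msPerColumn)))
               .append("#".repeat((int) Math.max(1, span.durationMs / msPerColumn)))
               .append(" ").append(span.name)
               .append(" (depth ").append(span.depth()).append(")\n");
        }
        return out.toString();
    }

    static List<Span> sampleTrace() {
        Span request = new Span("POST /notes", 0, 40, null);
        Span handler = new Span("notes.add", 5, 30, request);
        Span query = new Span("h2.query", 20, 10, handler);
        return List.of(request, handler, query);
    }

    public static void main(String[] args) {
        System.out.print(render(sampleTrace(), 5));
    }
}
```

Each row is shorter than the one above it and sits entirely underneath it, which is the same shape you see in the Datadog flame graph.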
The flame graph for a POST trace looks something like this:
The Java tracing library uses Java’s built-in agent and monitoring support. The flag -javaagent:../dd-java-agent.jar in the Dockerfile tells the JVM where to find the Java tracing library so it can run as a Java Agent. Learn more about Java Agents at https://www.baeldung.com/java-instrumentation.
The dd.trace.sample.rate flag sets the sample rate for this application. The ENTRYPOINT command in the Dockerfile sets its value to 1, which means that 100% of all requests to the notes service are sent to the Datadog backend for analysis and display. For a low-volume test application, this is fine. Do not do this in production or in any high-volume environment, because this results in a very large volume of data. Instead, sample some of your requests. Pick a value between 0 and 1. For example, -Ddd.trace.sample.rate=0.1 sends traces for 10% of your requests to Datadog. Read more about tracing configuration settings and sampling mechanisms.
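As a rough illustration of what a sample rate means in terms of volume, the following sketch simulates 10,000 requests and counts how many would be kept at a given rate. It is a simplified model using a seeded random number generator, not the tracing library's actual sampling mechanism, and the SampleRateSketch class is hypothetical.

```java
import java.util.Random;

// Simplified model of rate-based sampling; not the tracer's real algorithm.
class SampleRateSketch {
    // Counts how many of the simulated requests would be kept at the given rate.
    static int keptTraces(int requests, double sampleRate, long seed) {
        Random random = new Random(seed);
        int kept = 0;
        for (int i = 0; i < requests; i++) {
            // nextDouble() is uniform in [0, 1), so roughly sampleRate of the draws pass.
            if (random.nextDouble() < sampleRate) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int requests = 10_000;
        for (double rate : new double[] {1.0, 0.1}) {
            System.out.printf("rate=%.1f keeps %d of %d traces%n",
                    rate, keptTraces(requests, rate, 42L), requests);
        }
    }
}
```

At a rate of 1 every request is kept, while at 0.1 roughly a tenth of them are, which is why lower rates are the usual choice for high-volume services.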
Notice that the sampling rate flag in the command appears before the -jar flag. That’s because this is a parameter for the Java Virtual Machine, not your application. Make sure that when you add the Java Agent to your application, you specify the flag in the right location.
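The java launcher treats everything before -jar as options for the JVM and everything after the JAR file as arguments for the application's main method. The sketch below models that split so you can see where a misplaced flag ends up. The JavaCommandLine class and the app.jar file name are hypothetical, and the parsing is deliberately simplified.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of: java [options] -jar <jarfile> [args...]
class JavaCommandLine {
    final List<String> jvmOptions = new ArrayList<>();
    final List<String> appArgs = new ArrayList<>();
    String jar;

    // tokens[0] is expected to be the "java" executable itself.
    static JavaCommandLine parse(String... tokens) {
        JavaCommandLine commandLine = new JavaCommandLine();
        int i = 1;
        // Everything up to -jar is read by the JVM (system properties, agents, and so on).
        while (i < tokens.length && !tokens[i].equals("-jar")) {
            commandLine.jvmOptions.add(tokens[i++]);
        }
        if (i < tokens.length) {
            i++; // skip "-jar"
            if (i < tokens.length) {
                commandLine.jar = tokens[i++];
            }
        }
        // Everything after the JAR file is handed to the application's main(String[] args).
        while (i < tokens.length) {
            commandLine.appArgs.add(tokens[i++]);
        }
        return commandLine;
    }

    public static void main(String[] args) {
        JavaCommandLine correct = parse("java", "-javaagent:../dd-java-agent.jar",
                "-Ddd.trace.sample.rate=1", "-jar", "app.jar");
        JavaCommandLine misplaced = parse("java", "-javaagent:../dd-java-agent.jar",
                "-jar", "app.jar", "-Ddd.trace.sample.rate=1");
        System.out.println("correct:   JVM options = " + correct.jvmOptions);
        System.out.println("misplaced: app args    = " + misplaced.appArgs);
    }
}
```

In the misplaced case the sample rate flag becomes an ordinary application argument, so the JVM never sets it as a system property and the tracer does not see it.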
Automatic instrumentation is convenient, but sometimes you want more fine-grained spans. Datadog’s Java DD Trace API allows you to specify spans within your code using annotations or code.
The following steps walk you through modifying the build scripts to download the Java tracing library and adding some annotations to the code to trace into some sample methods.
Delete the current application deployments:
kubectl delete -f notes-app.yaml
Open /notes/src/main/java/com/datadog/example/notes/NotesHelper.java. This example already contains commented-out code that demonstrates the different ways to set up custom tracing on the code.
Uncomment the lines that import libraries to support manual tracing:
Uncomment the lines that manually trace the two public processes. These demonstrate the use of @Trace annotations to specify aspects such as operationName and resourceName in a trace:
You can also create a separate span for a specific code block in the application. Within the span, add service and resource name tags and error handling tags. These tags result in a flame graph showing the span and metrics in Datadog visualizations. Uncomment the lines that manually trace the private method:
Tracer tracer = GlobalTracer.get();
// Tags can be set when creating the span
Span span = tracer.buildSpan("manualSpan1")
    .withTag(DDTags.SERVICE_NAME, "NotesHelper")
    .withTag(DDTags.RESOURCE_NAME, "privateMethod1")
    .start();
try (Scope scope = tracer.activateSpan(span)) {
    // Tags can also be set after creation
    span.setTag("postCreationTag", 1);
    Thread.sleep(30);
    Log.info("Hello from the custom privateMethod1");
Also uncomment the lines that set tags on errors:
} catch (Exception e) {
    // Set error on span
    span.setTag(Tags.ERROR, true);
    span.setTag(DDTags.ERROR_MSG, e.getMessage());
    span.setTag(DDTags.ERROR_TYPE, e.getClass().getName());
    final StringWriter errorString = new StringWriter();
    e.printStackTrace(new PrintWriter(errorString));
    span.setTag(DDTags.ERROR_STACK, errorString.toString());
    Log.info(errorString.toString());
} finally {
    span.finish();
}
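The shape of this code matters: error tags are set in the catch block, and the span is finished in the finally block so that it is reported whether or not the traced code throws. The following dependency-free sketch shows that control flow. FakeSpan is a hypothetical stand-in that only records tags, not the OpenTracing Span interface, and the tag keys are illustrative strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Demonstrates the try/catch/finally shape of manual tracing with a stand-in span.
class ManualSpanPattern {
    // Hypothetical stand-in: records tags and whether finish() was called.
    static final class FakeSpan {
        final Map<String, Object> tags = new LinkedHashMap<>();
        boolean finished;

        void setTag(String key, Object value) {
            tags.put(key, value);
        }

        void finish() {
            finished = true;
        }
    }

    interface Work {
        void run() throws Exception;
    }

    static FakeSpan traced(String resourceName, Work work) {
        FakeSpan span = new FakeSpan();
        span.setTag("resource.name", resourceName); // illustrative key
        try {
            work.run();
        } catch (Exception e) {
            // Mark the span as failed and keep details for later inspection.
            span.setTag("error", true);
            span.setTag("error.type", e.getClass().getName());
        } finally {
            // Runs on success and on failure, so the span is always closed.
            span.finish();
        }
        return span;
    }

    public static void main(String[] args) {
        FakeSpan ok = traced("privateMethod1", () -> Thread.sleep(30));
        FakeSpan failed = traced("privateMethod1", () -> {
            throw new IllegalStateException("boom");
        });
        System.out.println("ok:     finished=" + ok.finished + " tags=" + ok.tags);
        System.out.println("failed: finished=" + failed.finished + " tags=" + failed.tags);
    }
}
```

Both spans end up finished, but only the failing one carries error tags, which is what lets Datadog flag the span as an error.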
Update your Maven build by opening notes/pom.xml and uncommenting the lines configuring dependencies for manual tracing. The dd-trace-api library is used for the @Trace annotations, and opentracing-util and opentracing-api are used for manual span creation.
Rebuild the application and upload it to ECR following the same steps as before, running these commands:
Using the same steps as before, deploy the notes app with kubectl create -f notes-app.yaml and find the external IP address for the node it runs on.
Send some more HTTP requests, in particular a few GET requests.
On the Trace Explorer, click on one of the new GET requests, and see a flame graph like this:
Note the higher level of detail in the flame graph now that the getAll function has custom tracing.
The privateMethod1 method around which you created a manual span now shows up as a separate block from the other calls and is highlighted in a different color. The other methods, where you used the @Trace annotation, appear under the same service and color as the GET request, which is the notes application. Custom instrumentation is valuable when there are key parts of the code that need to be highlighted and monitored.
Tracing a single application is a great start, but the real value in tracing is seeing how requests flow through your services. This is called distributed tracing.
The sample project includes a second application called calendar that returns a random date whenever it is invoked. The POST endpoint in the Notes application has a second query parameter named add_date. When it is set to y, Notes calls the calendar application to get a date to add to the note.
Configure the calendar app for tracing by adding dd-java-agent to the startup command in the Dockerfile, like you previously did for the notes app. Open calendar/dockerfile.calendar.maven and see that it is already downloading dd-java-agent:
RUN curl -Lo dd-java-agent.jar 'https://dtdg.co/latest-java-tracer'
Within the same calendar/dockerfile.calendar.maven file, comment out the ENTRYPOINT line for running without tracing. Then uncomment the ENTRYPOINT line, which runs the application with tracing enabled:
Note: Again, the flags, particularly the sample rate, are not necessarily appropriate for environments outside this tutorial. For information about what to use in your real environment, read Tracing configuration.
Build both applications and publish them to ECR. From the docker directory, run:
Using the method you used before, find the external IP of the notes app.
Send a POST request with the add_date parameter:
curl -X POST '<EXTERNAL_IP>:30080/notes?desc=hello_again&add_date=y'
{"id":1,"description":"hello_again with date 2022-11-06"}
In the Trace Explorer, click this latest trace to see a distributed trace between the two services:
Note that you didn’t change anything in the notes application. Datadog automatically instruments both the OkHttp library used to make the HTTP call from notes to calendar, and the Jetty library used to listen for HTTP requests in notes and calendar. This allows the trace information to be passed from one application to the other, capturing a distributed trace.
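Conceptually, passing trace information between services means the caller writes its trace and span identifiers into the outgoing HTTP headers, and the callee reads them back to attach its own span to the same trace. The sketch below shows that idea with the x-datadog-trace-id and x-datadog-parent-id headers. It is a simplified illustration, not the tracing library's code: the PropagationSketch class and the ID values are hypothetical, and the real tracer handles additional headers and formats.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration of trace context propagation over HTTP headers.
class PropagationSketch {
    static final String TRACE_ID_HEADER = "x-datadog-trace-id";
    static final String PARENT_ID_HEADER = "x-datadog-parent-id";

    static final class SpanContext {
        final long traceId;
        final long spanId;
        final long parentId;

        SpanContext(long traceId, long spanId, long parentId) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
        }
    }

    // Caller side (notes): copy the active span's identifiers into outgoing headers.
    static Map<String, String> inject(SpanContext active) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_ID_HEADER, Long.toUnsignedString(active.traceId));
        headers.put(PARENT_ID_HEADER, Long.toUnsignedString(active.spanId));
        return headers;
    }

    // Callee side (calendar): start a new span in the same trace, parented to the caller's span.
    static SpanContext extract(Map<String, String> headers, long newSpanId) {
        long traceId = Long.parseUnsignedLong(headers.get(TRACE_ID_HEADER));
        long parentId = Long.parseUnsignedLong(headers.get(PARENT_ID_HEADER));
        return new SpanContext(traceId, newSpanId, parentId);
    }

    public static void main(String[] args) {
        SpanContext notesSpan = new SpanContext(123L, 456L, 0L);
        Map<String, String> headers = inject(notesSpan);
        SpanContext calendarSpan = extract(headers, 789L);
        System.out.println("trace " + calendarSpan.traceId
                + ": calendar span " + calendarSpan.spanId
                + " is a child of notes span " + calendarSpan.parentId);
    }
}
```

Because both spans share one trace ID and the calendar span names the notes span as its parent, Datadog can stitch them into a single flame graph.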
When you’re done exploring, clean up all resources and delete the deployments: