---
title: Detecting Application Availability using Network Insights
description: Use CNM to detect application availability
---

# Detecting Application Availability using Network Insights

## Overview{% #overview %}

When applications rely on each other, poor connectivity or slow service calls can cause errors and latency at the application layer. Datadog's Cloud Network Monitoring (CNM) offers actionable insights for resolving application and network issues by capturing, analyzing, and correlating network metrics such as latency, packet loss, and throughput across various applications and services.

## Discovery of services and connectivity{% #discovery-of-services-and-connectivity %}

CNM is designed to track traffic between entities, determine which resources are communicating, and report their health status.

To examine a basic traffic flow between entities, take the following steps:

1. On the [Network Analytics page](https://app.datadoghq.com/network), set your **View clients as** and **View servers as** dropdown filters to group by `service` tags to examine a service-to-service flow. Here you can observe the basic traffic unit: a source IP communicating over a port to a destination IP on a port.

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_service_service.9932a86f79941d27ea209de7bfb432b0.png?auto=format"
      alt="CNM analytics page, grouping by service to service with Client and Server IP highlighted" /%}

Each row aggregates five minutes of connections. Depending on how well you know your network, you might recognize some IPs as specific hosts, but this becomes difficult in larger, more complex networks. A more useful level of aggregation correlates the host or container behind each IP with Datadog tags such as `service`, `availability zone`, and `pod`, as shown in the following example.

1. Narrow down your search results using filters. For example, to view the network traffic for all of your `orders-sqlserver*` pods by host and availability zone, use the `client_pod_name:orders-sqlserver*` filter:

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_host_az.c5abcbd332397b4a50edfcb490e126fb.png?auto=format"
      alt="CNM analytics page, grouping by host and availability zone for specific client pod name" /%}

This first step enables you to monitor your most complex networks and begin gaining insights into the connections between endpoints in your environment, such as VMs, containers, services, cloud regions, data centers, and more.
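To make the filter-and-group idea concrete, the following is a minimal sketch in plain Python, not a call into any Datadog API. The flow records, their field names (`client_pod_name`, `client_host`, `client_az`, `bytes_sent`), and the `volume_by_tags` helper are all hypothetical and exist only for illustration. The sketch applies a wildcard filter equivalent to `client_pod_name:orders-sqlserver*` and then sums traffic per host and availability zone.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical flow records; the field names are illustrative, not a CNM schema.
flows = [
    {"client_pod_name": "orders-sqlserver-0", "client_host": "host-a",
     "client_az": "us-east-1a", "bytes_sent": 1200},
    {"client_pod_name": "orders-sqlserver-1", "client_host": "host-b",
     "client_az": "us-east-1b", "bytes_sent": 800},
    {"client_pod_name": "orders-sqlserver-0", "client_host": "host-a",
     "client_az": "us-east-1a", "bytes_sent": 300},
    {"client_pod_name": "checkout-7f9", "client_host": "host-a",
     "client_az": "us-east-1a", "bytes_sent": 5000},
]


def volume_by_tags(records, pod_pattern, group_keys):
    """Sum bytes_sent per tag group for records whose pod name matches the wildcard."""
    totals = defaultdict(int)
    for record in records:
        if fnmatchcase(record["client_pod_name"], pod_pattern):
            group = tuple(record[key] for key in group_keys)
            totals[group] += record["bytes_sent"]
    return dict(totals)


# Keep only orders-sqlserver* pods, then group by host and availability zone.
result = volume_by_tags(flows, "orders-sqlserver*", ("client_host", "client_az"))
for group, total in sorted(result.items()):
    print(group, total)
```

Grouping by tags rather than raw IPs keeps the result readable even when pods are rescheduled and their addresses change.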

### Service-to-service dependency tracking{% #service-to-service-dependency-tracking %}

CNM tracks dependencies between services, which is essential for ensuring system performance. It helps verify important connections and highlights traffic volumes, ensuring all critical dependencies are operational.

For example, a possible cause of service latency could be too much traffic being directed to a destination endpoint, overwhelming its ability to handle incoming requests effectively.

To analyze the cause of service latency, take the following steps:

1. On the [Network Analytics](https://app.datadoghq.com/network) page, aggregate traffic by `service`, and filter for the cloud region where you may be noticing alerts or service latency. This view displays all service-to-service dependency paths within that region.

1. Sort the dependency table by retransmits or latency to identify the connections with the most significant performance degradation. For instance, an unusually high number of established TCP connections alongside spikes in retransmits and latency may indicate that the source is overwhelming the destination's infrastructure with requests.

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_service_region_retransmits.b60098c1e7ae87f0c78f6a6536e6f5e5.png?auto=format"
      alt="CNM analytics page, grouping by service and region for specific cloud region" /%}

1. Click one of the traffic paths on this page to open the side panel. The side panel provides more detailed telemetry to help you further debug your network dependencies.

1. In the side panel, check the **Flows** tab to determine whether the communication protocol is TCP or UDP, and review metrics such as RTT, jitter, and packets sent and received. If you're investigating a high retransmit count, this information can help you identify the cause.

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_sidepanel_flows.cd9ad773d9b1e0358d94e72352392821.png?auto=format"
      alt="Side panel of a traffic flow, highlighting the Flows tab" /%}
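The sorting and pattern-matching described in the steps above can be sketched offline as follows. This is an illustration under stated assumptions, not part of CNM: the `DependencyRow` type, the sample values, the baseline row, and the `factor=2.0` threshold are all hypothetical. The sketch ranks dependency paths by retransmits and flags a path only when established connections, retransmits, and RTT are all elevated together.

```python
from dataclasses import dataclass


@dataclass
class DependencyRow:
    """Hypothetical row of a service-to-service dependency table."""
    client_service: str
    server_service: str
    established_connections: int
    retransmits: int
    rtt_ms: float


def rank_by_retransmits(rows):
    """Return rows ordered from most to fewest retransmits."""
    return sorted(rows, key=lambda row: row.retransmits, reverse=True)


def possibly_overwhelmed(row, baseline, factor=2.0):
    """True when connections, retransmits, and RTT all exceed factor x baseline."""
    return (
        row.established_connections > factor * baseline.established_connections
        and row.retransmits > factor * baseline.retransmits
        and row.rtt_ms > factor * baseline.rtt_ms
    )


# Illustrative values only; the baseline is an assumed "normal" for this region.
rows = [
    DependencyRow("web", "orders", 120, 4, 12.0),
    DependencyRow("web", "payments", 900, 310, 85.0),
    DependencyRow("orders", "inventory", 200, 15, 20.0),
]
baseline = DependencyRow("baseline", "baseline", 150, 10, 15.0)

ranked = rank_by_retransmits(rows)
flagged = [
    (row.client_service, row.server_service)
    for row in ranked
    if possibly_overwhelmed(row, baseline)
]
print(flagged)
```

Requiring all three signals at once is a deliberately conservative choice for the sketch; a single elevated metric can have many unrelated causes.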

## Insight into network traffic{% #insight-into-network-traffic %}

Datadog CNM consolidates relevant distributed traces, logs, and infrastructure data into a single view, allowing you to identify and trace issues back to the originating request from an application.

In the example below, check the **Traces** tab under Network Analytics to view distributed traces of requests between source and destination endpoints, which can help you pinpoint where application-level errors occur.

To determine whether an issue originates in the application or the network, use the following steps:

1. Navigate to [**Infrastructure** > **Cloud Network** > **Analytics**](https://app.datadoghq.com/network).

1. In the **Summary** graphs, click a line of communication that has high volume and high RTT:

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_isolate_series.081ca19b1fb2d57826a6a47b94e71e48.png?auto=format"
      alt="CNM analytics page, clicking on a path with high RTT Time" /%}

1. Click **Isolate this series**. This opens a page that allows you to observe the network traffic only on this line of communication.

1. On this page, click one of the network communication paths, then click the **Flows** tab to observe the RTT:

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_sidepanel_rtt.a05f766a19600909069438138e7b1e93.png?auto=format"
      alt="CNM sidepanel, highlighting the RTT time column" /%}

On this page, CNM correlates the network round-trip time (RTT) metric with application request latency to help you determine whether the issue lies in the network or the application. In this example, the RTT is slightly elevated but has come down over time, which warrants further investigation.

1. On this same page, click the **Traces** tab and investigate the **Duration** column:

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_traces_duration.60b8b3b2b13d47b2dbdb1e0906d675f5.png?auto=format"
      alt="CNM sidepanel, highlighting the Traces tab and duration column" /%}

Observe that although network latency (RTT) is high, the application request latency (Duration) is normal, so in this case, the issue is likely network-related, and there's no need to investigate the app code.

Conversely, *if network latency is stable but application latency (Duration) is high*, the problem likely stems from the application. To find the root cause, click one of the service paths in the **Traces** tab to explore code-level traces. This opens the APM flame graph for that service:

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_apm_traces.39e098afad076ad5e61c7a63c1910e1b.png?auto=format"
      alt="APM flame graph screenshot after clicking on a service from the CNM sidepanel traces tab" /%}
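The triage reasoning above can be summarized as a small decision function. This is a minimal sketch, not a Datadog feature: the `classify_latency` name and the threshold parameters are hypothetical, and what counts as "high" depends on your own baselines. The guide does not cover the case where both values are high, so the sketch simply reports `"both"` to signal that each side needs investigation.

```python
def classify_latency(rtt_ms, duration_ms, rtt_threshold_ms, duration_threshold_ms):
    """Suggest where to look first, given network RTT and trace duration.

    The thresholds are assumptions supplied by the caller.
    """
    network_slow = rtt_ms > rtt_threshold_ms
    app_slow = duration_ms > duration_threshold_ms

    if network_slow and app_slow:
        return "both"         # not covered by the guide; investigate each side
    if network_slow:
        return "network"      # high RTT, normal request duration
    if app_slow:
        return "application"  # stable RTT, high request duration
    return "neither"


# High RTT with a normal duration points at the network.
print(classify_latency(rtt_ms=180, duration_ms=40,
                       rtt_threshold_ms=100, duration_threshold_ms=250))

# Stable RTT with a high duration points at the application.
print(classify_latency(rtt_ms=20, duration_ms=900,
                       rtt_threshold_ms=100, duration_threshold_ms=250))
```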

### Network Map{% #network-map %}

The [Network Map](https://app.datadoghq.com/network/map) in Datadog provides a visual representation of your network topology, helping identify partitions, dependencies, and bottlenecks. It consolidates network data into a directional map, making it easier to isolate problematic areas. Additionally, it visualizes network traffic between any tagged object in your environment, from `services` to `pods` to `cloud regions`.

For complex networks in large containerized environments, Datadog's Network Map simplifies your troubleshooting by using directional arrows, or edges, to visualize real-time traffic flows between containers, pods, and deployments, even as containers change. This allows you to spot inefficiencies and misconfigurations. For example, the map can reveal if Kubernetes pods within the same cluster are communicating through an ingress controller, rather than directly to each other, indicating a misconfiguration that can cause increased latency.

To identify if there might be a communication problem with your Kubernetes pods and their underlying services, perform the following steps:

1. On the [Network Map](https://app.datadoghq.com/network/map), set the **View** dropdown to `pod_name`, the **By** dropdown to "Client Availability Zone", and set the **Metric** dropdown to "Volume Sent" (this is the [metric](https://docs.datadoghq.com/network_monitoring/cloud_network_monitoring/network_map/#usage) you want your edges to represent):

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_network_map.912159b3e63163ef766056984d47b99d.png?auto=format"
      alt="CNM Network Map page showing a clustering example" /%}

1. Hover over a node to highlight its edges (directional arrows), which show the traffic flow between clusters and availability zones. In this example, there are edges between all of your pods. If no edges are present, it could indicate a misconfiguration.

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_network_map_node.f8ad498aea57752ad61af9db389d9549.png?auto=format"
      alt="CNM Network Map page showing a clustering example, highlighting a specific node" /%}

Edge thickness reflects the metric chosen from the dropdown. In this example, a thicker edge indicates a higher `volume sent`. Optionally, you can click the dotted edge to navigate back to the [Network Analytics](https://app.datadoghq.com/network) page and investigate the network connections further.

   {% image
      source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/cnm_network_map_thicker_edge.23c5cdf4340ae2bd0f045adff90cd1e2.png?auto=format"
      alt="CNM Network Map page showing a clustering example, highlighting a thicker edge" /%}
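As a rough mental model of the map, the sketch below treats traffic as a list of directed edges and derives two things discussed above: nodes with no edges at all, and an edge width that grows with the chosen metric. Everything here is hypothetical, including the `isolated_nodes` and `edge_widths` helpers, the pod names, and the width range. Linear scaling is an assumption made for the sketch; the guide only states that thickness is associated with the metric.

```python
def isolated_nodes(nodes, edges):
    """Return nodes that appear in no edge, as either client or server."""
    connected = set()
    for client, server, _volume in edges:
        connected.add(client)
        connected.add(server)
    return sorted(set(nodes) - connected)


def edge_widths(edges, min_width=1.0, max_width=8.0):
    """Scale each edge's width linearly with its volume (an assumed scheme)."""
    peak = max((volume for _, _, volume in edges), default=0)
    widths = {}
    for client, server, volume in edges:
        share = volume / peak if peak else 0.0
        widths[(client, server)] = min_width + share * (max_width - min_width)
    return widths


# Hypothetical pods and directed edges: (client, server, volume sent in bytes).
pods = ["pod-a", "pod-b", "pod-c", "pod-d"]
edges = [("pod-a", "pod-b", 100), ("pod-b", "pod-c", 50)]

print(isolated_nodes(pods, edges))  # pods without any edge may be worth checking
print(edge_widths(edges))
```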

### Service mesh{% #service-mesh %}

Service meshes like [Istio](https://istio.io/) help manage microservice communication, but add complexity to monitoring by introducing layers of abstraction. Datadog CNM simplifies this complexity by visualizing traffic flows across Istio-managed networks and providing full visibility into the Istio environment. Datadog monitors key metrics like bandwidth and request performance, logs control plane health, and traces application requests across the mesh.

Additionally, Datadog supports [Envoy](https://istio.io/latest/docs/ops/deployment/architecture/#envoy) monitoring, correlating Istio data with the Envoy proxy mesh. Since traffic is routed through Envoy sidecars, Datadog tags them as containers, allowing users to identify and diagnose latency issues between pods and determine if they're related to the service mesh.

{% image
   source="https://docs.dd-static.net/images/network_performance_monitoring/guide/detecting_network_insights/service_mesh_edit_2.fd4ef470355490083b9f129a7d17ab4e.png?auto=format"
   alt="CNM Network Map page showing a service mesh example" /%}

## Further reading{% #further-reading %}

- [Debug application issues with APM and Cloud Network Monitoring](https://www.datadoghq.com/blog/apm-cnm-application-debugging/)
- [Best practices for getting started with Datadog Cloud Network Monitoring](https://www.datadoghq.com/blog/cnm-best-practices/)
- [How to monitor containerized and service-meshed network communication with Datadog CNM](https://www.datadoghq.com/blog/monitor-containers-with-cnm/)
