Sending large volumes of metrics

DogStatsD works by sending metrics generated from your application to the Agent over a transport protocol. This protocol can be UDP (User Datagram Protocol) or UDS (Unix Domain Socket).

When DogStatsD is used to send a large volume of metrics to a single Agent, if proper measures are not taken, it is common to end up with the following symptoms:

  • High Agent CPU usage
  • Dropped datagrams / metrics
  • The DogStatsD client library (UDS) returning errors

Most of the time the symptoms can be alleviated by tweaking some configuration options described below.

General tips

Use Datadog official clients

We recommend that you use the latest version of the official DogStatsD clients provided by Datadog for every major programming language.

Enable buffering on your client

Some StatsD and DogStatsD clients, by default, send one metric per datagram. This adds considerable overhead on the client, the operating system, and the Agent. If your client supports buffering multiple metrics in one datagram, enabling this option brings noticeable improvements.
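To illustrate what buffering does, here is a minimal sketch of how a client might pack several metrics into one datagram. It assumes the DogStatsD wire format, in which metrics in the same datagram are separated by newlines. The function name `pack_datagrams` and the 1472-byte default are illustrative choices, not part of any official client.

```python
from typing import Iterable, List


def pack_datagrams(metrics: Iterable[str], max_size: int = 1472) -> List[bytes]:
    """Group metric lines into newline-separated payloads of at most max_size bytes."""
    datagrams: List[bytes] = []
    current = b""
    for metric in metrics:
        line = metric.encode("utf-8")
        if len(line) > max_size:
            raise ValueError(f"metric exceeds max_size: {metric!r}")
        # The +1 accounts for the newline that separates two metrics.
        if current and len(current) + 1 + len(line) > max_size:
            datagrams.append(current)  # payload is full, start a new one
            current = line
        elif current:
            current += b"\n" + line
        else:
            current = line
    if current:
        datagrams.append(current)
    return datagrams


if __name__ == "__main__":
    lines = [f"example_metric.gauge_{i}:{i}|g|#environment:dev" for i in range(100)]
    payloads = pack_datagrams(lines)
    print(f"{len(lines)} metrics packed into {len(payloads)} datagrams")
```

Sending one payload per `sendto` call instead of one per metric reduces the number of system calls on the client and the number of datagrams the Agent has to read.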

Here are a few examples for official DogStatsD supported clients:

By default, Datadog’s official Golang library DataDog/datadog-go uses buffering. The size of each packet and the number of messages use different default values for UDS and UDP. See DataDog/datadog-go for more information about the client configuration.

package main

import (
        "log"
        "github.com/DataDog/datadog-go/statsd"
)

func main() {
  // In this example, metrics are buffered by default with the correct default configuration for UDP.
  statsd, err := statsd.New("127.0.0.1:8125")
  if err != nil {
    log.Fatal(err)
  }

  statsd.Gauge("example_metric.gauge", 1, []string{"env:dev"}, 1)
}

The example below uses Datadog’s official Python library, datadogpy, to create a buffered DogStatsD client instance that sends up to 25 metrics in one packet when the block completes:

from datadog import DogStatsd

with DogStatsd(host="127.0.0.1", port=8125, max_buffer_size=25) as batch:
    batch.gauge('example_metric.gauge_1', 123, tags=["environment:dev"])
    batch.gauge('example_metric.gauge_2', 1001, tags=["environment:dev"])

The example below uses Datadog’s official Ruby library, dogstatsd-ruby, to create a buffered DogStatsD client instance that sends metrics in one packet when the block completes:

require 'datadog/statsd'

statsd = Datadog::Statsd.new('127.0.0.1', 8125)

statsd.batch do |s|
  s.increment('example_metric.increment', tags: ['environment:dev'])
  s.gauge('example_metric.gauge', 123, tags: ['environment:dev'])
end

The example below uses Datadog’s official Java library, java-dogstatsd-client, to create a buffered DogStatsD client instance with a maximum packet size of 1500 bytes. All metrics sent from this instance of the client are buffered and sent in packets of at most 1500 bytes:

import com.timgroup.statsd.NonBlockingStatsDClientBuilder;
import com.timgroup.statsd.StatsDClient;

public class DogStatsdClient {

    public static void main(String[] args) throws Exception {

        // Buffer metrics and cap each packet at 1500 bytes.
        StatsDClient statsd = new NonBlockingStatsDClientBuilder()
            .prefix("namespace")
            .hostname("127.0.0.1")
            .port(8125)
            .maxPacketSizeBytes(1500)
            .build();

        // Tags are passed as a String array.
        statsd.incrementCounter("example_metric.increment", new String[]{"environment:dev"});
        statsd.recordGaugeValue("example_metric.gauge", 100, new String[]{"environment:dev"});
    }
}

The example below uses Datadog’s official C# library, dogstatsd-csharp-client, to create a DogStatsD client with UDP as transport:

using StatsdClient;

public class DogStatsdClient
{
    public static void Main()
    {
        var dogstatsdConfig = new StatsdConfig
        {
            StatsdServerName = "127.0.0.1",
            StatsdPort = 8125,
        };

        using (var dogStatsdService = new DogStatsdService())
        {
            dogStatsdService.Configure(dogstatsdConfig);

            // Counter and Gauge are sent in the same datagram
            dogStatsdService.Counter("example_metric.count", 2, tags: new[] { "environment:dev" });
            dogStatsdService.Gauge("example_metric.gauge", 100, tags: new[] { "environment:dev" });
        }
    }
}

The example below uses Datadog’s official PHP library, php-datadogstatsd, to create a buffered DogStatsD client instance that sends several metrics in one packet:

<?php

require __DIR__ . '/vendor/autoload.php';

use DataDog\BatchedDogStatsd;

$client = new BatchedDogStatsd(
    array('host' => '127.0.0.1',
          'port' => 8125,
    )
);

// The sample rate is the second argument and the tags are the third.
$client->increment('example_metric.increment', 1.0, array('environment'=>'dev'));
$client->increment('example_metric.increment', 0.5, array('environment'=>'dev'));

Sample your metrics

You can reduce the traffic from your DogStatsD client to the Agent by setting a sample rate for your client. For example, a sample rate of 0.5 halves the number of UDP packets sent. This is a trade-off: you decrease traffic but lose some precision and granularity.

For more information and code examples, see DogStatsD “Sample Rate” Parameter Explained.
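As a rough sketch of what client-side sampling looks like, the function below emits a counter only a fraction of the time and annotates the datagram with `|@<rate>` so the receiver can scale the value back up. The function name `sampled_counter` and its parameters are hypothetical. Official clients implement this internally, so treat the code as an illustration of the idea rather than the clients' actual implementation.

```python
import random
from typing import Optional


def sampled_counter(name: str, value: int, rate: float,
                    rng: Optional[random.Random] = None) -> Optional[str]:
    """Return a DogStatsD counter line, or None when this sample is skipped."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    source = rng if rng is not None else random
    # Skip the metric with probability (1 - rate).
    if rate < 1.0 and source.random() >= rate:
        return None
    line = f"{name}:{value}|c"
    if rate < 1.0:
        line += f"|@{rate}"  # tells the receiver which rate was applied
    return line


if __name__ == "__main__":
    lines = [sampled_counter("example_metric.increment", 1, 0.5) for _ in range(10)]
    print([line for line in lines if line is not None])
```

With a rate of 0.5, roughly half of the calls produce no datagram at all, which is where the traffic reduction comes from.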

Use DogStatsD over UDS (Unix Domain Socket)

UDS is an inter-process communication protocol used to transport DogStatsD payloads. It has very little overhead when compared to UDP and lowers the general footprint of DogStatsD on your system.
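For illustration, the sketch below sends a single DogStatsD payload over a Unix datagram socket using only the Python standard library. The helper name `send_over_uds` is hypothetical, and the socket path shown in the usage section is an assumption: use the path your Agent is actually configured to listen on.

```python
import socket


def send_over_uds(socket_path: str, payload: bytes) -> int:
    """Send one datagram to a Unix datagram socket and return the bytes written."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, socket_path)


if __name__ == "__main__":
    # Assumed socket path; adjust it to match your Agent configuration.
    path = "/var/run/datadog/dsd.socket"
    try:
        send_over_uds(path, b"example_metric.gauge:1|g|#environment:dev")
    except OSError as err:
        # Unlike UDP, a missing or full socket is reported to the sender.
        print(f"could not send to {path}: {err}")
```

Because the sender receives an error when the socket is missing or its queue is full, UDS clients can detect drops that would go unnoticed over UDP.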

Operating System kernel buffers

Most operating systems add incoming UDP and UDS datagrams containing your metrics to a buffer with a maximum size. Once the maximum is reached, datagrams containing your metrics start getting dropped. You can adjust the values to give the Agent more time to process incoming metrics:

Over UDP (User Datagram Protocol)

Linux

On most Linux distributions, the maximum size of the kernel buffer is set to 212992 bytes (208 KiB) by default. This can be confirmed using the following command:

$ sysctl net.core.rmem_max
net.core.rmem_max = 212992

To set the maximum size of the DogStatsD socket buffer to 25 MiB, run:

sysctl -w net.core.rmem_max=26214400

Add the following configuration to /etc/sysctl.conf to make this change permanent:

net.core.rmem_max = 26214400

Then set the Agent dogstatsd_so_rcvbuf configuration option to the same number in datadog.yaml:

dogstatsd_so_rcvbuf: 26214400

See the Note on sysctl in Kubernetes section if you are deploying the Agent or DogStatsD in Kubernetes.
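To see why the kernel limit matters, the sketch below requests a receive buffer size on a UDP socket and reads back what the kernel actually granted. It uses only standard socket options, and the helper name `effective_rcvbuf` is hypothetical. On Linux, a request above net.core.rmem_max is silently capped, and the value reported back is typically double the granted size because the kernel reserves space for bookkeeping.

```python
import socket


def effective_rcvbuf(requested: int) -> int:
    """Request a UDP receive buffer size and return the size the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    wanted = 26214400  # 25 MiB, the value used in the text above
    granted = effective_rcvbuf(wanted)
    print(f"requested={wanted} reported_by_kernel={granted}")
```

If the reported value stays far below the requested one, the kernel limit has probably not been raised yet.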

Over UDS (Unix Domain Socket)

Linux

For UDS sockets, Linux internally buffers datagrams in a queue if the reader is slower than the writer. The size of this queue is the maximum number of datagrams that Linux buffers per socket. This value can be queried with the following command:

sysctl net.unix.max_dgram_qlen

If the value is less than 512, you can increase it to 512 or more using this command:

sysctl -w net.unix.max_dgram_qlen=512

Add the following configuration to /etc/sysctl.conf to make this change permanent:

net.unix.max_dgram_qlen = 512

In the same manner, net.core.wmem_max can be increased to 4 MiB to improve client write performance:

net.core.wmem_max = 4194304

Then set the Agent dogstatsd_so_rcvbuf configuration option to the same number in datadog.yaml:

dogstatsd_so_rcvbuf: 4194304
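As a convenience, the current values of these settings can also be read programmatically. The sketch below assumes a Linux host where sysctl values are exposed as files under /proc/sys, with dots in the name mapped to directory separators. The helper names `sysctl_path` and `read_sysctl_int` are hypothetical.

```python
import os


def sysctl_path(name: str, root: str = "/proc/sys") -> str:
    """Map a sysctl name such as net.unix.max_dgram_qlen to its file path."""
    return os.path.join(root, *name.split("."))


def read_sysctl_int(name: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl value from the file that backs it."""
    with open(sysctl_path(name, root)) as handle:
        return int(handle.read().split()[0])


if __name__ == "__main__":
    # Recommended minimums taken from the text above.
    recommended = {"net.unix.max_dgram_qlen": 512, "net.core.wmem_max": 4194304}
    for name, minimum in recommended.items():
        try:
            value = read_sysctl_int(name)
        except OSError as err:
            print(f"{name}: unavailable ({err})")
            continue
        status = "ok" if value >= minimum else f"below {minimum}"
        print(f"{name} = {value} ({status})")
```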

Note on sysctl in Kubernetes

If you deploy the Agent or DogStatsD in Kubernetes and want to configure the sysctls described above, you must set their values per container. Because the net.* sysctls are namespaced, you can set them per pod. See the official Kubernetes documentation on how to allow access to sysctls in containers and how to set their values.

Ensure proper packet sizes

Avoid extra CPU usage by sending packets with an adequate size to the DogStatsD server in the Datadog Agent. The latest versions of the official DogStatsD clients send packets with a size optimized for performance.

You can skip this section if you are using one of the latest Datadog DogStatsD clients.

If the packets sent are too small, the Datadog Agent packs several together to process them in batches later in the pipeline. The official DogStatsD clients are capable of grouping metrics to have the best ratio of the number of metrics per packet.

The Datadog Agent performs best when the DogStatsD clients send packets whose size matches dogstatsd_buffer_size. Packets must not be larger than the buffer size. Otherwise, the Agent cannot load them completely into the buffer and some of the metrics will be malformed. Use the corresponding configuration field in your DogStatsD clients.

Note for UDP: since UDP packets usually go through the Ethernet and IP layers, avoid IP packet fragmentation by limiting the packet size to a value lower than a single Ethernet frame on your network. Most IPv4 networks are configured with an MTU of 1500 bytes. In this situation, limit the size of sent packets to 1472 bytes (1500 minus 20 bytes for the IPv4 header and 8 bytes for the UDP header).

Note for UDS: for the best performance, UDS packets should have a size of 8192 bytes.
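The UDP limit can be derived from the MTU with simple arithmetic. The sketch below assumes a minimal 20-byte IPv4 header (no options), a fixed 40-byte IPv6 header, and an 8-byte UDP header. The function name `max_udp_payload` is illustrative.

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, without options
IPV6_HEADER_BYTES = 40  # fixed IPv6 header
UDP_HEADER_BYTES = 8


def max_udp_payload(mtu: int, ipv6: bool = False) -> int:
    """Return the largest UDP payload that fits in one unfragmented IP packet."""
    ip_header = IPV6_HEADER_BYTES if ipv6 else IPV4_HEADER_BYTES
    payload = mtu - ip_header - UDP_HEADER_BYTES
    if payload <= 0:
        raise ValueError("MTU is too small to carry a UDP datagram")
    return payload


if __name__ == "__main__":
    print(max_udp_payload(1500))             # IPv4 on a standard Ethernet MTU
    print(max_udp_payload(1500, ipv6=True))  # IPv6 leaves less room
```

Networks that add encapsulation, such as VPNs or overlay networks, often have a smaller effective MTU, so the limit may need to be lower than 1472 there.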

Limit the maximum memory usage of the Agent

The Agent tries to absorb bursts of metrics sent by the DogStatsD clients, but doing so requires memory. Even if this memory is used only briefly and is quickly released to the OS, the resulting spike can be an issue in containerized environments, where memory limits can cause pods or containers to be evicted.

Avoid sending metrics in bursts from your application. This prevents the Datadog Agent from reaching its maximum memory usage.

Another way to limit the maximum memory usage is to reduce the buffering. The main buffer of the DogStatsD server within the Agent is configurable with the dogstatsd_queue_size field (since Datadog Agent 6.1.0). Its default value of 1024 results in an approximate maximum memory usage of 768MB.

Note: reducing this buffer could increase the number of packet drops.

This example decreases the max memory usage of DogStatsD to approximately 384MB:

dogstatsd_queue_size: 512
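The two figures above (1024 for about 768MB, 512 for about 384MB) suggest that the maximum memory usage scales roughly linearly with the queue size. The sketch below extrapolates from those two data points only. It is an assumption for rough capacity planning, not a documented formula, and the function name is hypothetical.

```python
DEFAULT_QUEUE_SIZE = 1024
DEFAULT_MAX_MEMORY_MB = 768  # approximate figure quoted in the text


def estimated_max_memory_mb(queue_size: int) -> float:
    """Estimate DogStatsD maximum memory usage, assuming linear scaling."""
    if queue_size <= 0:
        raise ValueError("queue_size must be positive")
    return queue_size * DEFAULT_MAX_MEMORY_MB / DEFAULT_QUEUE_SIZE


if __name__ == "__main__":
    for size in (1024, 512, 256):
        print(f"dogstatsd_queue_size={size}: ~{estimated_max_memory_mb(size):.0f}MB")
```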

Client side telemetry

DogStatsD clients send telemetry metrics by default to the Agent. This allows you to better troubleshoot where bottlenecks exist. Each metric is tagged with the client language and the client version. These metrics are not counted as custom metrics.

Each client shares a set of common tags.

| Tag | Description | Example |
|-----|-------------|---------|
| client | The language of the client | client:py |
| client_version | The version of the client | client_version:1.2.3 |
| client_transport | The transport used by the client (udp or uds) | client_transport:uds |

Note: When using UDP, the client cannot detect network errors, so the corresponding metrics do not reflect dropped bytes or packets.

Starting with version 0.34.0 of the Python client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | The number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | The number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | The number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | The number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | The number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_sent | count | The number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | The number of datagrams dropped by the DogStatsD client. |

To disable telemetry, use the disable_telemetry method:

statsd.disable_telemetry()

See DataDog/datadogpy for more information about the client configuration.
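As an example of how these counters can help locate a bottleneck, the sketch below computes the share of datagrams dropped by the client from the packets_sent and packets_dropped values. The function name `drop_ratio` is hypothetical, and how you obtain the two counter values (for example, from a dashboard query) is left as an assumption.

```python
def drop_ratio(packets_sent: int, packets_dropped: int) -> float:
    """Return the fraction of datagrams that the client dropped."""
    if packets_sent < 0 or packets_dropped < 0:
        raise ValueError("counters must not be negative")
    total = packets_sent + packets_dropped
    if total == 0:
        return 0.0  # nothing was sent, so nothing was dropped
    return packets_dropped / total


if __name__ == "__main__":
    # Hypothetical counter values for one reporting interval.
    ratio = drop_ratio(packets_sent=9000, packets_dropped=1000)
    print(f"{ratio:.1%} of datagrams were dropped by the client")
```

A ratio that stays above zero over UDS suggests that the Agent is not reading fast enough or that the kernel queue is too small. Over UDP, drops on the network are not visible to the client, so a ratio of zero does not prove that every datagram arrived.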

Starting with version 4.6.0 of the Ruby client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | The number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | The number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | The number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | The number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | The number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_sent | count | The number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | The number of datagrams dropped by the DogStatsD client. |

To disable telemetry, set the disable_telemetry parameter to true:

Datadog::Statsd.new('localhost', 8125, disable_telemetry: true)

See DataDog/dogstatsd-ruby for more information about the client configuration.

Starting with version 3.4.0 of the Go client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | The number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | The number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | The number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | The number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | The number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.bytes_dropped_queue | count | The number of bytes dropped because the DogStatsD client queue was full. |
| datadog.dogstatsd.client.bytes_dropped_writer | count | The number of bytes dropped because of an error while writing to Datadog. |
| datadog.dogstatsd.client.packets_sent | count | The number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | The number of datagrams dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_dropped_queue | count | The number of datagrams dropped because the DogStatsD client queue was full. |
| datadog.dogstatsd.client.packets_dropped_writer | count | The number of datagrams dropped because of an error while writing to Datadog. |
| datadog.dogstatsd.client.metric_dropped_on_receive | count | The number of metrics dropped because the internal receiving channel is full (only when using WithChannelMode()). Starting with version 3.6.0 of the Go client. |

To disable telemetry, use the WithoutTelemetry setting:

statsd, err := statsd.New("127.0.0.1:8125", statsd.WithoutTelemetry())

See DataDog/datadog-go for more information about the client configuration.

Starting with version 2.10.0 of the Java client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | The number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | The number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | The number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | The number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | The number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_sent | count | The number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | The number of datagrams dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_dropped_queue | count | The number of datagrams dropped because the DogStatsD client queue was full. |

To disable telemetry, use the enableTelemetry(false) builder option:

StatsDClient client = new NonBlockingStatsDClientBuilder()
    .hostname("localhost")
    .port(8125)
    .enableTelemetry(false)
    .build();

See DataDog/java-dogstatsd-client for more information about the client configuration.

Starting with version 1.5.0 of the PHP client, telemetry is enabled by default for the BatchedDogStatsd client and disabled by default for the DogStatsd client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | The number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | The number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | The number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | The number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | The number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_sent | count | The number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | The number of datagrams dropped by the DogStatsD client. |

To enable or disable telemetry, use the disable_telemetry argument. Be aware that using telemetry with the DogStatsd client increases network usage significantly. It is advised to use the BatchedDogStatsd client when using telemetry.

To enable it on the DogStatsd client:

use DataDog\DogStatsd;

$statsd = new DogStatsd(
    array('host' => '127.0.0.1',
          'port' => 8125,
          'disable_telemetry' => false,
      )
  );

To disable telemetry on the BatchedDogStatsd client:

use DataDog\BatchedDogStatsd;

$statsd = new BatchedDogStatsd(
    array('host' => '127.0.0.1',
          'port' => 8125,
          'disable_telemetry' => true,
      )
  );

See DataDog/php-datadogstatsd for more information about the client configuration.

Starting with version 5.0.0 of the .NET client.

| Metric name | Metric type | Description |
|-------------|-------------|-------------|
| datadog.dogstatsd.client.metrics | count | Number of metrics sent to the DogStatsD client by your application (before sampling). |
| datadog.dogstatsd.client.events | count | Number of events sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.service_checks | count | Number of service_checks sent to the DogStatsD client by your application. |
| datadog.dogstatsd.client.bytes_sent | count | Number of bytes successfully sent to the Agent. |
| datadog.dogstatsd.client.bytes_dropped | count | Number of bytes dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_sent | count | Number of datagrams successfully sent to the Agent. |
| datadog.dogstatsd.client.packets_dropped | count | Number of datagrams dropped by the DogStatsD client. |
| datadog.dogstatsd.client.packets_dropped_queue | count | Number of datagrams dropped because the DogStatsD client queue was full. |

To disable telemetry, set TelemetryFlushInterval to null:

var dogstatsdConfig = new StatsdConfig
{
    StatsdServerName = "127.0.0.1",
    StatsdPort = 8125,
};

// Disable Telemetry
dogstatsdConfig.Advanced.TelemetryFlushInterval = null;

See DataDog/dogstatsd-csharp-client for more information about the client configuration.
