
Data Types and Tags

While StatsD accepts only metrics, DogStatsD accepts all three of the major Datadog data types: metrics, events, and service checks. This section shows typical use cases for each type, and introduces tagging, which is specific to DogStatsD.

Most examples are in Python using the official Datadog Python client, and several are also shown in Ruby. Each data type shown is supported similarly in other DogStatsD client libraries.

Metrics

Counters, gauges, and sets are familiar to StatsD users. Histograms are specific to DogStatsD. Timers, which exist in StatsD, are a subset of histograms in DogStatsD.

Counters

Counters track how many times something happens per second, such as page views. In this example, we increment a metric called web.page_views each time our render_page function is called.

For Python:

def render_page():
    """ Render a web page. """
    statsd.increment('web.page_views')
    return 'Hello World!'

For Ruby:

def render_page()
  # Render a web page.
  statsd.increment('web.page_views')
  return 'Hello World!'
end

With this one line of code we can start graphing the data. Here’s an example:

[Graph: web.page_views over time]

Note that StatsD counters are normalized over the flush interval to report per-second units. In the graph above, the marker is reporting 35.33 web page views per second at ~15:24. In contrast, if one person visited the web page each second, the graph would be a flat line at y = 1. To increment or measure values over time, see gauges.
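
To make the normalization concrete, here is a minimal sketch of how a count collected during one flush interval becomes a per-second value. The 10-second interval and the helper name per_second_rate are assumptions for illustration, not part of any Datadog client:

```python
def per_second_rate(count, flush_interval_seconds=10.0):
    """Normalize a raw count over one flush interval to a per-second rate."""
    if flush_interval_seconds <= 0:
        raise ValueError("flush interval must be positive")
    return count / flush_interval_seconds


# 353 page views counted during one (assumed) 10-second interval.
busy = per_second_rate(353)
# One visit per second for 10 seconds gives the flat line at y = 1.
steady = per_second_rate(10)
```

The same increment calls therefore graph as a larger or smaller value depending on how many of them land in a single interval.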

We can also count by arbitrary numbers. Suppose we wanted to count the number of bytes processed by a file uploading service. We increment a metric called file_service.bytes_uploaded by the size of the file each time our upload_file function is called:

For Python:

def upload_file(file):
    statsd.increment('file_service.bytes_uploaded', file.size())
    save_file(file)
    return 'File uploaded!'

Note that for counters coming from another source that are ever-increasing and never reset (for example, the number of queries from MySQL over time), we track the rate between flushed values. To get raw counts within Datadog, apply a function to your series such as cumulative sum or integral. Read more about Datadog functions.
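
The relationship between an ever-increasing counter, the rate between flushed values, and a cumulative sum can be sketched in plain Python. This only illustrates the arithmetic; the sample values, the 10-second spacing, and both helper names are hypothetical, and it is not how Datadog implements its functions:

```python
from itertools import accumulate


def rates_between(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


def counts_from_rates(rates, interval_seconds):
    """Recover running totals by cumulatively summing rate * interval."""
    return list(accumulate(rate * interval_seconds for rate in rates))


# Hypothetical MySQL query counter sampled every 10 seconds.
samples = [(0, 1000), (10, 1200), (20, 1250)]
rates = rates_between(samples)          # queries per second in each interval
totals = counts_from_rates(rates, 10)   # queries since the first sample
```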

Learn more about the Count type in the Metrics documentation.

Gauges

Gauges measure the value of a particular thing over time. For example, in order to track the amount of free memory on a machine, periodically sample that value as the metric system.mem.free:

For Python:

# Record the amount of free memory every ten seconds.
while True:
    statsd.gauge('system.mem.free', get_free_memory())
    time.sleep(10)

For Ruby:

# Record the amount of free memory every ten seconds.
while true do
    statsd.gauge('system.mem.free', get_free_memory())
    sleep(10)
end

Learn more about the Gauge type in the Metrics documentation.

Histograms

Histograms are specific to DogStatsD. They calculate the statistical distribution of any kind of value, such as the size of files uploaded to your site:

from datadog import statsd

def handle_file(file, file_size):
  # Handle the file...

  statsd.histogram('mywebsite.user_uploads.file_size', file_size)
  return

Histograms can also be used with timing data, for example, the duration of a metrics query:

For Python:

# Track the run time of the database query.
start_time = time.time()
results = db.query()
duration = time.time() - start_time
statsd.histogram('database.query.time', duration)

# We can also use the `timed` decorator as a short-hand for timing functions.
@statsd.timed('database.query.time')
def get_data():
    return db.query()

For Ruby:

start_time = Time.now
results = db.query()
duration = Time.now - start_time
statsd.histogram('database.query.time', duration)

# We can also use the `time` helper as a short-hand for timing blocks
# of code.
statsd.time('database.query.time') do
  db.query()
end

The above instrumentation produces the following metrics:

  • database.query.time.count: number of times this metric was sampled
  • database.query.time.avg: average time of the sampled values
  • database.query.time.median: median sampled value
  • database.query.time.max: maximum sampled value
  • database.query.time.95percentile: 95th percentile sampled value

[Graph: database.query.time median and 95th percentile]

For this toy example, let’s say a query time of 1 second is acceptable. Our median query time (graphed in purple) is usually less than 100 milliseconds, which is great. Unfortunately, our 95th percentile (graphed in blue) has large spikes, sometimes nearing three seconds, which is unacceptable. This means most of our queries are running just fine, but our worst ones are very bad. If the 95th percentile were close to the median, we would know that almost all of our queries are performing just fine.
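
The aggregates listed above can be approximated in a few lines of Python, which shows how a single slow query moves the 95th percentile while the median stays put. This is a sketch only: the summarize helper and the sample data are hypothetical, and the nearest-rank percentile used here is an assumption that may differ from what DogStatsD computes:

```python
import math
import statistics


def summarize(samples):
    """Compute histogram-style aggregates for one batch of samples."""
    ordered = sorted(samples)
    # Nearest-rank 95th percentile (an assumption; the real method may differ).
    rank = max(1, math.ceil(95 * len(ordered) / 100))
    return {
        "count": len(ordered),
        "avg": sum(ordered) / len(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "95percentile": ordered[rank - 1],
    }


# Nine fast queries and one slow one, in seconds.
query_times = [0.05] * 9 + [3.0]
summary = summarize(query_times)
```

Here the median stays at the fast query time while the 95th percentile and the maximum both land on the slow query.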

Learn more about the Histogram type in the Metrics documentation.

Timers

Timers in DogStatsD are an implementation of Histograms (not to be confused with timers in the standard StatsD). They measure timing data only, for example, the amount of time a section of code takes to execute, or how long it takes to fully render a page. In Python, timers are created with a decorator:

from datadog import statsd

@statsd.timed('mywebsite.page_render.time')
def render_page():
  # Render the page...
  pass

or with a context manager:

from datadog import statsd

def render_page():
  # First some stuff we don't want to time
  boilerplate_setup()

  # Now start the timer
  with statsd.timed('mywebsite.page_render.time'):
    # Render the page...
    pass

In either case, as DogStatsD receives the timer data, it calculates the statistical distribution of render times and sends the following metrics to Datadog:

  • mywebsite.page_render.time.count - the number of times the render time was sampled
  • mywebsite.page_render.time.avg - the average render time
  • mywebsite.page_render.time.median - the median render time
  • mywebsite.page_render.time.max - the maximum render time
  • mywebsite.page_render.time.95percentile - the 95th percentile render time

Remember: under the hood, DogStatsD treats timers as histograms. Whether you use timers or histograms, you’ll be sending the same data to Datadog.
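
To see why a timer is just a histogram fed with durations, here is a standard-library sketch of a timing context manager. It is not the Datadog client: the recorded dictionary stands in for whatever would receive the samples, and this timed helper is hypothetical and only mimics the shape of the real one:

```python
import time
from contextlib import contextmanager

# Stand-in for the client: collected durations, keyed by metric name.
recorded = {}


@contextmanager
def timed(metric_name):
    """Time the enclosed block and record the duration as a histogram sample."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raised an exception.
        elapsed = time.perf_counter() - start
        recorded.setdefault(metric_name, []).append(elapsed)


with timed('mywebsite.page_render.time'):
    sum(range(1000))  # placeholder for the rendering work
```

Each use of the context manager appends one duration, and those durations are exactly the kind of samples a histogram aggregates.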

Sets

Sets are used to count the number of unique elements in a group, for example, the number of unique visitors to your site:

For Python:

def login(self, user_id):
    # Log the user in ...
    statsd.set('users.uniques', user_id)

For Ruby:

def login(user_id)
    # Log the user in ...
    statsd.set('users.uniques', user_id)
end
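
Conceptually, a set metric reports how many distinct values were submitted during a flush interval. A minimal sketch of that idea, with a hypothetical helper name and made-up user IDs:

```python
def unique_count(values):
    """Count the distinct values seen during one flush interval."""
    return len(set(values))


# Hypothetical user IDs submitted during one interval; 'alice' logs in twice.
logins = ['alice', 'bob', 'alice', 'carol']
uniques = unique_count(logins)
```

Repeated logins by the same user do not raise the reported value, which is what distinguishes a set from a counter.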

Learn more about the Set type in the Metrics documentation.

Metric option: sample rates

Since the overhead of sending UDP packets can be too great for some performance-intensive code paths, DogStatsD clients support sampling, that is, sending a metric only a percentage of the time. The following code sends a histogram metric only about half of the time:

statsd.histogram('my.histogram', 1, sample_rate=0.5)

Before sending the metric to Datadog, DogStatsD uses the sample_rate to correct the metric value, i.e. to estimate what it would have been without sampling.

Sample rates only work with counter, histogram, and timer metrics.
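
The two halves of sampling, deciding on the client whether to send and scaling the result back up on the server, can be sketched for a counter as follows. Both helper names are hypothetical, and dividing by the sample rate is a simplified picture of the correction rather than the exact server logic:

```python
import random


def should_send(sample_rate, rng=random):
    """Client side: decide whether this call actually sends a packet."""
    return rng.random() < sample_rate


def corrected_count(received, sample_rate):
    """Server side: scale what arrived back up to an estimated total."""
    return received / sample_rate


rng = random.Random(42)  # seeded so the run is repeatable
sent = sum(should_send(0.5, rng) for _ in range(1000))
estimate = corrected_count(sent, 0.5)
```

With a sample rate of 0.5, roughly half of the 1000 calls send a packet, and the estimate doubles whatever arrived, so it lands near the true total.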

Learn more about rates in the Metrics documentation.

Events

DogStatsD can emit events to your Datadog event stream. For example, you may want to see errors and exceptions in Datadog:

from datadog import statsd

def render_page():
  try:
    # Render the page...
    pass
  except RenderError as err:
    # str(err) works on Python 3, where exceptions have no .message attribute.
    statsd.event('Page render error!', str(err), alert_type='error')

Service Checks

DogStatsD can send service checks to Datadog. Use checks to track the status of services your application depends on:

For Python:

from datadog import statsd
from datadog.api.constants import CheckStatus

# Report the status of an app.
name = 'web.app1'
status = CheckStatus.OK
message = 'Response: 200 OK'

statsd.service_check(check_name=name, status=status, message=message)

For Ruby:

# Report the status of an app.
name = 'web.app1'
status = Datadog::Statsd::OK
opts = {
  'message' => 'Response: 200 OK'
}

statsd.service_check(name, status, opts)

After a service check has been reported, you can use it to trigger a custom check monitor.
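
Service check statuses are small integers: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. One way to derive a status from a dependency's HTTP response is sketched below. The mapping policy and the status_for_response helper are hypothetical, so adjust the thresholds to whatever your service considers healthy:

```python
# Status values used by Datadog service checks.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def status_for_response(http_status):
    """Map an HTTP status code to a check status (hypothetical policy)."""
    if http_status is None:
        return UNKNOWN  # no response at all, e.g. a timeout
    if 200 <= http_status < 400:
        return OK
    if 400 <= http_status < 500:
        return WARNING
    return CRITICAL
```

The returned value can then be passed as the status argument of a service check call.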

Tagging

You can add tags to any metric, event, or service check you send to DogStatsD. For example, you could compare the performance of two algorithms by tagging a timer metric with the algorithm version:

@statsd.timed('algorithm.run_time', tags=['algorithm:one'])
def algorithm_one():
    # Do fancy things here ...
    pass

@statsd.timed('algorithm.run_time', tags=['algorithm:two'])
def algorithm_two():
    # Do fancy things (maybe faster?) here ...
    pass

Note that tagging is a Datadog-specific extension to StatsD.
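
On the wire, tags travel as a #-prefixed, comma-separated suffix on the metric datagram. The sketch below builds such a line by hand to show where the tags sit. format_metric is a hypothetical helper, not a client API, and real clients build and send this string for you:

```python
def format_metric(name, value, metric_type, tags=None, sample_rate=None):
    """Build a DogStatsD-style metric datagram as a string."""
    parts = [f"{name}:{value}", metric_type]
    if sample_rate is not None:
        parts.append(f"@{sample_rate}")
    if tags:
        parts.append("#" + ",".join(tags))
    return "|".join(parts)


# 'h' marks a histogram; the tag distinguishes the two algorithms.
line = format_metric('algorithm.run_time', 0.25, 'h', tags=['algorithm:one'])
```

A plain StatsD server has no notion of the trailing # section, which is why tagging is described as an extension.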
