Datadog-New Relic Integration

Overview

Connect to New Relic to:

  • See key New Relic metrics (like response time and Apdex score) in context with the rest of your Datadog metrics
    (only works with New Relic Pro accounts and above)
  • See New Relic alerts in your event stream

Setup

Installation

New Relic Alerts in Event Stream

  1. On the Webhook tab of New Relic’s alerting notification settings page, enter the following webhook URL: https://app.datadoghq.com/intake/webhook/newrelic?api_key={YOUR_DATADOG_API_KEY}

  2. Under ‘Custom Payload’, select JSON as the ‘Payload Type’.
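As a minimal sketch of step 1, the webhook URL can be assembled programmatically before pasting it into New Relic. The helper name `build_webhook_url` and the placeholder key are assumptions for illustration only; the base URL and the `api_key` query parameter come from the step above.

```python
from urllib.parse import urlencode

# Intake endpoint as given in step 1 above.
WEBHOOK_BASE = "https://app.datadoghq.com/intake/webhook/newrelic"


def build_webhook_url(api_key: str) -> str:
    """Return the webhook URL with the API key as a URL-encoded query parameter.

    Hypothetical helper, not part of any Datadog or New Relic library.
    """
    if not api_key:
        raise ValueError("api_key must not be empty")
    return f"{WEBHOOK_BASE}?{urlencode({'api_key': api_key})}"


if __name__ == "__main__":
    # "abc123" is a placeholder, not a real Datadog API key.
    print(build_webhook_url("abc123"))
```

URL-encoding the key is a defensive choice in case it contains characters that are not safe in a query string.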

New Relic APM Metric Collection

  1. Locate your REST API key on New Relic’s API Keys page (Account Settings -> Integrations -> API Keys) and enter it in the form on the Datadog New Relic Integration page.

  2. To tag all metrics with your New Relic account number, add an account tag.

  3. Choose whether you want to collect your metrics per host or app-wide. Note: Enabling the per-host option imports New Relic hosts into Datadog.

New Relic custom metrics may take 5-10 minutes to show up in Datadog.

Data Collected

Metrics

new_relic.apdex.score
(gauge)
Ratio of satisfactory response times to unsatisfactory response times
shown as apdex
new_relic.application_summary.apdex_score
(gauge)
Ratio of satisfactory response times to unsatisfactory response times
shown as apdex
new_relic.application_summary.apdex_target
(gauge)
Threshold ratio of satisfactory response times to unsatisfactory response times
shown as apdex
new_relic.application_summary.concurrent_instance_count
(gauge)
Number of concurrent instances serving the application
shown as instance
new_relic.application_summary.error_rate
(gauge)
Ratio of the number of errors reported by the application to the total number of served requests
shown as percent
new_relic.application_summary.host_count
(gauge)
Number of hosts serving the application
shown as host
new_relic.application_summary.instance_count
(gauge)
Number of instances serving the application
shown as instance
new_relic.application_summary.response_time
(gauge)
Application response time
shown as millisecond
new_relic.application_summary.throughput
(gauge)
Number of requests served by the application
shown as request
new_relic.database.all.average_exclusive_time
(gauge)
Average time spent in database queries, exclusive of any time instrumented by other metrics
shown as millisecond
new_relic.errors.all.errors_per_minute
(gauge)
Number of errors reported by the application
shown as error
new_relic.web_transaction.average_response_time
(gauge)
Average response time of web transactions served by the application
shown as second
new_relic.web_transaction.requests_per_minute
(gauge)
Number of web transaction requests served by the application
shown as request
new_relic.synthetics_summary.monitors.count
(gauge)
Count of monitors
shown as monitor
new_relic.synthetics_summary.monitors.frequency
(gauge)
Frequency of the monitor
shown as minute
new_relic.synthetics_summary.locations.count
(gauge)
Count of locations associated with the monitor
shown as location
new_relic.synthetic_check.duration
(gauge)
Total time for the monitor run
shown as millisecond
new_relic.synthetic_check.total_request_body_size
(gauge)
Size of the request body sent to the server
shown as byte
new_relic.synthetic_check.total_request_header_size
(gauge)
Size of the request header sent to the server
shown as byte
new_relic.synthetic_check.total_response_header_size
(gauge)
Size of the response header returned by the server
shown as byte
new_relic.synthetic_check.total_response_body_size
(gauge)
Size of the response body returned by the server
shown as byte
new_relic.synthetic_check.count
(count)
Count of monitor runs
shown as check
new_relic.synthetic_check.errors
(count)
Count of monitor failures
shown as error
new_relic.synthetic_request.count
(count)
Count of requests
shown as request
new_relic.synthetic_request.duration_blocked.average
(gauge)
Average time the requests were blocked
shown as millisecond
new_relic.synthetic_request.duration_connect.average
(gauge)
Average time the requests were establishing a connection
shown as millisecond
new_relic.synthetic_request.duration_dns.average
(gauge)
Average time the requests were resolving DNS
shown as millisecond
new_relic.synthetic_request.duration_receive.average
(gauge)
Average time the requests were receiving data
shown as millisecond
new_relic.synthetic_request.duration_send.average
(gauge)
Average time the requests were sending data
shown as millisecond
new_relic.synthetic_request.duration_ssl.average
(gauge)
Average time establishing an SSL connection
shown as millisecond
new_relic.synthetic_request.duration_wait.average
(gauge)
Average time the requests were waiting
shown as millisecond
new_relic.synthetic_request.resources_load_time
(gauge)
Average resources load time
shown as millisecond
new_relic.synthetic_request.time_spent_third_parties
(gauge)
Average time spent by third parties
shown as millisecond

Troubleshooting

What does the ‘Collect metrics by host’ option do?

When set, Datadog collects application metrics for each associated host, instead of the overall average weighted by host throughput.

This makes sense when you use those metrics separately, e.g. “host X has aberrant error rate Y, which is problematic, though application Z has an acceptable error rate in aggregate across many hosts”.

This also imports New Relic hosts into the Datadog Infrastructure section.

I have the ‘Collect metrics by host’ option enabled. Why do my application-level metrics have different values in New Relic and Datadog?

When New Relic computes the aggregate application-level value for metrics that are measured at the host level (e.g. response time), it computes a weighted average based on each host’s measured throughput.

The closest thing you’ll see in Datadog is the avg aggregator, which computes the arithmetic mean of the values. This is also the default aggregator, and what you’ll get for the simplest query, something like new_relic.web_transaction.average_response_time{*}. If your hosts all have approximately the same throughput, Datadog’s average aggregation and New Relic’s throughput-weighted aggregation yield similar numbers. If throughput is not balanced, you will see different aggregate application-level values in New Relic and Datadog.

For example, say you have an application with three hosts. At a specific point in time, the hosts have the following values:

           throughput    response time
hostA         305 rpm           240 ms
hostB         310 rpm           250 ms
hostC          30 rpm            50 ms

New Relic would compute the application-level response time as follows:

hostA: 240 ms * 305 rpm = 73200 total time
hostB: 250 ms * 310 rpm = 77500 total time
hostC:  50 ms *  30 rpm =  1500 total time

total throughput = 305 + 310 + 30 = 645 rpm
average response time = (73200 + 77500 + 1500) / 645 = 236.0 ms

Whereas we would simply compute the arithmetic mean:

average response time = (240 + 250 + 50) / 3 = 180.0 ms
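The same comparison can be reproduced with a short Python sketch. The host values are taken from the example table above; the function names `weighted_mean` and `arithmetic_mean` are illustrative and not part of either product.

```python
# (throughput in rpm, response time in ms) per host, from the example table.
hosts = {
    "hostA": (305, 240),
    "hostB": (310, 250),
    "hostC": (30, 50),
}


def weighted_mean(samples):
    """Throughput-weighted average response time (the New Relic style aggregate)."""
    total_throughput = sum(rpm for rpm, _ in samples.values())
    total_time = sum(rpm * ms for rpm, ms in samples.values())
    return total_time / total_throughput


def arithmetic_mean(samples):
    """Plain average of per-host response times (what the avg aggregator computes)."""
    return sum(ms for _, ms in samples.values()) / len(samples)


print(f"weighted:   {weighted_mean(hosts):.1f} ms")    # weighted:   236.0 ms
print(f"arithmetic: {arithmetic_mean(hosts):.1f} ms")  # arithmetic: 180.0 ms
```

The gap comes from hostC: it serves little traffic, so it barely moves the weighted average, but it counts as a full third of the arithmetic mean.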

Beta Alerts: How can I include custom tags?

You can include custom tags by using the “Use Custom Payload” option in New Relic’s Beta Alerts feature. To configure this, navigate to your New Relic account and click the ‘Alerts Beta’ button in the upper right-hand corner of the screen. Then select the ‘Notification channels’ section and find the Webhook you’ve set up for Datadog. There should be a section called ‘Use Custom Payload’; once selected, it expands to reveal a JSON payload. Modify this payload by adding a “tags” attribute. For example, a modified payload might look like this:

{
  "account_id": "$ACCOUNT_ID",
  "account_name": "$ACCOUNT_NAME",
  "condition_id": "$CONDITION_ID",
  "condition_name": "$CONDITION_NAME",
  "current_state": "$EVENT_STATE",
  "details": "$EVENT_DETAILS",
  "event_type": "$EVENT_TYPE",
  "incident_acknowledge_url": "$INCIDENT_ACKNOWLEDGE_URL",
  "incident_id": "$INCIDENT_ID",
  "incident_url": "$INCIDENT_URL",
  "owner": "$EVENT_OWNER",
  "policy_name": "$POLICY_NAME",
  "policy_url": "$POLICY_URL",
  "runbook_url": "$RUNBOOK_URL",
  "severity": "$SEVERITY",
  "targets": "$TARGETS",
  "timestamp": "$TIMESTAMP",
  "tags": ["application:yourapplication", "host:yourhostname", "sometag"]
}
After your modifications are complete, select ‘Update Channel’ to save your changes.
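Because the payload is edited by hand, it can help to check that the result is still valid JSON before pasting it back into New Relic. The sketch below is one way to do that; the helper `with_tags` is hypothetical, and the shortened payload is only a stand-in for the full default payload shown above.

```python
import json
from typing import List


def with_tags(payload_json: str, tags: List[str]) -> str:
    """Parse a payload, add a "tags" attribute, and return formatted JSON.

    Hypothetical helper for local checking; json.loads raises ValueError
    if the input is not valid JSON.
    """
    payload = json.loads(payload_json)
    payload["tags"] = list(tags)
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Shortened stand-in for the default payload; the $VARIABLES are
    # left untouched for New Relic to fill in.
    default_payload = '{"account_id": "$ACCOUNT_ID", "severity": "$SEVERITY"}'
    print(with_tags(default_payload, ["application:yourapplication", "host:yourhostname"]))
```

The printed output can then be pasted into the ‘Use Custom Payload’ field.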