Operations Monitoring

Join the Preview!

Operations Monitoring is in Preview.

Overview

Operations tab under RUM > Performance Monitoring

In Datadog RUM, a feature represents a major user-facing area of your application like checkout, login, or search. Each feature includes operations, which are the critical technical steps that make the experience work.

  • Business teams use features to track and improve user conversion.
  • Engineering teams use operations to monitor and minimize technical failures that impact key user moments.

For example, the checkout experience of an e-commerce platform is a feature. Within it, operations might include entering payment details, saving a payment method, and completing a purchase. After you instrument your application with the SDK, Datadog RUM measures each operation's performance, including execution volume, completion rate, and failure rate. Measuring the health of your operations lets you identify exactly when and why users do not convert in your feature.

The following table shows additional example features and their associated feature operations by industry.

Industry        Feature   Feature Operations
Social network  Profile   Users can load their profile
                          Users can upload a picture
                          Users can update their status
E-Commerce      Checkout  Users can enter payment details
                          Users can save their payment method
                          Users can pay
Streaming       Search    Users can find results for their search
                          Users can load the description of a title
                          Users can start watching the trailer
CRM             Quote     Users can start a new quote
                          Users can add line items to the quote
                          Users can send a quote to recipients

Prerequisites

Setup

Use the SDK APIs to define your operations.

Start an operation

Start every operation by calling startFeatureOperation.

Browser:

DD_RUM.init({
    ...,
    enableExperimentalFeatures: ["feature_operation_vital"], // this flag needs to be enabled for the API to work
})

startFeatureOperation: (
    name: string,
    options?: {
        operationKey?: string,
        context?: Context,
        description?: string,
    }
) => void

Android:

GlobalRumMonitor.get().startFeatureOperation(
    name: String,
    operationKey: String?,
    attributes: Map<String, Any?>
)

iOS:

RUMMonitor.shared().startFeatureOperation(
    name: String,
    operationKey: String?,
    attributes: [AttributeKey: AttributeValue]?
)

Stop an operation with success

Every started operation must be stopped. Use succeedFeatureOperation to stop an operation with a successful outcome.

Browser:

DD_RUM.init({
    ...,
    enableExperimentalFeatures: ["feature_operation_vital"], // this flag needs to be enabled for the API to work
})

succeedFeatureOperation: (
    name: string,
    options?: {
        operationKey?: string,
        context?: Context,
        description?: string,
    }
) => void

Android:

GlobalRumMonitor.get().succeedFeatureOperation(
    name: String,
    operationKey: String?,
    attributes: Map<String, Any?>
)

iOS:

RUMMonitor.shared().succeedFeatureOperation(
    name: String,
    operationKey: String?,
    attributes: [AttributeKey: AttributeValue]?
)

Stop an operation with failure

Every started operation must be stopped. Use failFeatureOperation to stop an operation with a failure outcome.

Browser:

DD_RUM.init({
    ...,
    enableExperimentalFeatures: ["feature_operation_vital"], // this flag needs to be enabled for the API to work
})

failFeatureOperation: (
    name: string,
    failureReason: FailureReason, // 'error' | 'abandoned' | 'timeout' | 'other'
    options?: {
        operationKey?: string,
        context?: Context,
        description?: string,
    }
) => void

Android:

GlobalRumMonitor.get().failFeatureOperation(
    name: String,
    operationKey: String?,
    reason: RUMFeatureOperationFailureReason, // .error, .abandoned, .timeout, .other
    attributes: Map<String, Any?>
)

iOS:

RUMMonitor.shared().failFeatureOperation(
    name: String,
    operationKey: String?,
    reason: RUMFeatureOperationFailureReason, // .error, .abandoned, .timeout, .other
    attributes: [AttributeKey: AttributeValue]
)
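
The start and stop calls above are meant to be paired. As a minimal sketch of that pairing, the TypeScript below wraps a piece of work so that it always ends in either a success or a failure. The types mirror the Browser signatures shown above, with Context approximated as a plain record. The trackOperation helper, the fakeRum stand-in, and the operation names are hypothetical and exist only so the sketch runs without a browser; in an application you would pass an object backed by the real SDK instead.

```typescript
// Shapes copied from the Browser signatures above; `Context` is approximated here.
type FailureReason = "error" | "abandoned" | "timeout" | "other";
interface OperationOptions {
  operationKey?: string;
  context?: Record<string, unknown>;
  description?: string;
}
interface OperationApi {
  startFeatureOperation(name: string, options?: OperationOptions): void;
  succeedFeatureOperation(name: string, options?: OperationOptions): void;
  failFeatureOperation(name: string, failureReason: FailureReason, options?: OperationOptions): void;
}

// Hypothetical helper (not part of the SDK): runs `task` and reports its outcome.
function trackOperation<T>(
  rum: OperationApi,
  name: string,
  task: () => T,
  options?: OperationOptions
): T {
  rum.startFeatureOperation(name, options);
  try {
    const result = task();
    rum.succeedFeatureOperation(name, options);
    return result;
  } catch (err) {
    // Any thrown error is reported with the 'error' failure reason, then rethrown.
    rum.failFeatureOperation(name, "error", options);
    throw err;
  }
}

// In-memory stand-in for the SDK that records each call.
const calls: string[] = [];
const fakeRum: OperationApi = {
  startFeatureOperation: (name) => { calls.push(`start:${name}`); },
  succeedFeatureOperation: (name) => { calls.push(`succeed:${name}`); },
  failFeatureOperation: (name, reason) => { calls.push(`fail:${name}:${reason}`); },
};

trackOperation(fakeRum, "save_payment_method", () => "saved");
try {
  trackOperation(fakeRum, "complete_purchase", () => {
    throw new Error("card declined");
  });
} catch {
  // The caller still sees the original error after the failure is reported.
}
console.log(calls);
```

The sketch is synchronous to stay short. For promise-based work, the same structure applies with the stop calls placed after the awaited result.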

Parallelization

Users may start several feature operations in parallel. To track each one individually, pass an operationKey when calling startFeatureOperation. You must then reuse the same operationKey in the other APIs for that operation, for example when calling succeedFeatureOperation.
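
As an illustration of keyed operations, the TypeScript sketch below starts two operations with the same name and stops them independently, in a different order and with different outcomes. The types follow the Browser signatures shown earlier. The fakeRum object is an in-memory stand-in for the SDK, and the operation name upload_picture and the keys photo-1 and photo-2 are made up for the example.

```typescript
type FailureReason = "error" | "abandoned" | "timeout" | "other";
interface OperationOptions {
  operationKey?: string;
}
interface OperationApi {
  startFeatureOperation(name: string, options?: OperationOptions): void;
  succeedFeatureOperation(name: string, options?: OperationOptions): void;
  failFeatureOperation(name: string, failureReason: FailureReason, options?: OperationOptions): void;
}

// In-memory stand-in for the SDK: records the name and key of every call.
const keyedCalls: string[] = [];
const fakeRum: OperationApi = {
  startFeatureOperation: (name, options) => {
    keyedCalls.push(`start ${name} [${options?.operationKey}]`);
  },
  succeedFeatureOperation: (name, options) => {
    keyedCalls.push(`succeed ${name} [${options?.operationKey}]`);
  },
  failFeatureOperation: (name, reason, options) => {
    keyedCalls.push(`fail(${reason}) ${name} [${options?.operationKey}]`);
  },
};

const operationName = "upload_picture";

// Two uploads start back to back, each under its own key.
for (const key of ["photo-1", "photo-2"]) {
  fakeRum.startFeatureOperation(operationName, { operationKey: key });
}

// They finish in a different order; the key ties each stop to its own start.
fakeRum.failFeatureOperation(operationName, "error", { operationKey: "photo-2" });
fakeRum.succeedFeatureOperation(operationName, { operationKey: "photo-1" });

console.log(keyedCalls);
```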

Operations that have been started but not explicitly stopped are automatically terminated when the RUM session expires. These operations are marked as failed, with @operation.failure_reason:timeout.

If a stop API is called for an operation that was never started, the stop event emitted by the SDK is dropped at ingestion.
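
Because unmatched stop events are dropped only at ingestion, it can help to catch them on the client during development. The TypeScript below is one possible sketch of such a check, not an SDK feature: createOperationGuard is a hypothetical wrapper that remembers which name and key pairs were started and skips stop calls that have no matching start. It knows nothing about session expiry, so it does not replace the automatic timeout described above.

```typescript
type FailureReason = "error" | "abandoned" | "timeout" | "other";
interface OperationOptions {
  operationKey?: string;
}
interface OperationApi {
  startFeatureOperation(name: string, options?: OperationOptions): void;
  succeedFeatureOperation(name: string, options?: OperationOptions): void;
  failFeatureOperation(name: string, failureReason: FailureReason, options?: OperationOptions): void;
}

// Hypothetical wrapper: forwards stop calls only for operations it saw start.
function createOperationGuard(rum: OperationApi) {
  const open = new Set<string>();
  const id = (name: string, key?: string): string => JSON.stringify([name, key ?? null]);
  const toOptions = (key?: string): OperationOptions | undefined =>
    key === undefined ? undefined : { operationKey: key };

  return {
    start(name: string, key?: string): void {
      open.add(id(name, key));
      rum.startFeatureOperation(name, toOptions(key));
    },
    // Returns false, without calling the SDK, when there is no matching start.
    succeed(name: string, key?: string): boolean {
      if (!open.delete(id(name, key))) return false;
      rum.succeedFeatureOperation(name, toOptions(key));
      return true;
    },
    fail(name: string, reason: FailureReason, key?: string): boolean {
      if (!open.delete(id(name, key))) return false;
      rum.failFeatureOperation(name, reason, toOptions(key));
      return true;
    },
    // Number of operations started through the guard and not yet stopped.
    pending(): number {
      return open.size;
    },
  };
}

// In-memory stand-in for the SDK that records forwarded calls.
const forwarded: string[] = [];
const guard = createOperationGuard({
  startFeatureOperation: (name) => { forwarded.push(`start:${name}`); },
  succeedFeatureOperation: (name) => { forwarded.push(`succeed:${name}`); },
  failFeatureOperation: (name, reason) => { forwarded.push(`fail:${name}:${reason}`); },
});

guard.start("load_profile");
const firstStop = guard.succeed("load_profile");          // matching start exists
const duplicateStop = guard.succeed("load_profile");      // already stopped
const orphanStop = guard.fail("update_status", "error");  // never started
console.log(firstStop, duplicateStop, orphanStop, forwarded);
```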

Monitor your availability on Datadog

After you’ve configured the SDK APIs, you can monitor your operations by navigating to RUM > Performance Monitoring > Operations.

Datadog groups together all operations with the same name into a catalog.

Each operation has two out-of-the-box metrics computed over your full, ingested, unsampled traffic:

  • rum.measure.operation, which counts the volume of operations reported to Datadog
  • rum.measure.operation.duration, which measures the elapsed time between the start and end of all the operations reported to Datadog

Both metrics are retained for 15 months, and include several dimensions:

  • operation.name, which is defined on the client side
  • operation.status, which is either success or failure
  • operation.failure_reason, which is one of error, abandoned, timeout, or other
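
To show how these dimensions combine, the TypeScript sketch below takes operation counts grouped by name, status, and failure reason, and derives a success rate and a failure breakdown per operation. It is only an illustration: the row shape is hypothetical, the numbers are made-up sample values, and computing the rate as successes divided by total is an assumption here, not a documented Datadog formula.

```typescript
type FailureReason = "error" | "abandoned" | "timeout" | "other";

// One row per combination of the three dimensions, with its operation count.
interface OperationCount {
  name: string;
  status: "success" | "failure";
  failureReason?: FailureReason;
  count: number;
}

interface OperationSummary {
  total: number;
  succeeded: number;
  failuresByReason: Partial<Record<FailureReason, number>>;
}

function summarize(rows: OperationCount[]): Map<string, OperationSummary> {
  const summaries = new Map<string, OperationSummary>();
  for (const row of rows) {
    const summary: OperationSummary =
      summaries.get(row.name) ?? { total: 0, succeeded: 0, failuresByReason: {} };
    summary.total += row.count;
    if (row.status === "success") {
      summary.succeeded += row.count;
    } else {
      // Failures without a reason are grouped under 'other' in this sketch.
      const reason = row.failureReason ?? "other";
      summary.failuresByReason[reason] = (summary.failuresByReason[reason] ?? 0) + row.count;
    }
    summaries.set(row.name, summary);
  }
  return summaries;
}

// Assumed definition: share of reported operations that ended in success.
function successRate(summary: OperationSummary): number {
  return summary.total === 0 ? 0 : summary.succeeded / summary.total;
}

// Made-up sample counts for a single operation.
const sampleRows: OperationCount[] = [
  { name: "complete_purchase", status: "success", count: 75 },
  { name: "complete_purchase", status: "failure", failureReason: "error", count: 15 },
  { name: "complete_purchase", status: "failure", failureReason: "timeout", count: 10 },
];

const purchase = summarize(sampleRows).get("complete_purchase");
if (purchase !== undefined) {
  console.log(purchase.total, successRate(purchase), purchase.failuresByReason);
}
```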

These metrics are included in the price of RUM Measure and are available to all RUM without Limits customers that define one or more operations.

Configure retention filters

Operations are a new type of event in RUM. Operations are bound to a RUM Session, but can span multiple RUM Views. Operations can be targeted in retention filters, which lets you align your retention strategy with the features that are cornerstones of your user experience. For example, you can programmatically keep RUM Sessions in which specific operations failed or took longer than desired.

As with the metrics, these events come with specific attributes you can use in retention filters:

  • @operation.name
  • @operation.status
  • @operation.failure_reason
  • @operation.duration
  • @operation.start_view.name
  • @operation.end_view.name
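
The attributes above use the same attribute:value form as the @operation.failure_reason:timeout example earlier on this page. As a rough sketch, the TypeScript below assembles a filter query string from a few of them. The operationQuery helper and the sample values are hypothetical, and @operation.duration is left out because its unit is not stated here. Terms are joined with spaces, which Datadog search treats as an implicit AND.

```typescript
// Attribute suffixes taken from the list above (duration intentionally omitted).
type OperationAttribute =
  | "name"
  | "status"
  | "failure_reason"
  | "start_view.name"
  | "end_view.name";

// Hypothetical helper: builds a space-separated query from attribute values.
function operationQuery(terms: Partial<Record<OperationAttribute, string>>): string {
  const parts: string[] = [];
  for (const [attribute, value] of Object.entries(terms)) {
    if (value === undefined) continue;
    // Quote values containing anything beyond word characters, dots, or hyphens.
    // Escaping of embedded quotes is not handled in this sketch.
    const rendered = /^[\w.-]+$/.test(value) ? value : `"${value}"`;
    parts.push(`@operation.${attribute}:${rendered}`);
  }
  return parts.join(" ");
}

// Sessions where a made-up "complete_purchase" operation failed.
const failedPurchases = operationQuery({ name: "complete_purchase", status: "failure" });

// Operations that started on a made-up view name and timed out.
const checkoutTimeouts = operationQuery({
  "start_view.name": "/checkout/payment",
  failure_reason: "timeout",
});

console.log(failedPurchases);
console.log(checkoutTimeouts);
```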
