Temporal Cloud

Overview

Temporal Cloud is a scalable platform for orchestrating complex workflows, enabling developers to focus on building applications without worrying about fault tolerance and consistency.

This integration gathers Temporal Cloud metrics into Datadog, offering insights into system health, workflow efficiency, task execution, and performance bottlenecks.

Setup

Generate a Metrics endpoint URL in Temporal Cloud

  1. To generate a CA certificate and an end-entity certificate, see certificate management.
    • Note: An expired root CA certificate invalidates all downstream certificates. To avoid disruptions to your systems, use certificates with long validity periods.
  2. Log in to Temporal Cloud with an account owner or global admin role.
  3. Go to Settings, and select the Observability tab.
  4. Under the Certificates section, add your root CA certificate (.pem file content) and save it.
    • Note: If an observability endpoint is already set up, you can append your root CA certificate.
  5. Click Save to generate the endpoint URL under the Endpoint section. The URL should look like: https://<account_id>.tmprl.cloud/prometheus.
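Once the endpoint exists, it can be useful to confirm that it answers before connecting it to Datadog. The following is a minimal sketch, not an official client: it assumes the endpoint serves the standard Prometheus HTTP API under `/api/v1/query`, and the helper names, file paths, and account ID are hypothetical.

```python
import json
import ssl
import urllib.parse
import urllib.request


def metrics_endpoint(account_id: str) -> str:
    """Build the endpoint URL in the format shown in step 5."""
    return f"https://{account_id}.tmprl.cloud/prometheus"


def query_url(account_id: str, promql: str) -> str:
    """Build an instant-query URL.

    Assumption: the endpoint exposes the Prometheus HTTP API path /api/v1/query.
    """
    params = urllib.parse.urlencode({"query": promql})
    return f"{metrics_endpoint(account_id)}/api/v1/query?{params}"


def run_query(account_id: str, promql: str, cert_file: str, key_file: str) -> dict:
    """Send one query, presenting the end-entity certificate for mutual TLS."""
    context = ssl.create_default_context()
    # The end-entity certificate must chain to the root CA uploaded in step 4.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    url = query_url(account_id, promql)
    with urllib.request.urlopen(url, context=context, timeout=30) as response:
        return json.load(response)
```

A call such as `run_query("<account_id>", "<metric_name>", "client.pem", "client-key.pem")` should return a JSON document if the certificate is accepted; an authentication or TLS error suggests the root CA and end-entity certificate do not match.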

Connect your Temporal Cloud account to Datadog

  1. Add your Account ID, End-entity Certificate file content, and End-entity Certificate key file content.

    Parameters | Description
    Account ID | Temporal Cloud account ID to be used as part of the metrics endpoint URL: https://<account_id>.tmprl.cloud/prometheus.
    End-entity certificate file content | Content of the end-entity certificate for secure access and communication with the Metrics endpoint.
    End-entity certificate key file content | Content of the end-entity certificate key for secure access and communication with the Metrics endpoint.
  2. Click the Save button to save your settings.
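Before pasting the certificate and key content, it may help to check locally that the two files form a usable pair. This is a sketch under the assumption that both are PEM files on disk; the function name and paths are hypothetical, and the check does not verify that the certificate chains to the root CA uploaded to Temporal Cloud.

```python
import ssl


def pair_is_usable(cert_file: str, key_file: str) -> bool:
    """Return True if the certificate and private key load together.

    load_cert_chain raises if a file is missing, is not valid PEM,
    or if the private key does not match the certificate.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    except OSError:  # ssl.SSLError is a subclass of OSError
        return False
    return True
```

For example, `pair_is_usable("client.pem", "client-key.pem")` returning False points to a missing file or a mismatched key rather than a problem on the Datadog side.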

Data Collected

Metrics

temporal.cloud.v0_frontend_service_error
(count)
Increase in gRPC errors
temporal.cloud.v0_frontend_service_request
(count)
Increase in gRPC requests received
temporal.cloud.v0_poll_success
(count)
Increase in the count of tasks that are successfully matched to a poller
temporal.cloud.v0_poll_success_sync
(count)
Increase in the count of tasks that are successfully sync matched to a poller
temporal.cloud.v0_poll_timeout
(count)
Increase in the count of polls that timed out because no tasks were available for the poller
temporal.cloud.v0_replication_lag_p50
(gauge)
P50 of replication lag, computed from a histogram over a specific time interval, for a multi-region Namespace
Shown as second
temporal.cloud.v0_replication_lag_p90
(gauge)
P90 of replication lag, computed from a histogram over a specific time interval, for a multi-region Namespace
Shown as second
temporal.cloud.v0_replication_lag_p95
(gauge)
P95 of replication lag, computed from a histogram over a specific time interval, for a multi-region Namespace
Shown as second
temporal.cloud.v0_replication_lag_p99
(gauge)
P99 of replication lag, computed from a histogram over a specific time interval, for a multi-region Namespace
Shown as second
temporal.cloud.v0_resource_exhausted_error
(count)
Increase in gRPC requests received that were rate-limited
temporal.cloud.v0_schedule_action_success
(count)
Increase in the count of successful executions of a Scheduled Workflow
temporal.cloud.v0_schedule_buffer_overruns
(count)
Increase in the count of Scheduled Workflow executions for which the average schedule run length is greater than the average schedule interval while a buffer_all overlap policy is configured
temporal.cloud.v0_schedule_missed_catchup_window
(count)
Increase in count of skipped Scheduled executions when Workflows were delayed longer than the catchup window
temporal.cloud.v0_schedule_rate_limited
(count)
Increase in count of Scheduled Workflows that were delayed due to exceeding a rate limit
temporal.cloud.v0_service_latency_p50
(gauge)
P50 latency for SignalWithStartWorkflowExecution, SignalWorkflowExecution, StartWorkflowExecution operations
Shown as second
temporal.cloud.v0_service_latency_p90
(gauge)
P90 latency for SignalWithStartWorkflowExecution, SignalWorkflowExecution, StartWorkflowExecution operations
Shown as second
temporal.cloud.v0_service_latency_p95
(gauge)
P95 latency for SignalWithStartWorkflowExecution, SignalWorkflowExecution, StartWorkflowExecution operations
Shown as second
temporal.cloud.v0_service_latency_p99
(gauge)
P99 latency for SignalWithStartWorkflowExecution, SignalWorkflowExecution, StartWorkflowExecution operations
Shown as second
temporal.cloud.v0_state_transition
(count)
Increase in count of state transitions for each Namespace
temporal.cloud.v0_total_action
(count)
Increase in count of Temporal Cloud Actions
temporal.cloud.v0_workflow_cancel
(count)
Increase in count of Workflows canceled before completing execution
temporal.cloud.v0_workflow_continued_as_new
(count)
Increase in count of Workflow Executions that were Continued-As-New from a past execution
temporal.cloud.v0_workflow_failed
(count)
Increase in count of Workflows that failed before completion
temporal.cloud.v0_workflow_success
(count)
Increase in count of Workflows that successfully completed
temporal.cloud.v0_workflow_terminate
(count)
Increase in count of Workflows terminated before completing execution
temporal.cloud.v0_workflow_timeout
(count)
Increase in count of Workflows that timed out before completing execution
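Because the count metrics above report increases over an interval, two of them can be combined into a ratio. The sketch below is illustrative only and is not part of the integration: it assumes you have already read the increase of temporal.cloud.v0_frontend_service_error and temporal.cloud.v0_frontend_service_request for the same interval, and the function name is hypothetical.

```python
from typing import Optional


def error_ratio(error_increase: float, request_increase: float) -> Optional[float]:
    """Fraction of gRPC requests that ended in an error during one interval.

    Returns None when no requests were received, since the ratio is undefined.
    """
    if request_increase <= 0:
        return None
    return error_increase / request_increase
```

With 5 errors out of 200 requests in an interval, `error_ratio(5, 200)` gives 0.025, or 2.5%.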

Service Checks

The Temporal Cloud integration does not include any service checks.

Events

The Temporal Cloud integration does not include any events.

Support

Need help? Contact Datadog support.