Temporal

Supported OS: Linux, Windows, macOS

Integration version: 2.2.0

Overview

This check monitors Temporal through the Datadog Agent.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Temporal check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. Configure your Temporal services to expose metrics through a Prometheus endpoint by following the official Temporal documentation.

  2. Edit the temporal.d/conf.yaml file located in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your Temporal performance data.

To get started, configure the openmetrics_endpoint option to match the listenAddress and handlerPath options from your Temporal server configuration.
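
As a minimal sketch of that mapping, suppose the Temporal server exposes metrics on port 8000 at /metrics. Both values, the localhost host name, and the exact nesting of the server keys are assumptions for illustration; check them against your own Temporal configuration:

```yaml
# Temporal server configuration (excerpt). Address and path are assumed values.
global:
  metrics:
    prometheus:
      listenAddress: "0.0.0.0:8000"
      handlerPath: "/metrics"
---
# temporal.d/conf.yaml on the Agent host. Assumes the Agent runs on the same
# host as the Temporal service, so the endpoint is reachable through localhost.
init_config:

instances:
    # Port comes from listenAddress, path comes from handlerPath.
  - openmetrics_endpoint: http://localhost:8000/metrics
```

The two documents live in separate files; they are shown together here only to make the correspondence between the options visible.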

Note that when Temporal services in a cluster are deployed independently, every service exposes its own metrics. As a result, you need to configure the Prometheus endpoint for every service that you want to monitor and define a separate instance in the integration’s configuration for each of them.
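
For example, a temporal.d/conf.yaml that scrapes several independently deployed services could be sketched as follows. The host names, ports, and tag values are hypothetical placeholders, not values from the Temporal or Datadog documentation:

```yaml
init_config:

instances:
    # One instance per Temporal service. Each service exposes its own endpoint.
  - openmetrics_endpoint: http://temporal-frontend.example.internal:8000/metrics
    tags:
      - temporal_service:frontend   # hypothetical tag to tell services apart
  - openmetrics_endpoint: http://temporal-history.example.internal:8000/metrics
    tags:
      - temporal_service:history
  - openmetrics_endpoint: http://temporal-matching.example.internal:8000/metrics
    tags:
      - temporal_service:matching
```

Tagging each instance is optional, but it makes it easier to filter the resulting metrics by the service that emitted them.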

See the sample temporal.d/conf.yaml for all available configuration options.

Log collection

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Configure your Temporal Cluster to output logs to a file by following the official documentation.

  3. Uncomment and edit the logs configuration block in your temporal.d/conf.yaml file, and set the path to point to the file you configured on your Temporal Cluster:

    logs:
      - type: file
        path: /var/log/temporal/temporal-server.log
        source: temporal

  4. Restart the Agent.

Validation

Run the Agent’s status subcommand and look for temporal under the Checks section.

Data Collected

Metrics

temporal.server.accept_workflow_update_message.count
(count)
temporal.server.ack_level.update.count
(count)
temporal.server.ack_level.update.failed.count
(count)
temporal.server.acquire_lock_failed.count
(count)
temporal.server.acquire_shards.count
(count)
temporal.server.acquire_shards.latency.bucket
(count)
temporal.server.acquire_shards.latency.count
(count)
temporal.server.acquire_shards.latency.sum
(count)

Shown as millisecond
temporal.server.action.count
(count)
temporal.server.activity.eager_execution.count
(count)
temporal.server.activity.end_to_end_latency.bucket
(count)
temporal.server.activity.end_to_end_latency.count
(count)
temporal.server.activity.end_to_end_latency.sum
(count)

Shown as millisecond
temporal.server.activity_info.bucket
(count)
temporal.server.activity_info.count
(count)
temporal.server.activity_info.size.bucket
(count)
temporal.server.activity_info.size.count
(count)
temporal.server.activity_info.size.sum
(count)

Shown as byte
temporal.server.activity_info.sum
(count)
temporal.server.add_search_attributes.failures.count
(count)
temporal.server.add_search_attributes.workflow_failure.count
(count)
temporal.server.add_search_attributes.workflow_success.count
(count)
temporal.server.archival.task_invalid_uri.count
(count)
temporal.server.archiver.archive.latency.bucket
(count)
temporal.server.archiver.archive.latency.count
(count)
temporal.server.archiver.archive.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.archive.target_latency.bucket
(count)
temporal.server.archiver.archive.target_latency.count
(count)
temporal.server.archiver.archive.target_latency.sum
(count)

Shown as millisecond
temporal.server.archiver.backlog.size.count
(count)
temporal.server.archiver.client.history.inline_archive.attempt.count
(count)
temporal.server.archiver.client.history.inline_archive.failure.count
(count)
temporal.server.archiver.client.history.request.count
(count)
temporal.server.archiver.client.send_signal_error.count
(count)
temporal.server.archiver.client.sent_signal.count
(count)
temporal.server.archiver.client.visibility.inline_archive_attempt.count
(count)
temporal.server.archiver.client.visibility.inline_archive_failure.count
(count)
temporal.server.archiver.client.visibility.request.count
(count)
temporal.server.archiver.coroutine.started.count
(count)
temporal.server.archiver.coroutine.stopped.count
(count)
temporal.server.archiver.delete.failed_all_retries.count
(count)
temporal.server.archiver.delete.success.count
(count)
temporal.server.archiver.delete.with_retries.latency.bucket
(count)
temporal.server.archiver.delete.with_retries.latency.count
(count)
temporal.server.archiver.delete.with_retries.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_all_requests_latency.bucket
(count)
temporal.server.archiver.handle_all_requests_latency.count
(count)
temporal.server.archiver.handle_all_requests_latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_history.request.latency.bucket
(count)
temporal.server.archiver.handle_history.request.latency.count
(count)
temporal.server.archiver.handle_history.request.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_visibility.failed_all_retries.count
(count)
temporal.server.archiver.handle_visibility.request.latency.bucket
(count)
temporal.server.archiver.handle_visibility.request.latency.count
(count)
temporal.server.archiver.handle_visibility.request.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_visibility.success.count
(count)
temporal.server.archiver.non_retryable_error.count
(count)
temporal.server.archiver.num_handled_requests.count
(count)
temporal.server.archiver.num_pumped_requests.count
(count)
temporal.server.archiver.pump.signal_channel_closed.count
(count)
temporal.server.archiver.pump.signal_threshold.count
(count)
temporal.server.archiver.pump.timeout.count
(count)
temporal.server.archiver.pump.timeout_without_signals.count
(count)
temporal.server.archiver.pumped_not_equal_handled.count
(count)
temporal.server.archiver.started.count
(count)
temporal.server.archiver.stopped.count
(count)
temporal.server.archiver.upload.failed_all_retries.count
(count)
temporal.server.archiver.upload.success.count
(count)
temporal.server.archiver.upload.with_retries.latency.bucket
(count)
temporal.server.archiver.upload.with_retries.latency.count
(count)
temporal.server.archiver.upload.with_retries.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.workflow.started.count
(count)
temporal.server.archiver.workflow.stopping.count
(count)
temporal.server.asyncmatch.latency.bucket
(count)
Distribution of latencies from creation to delivery for async matched tasks.
temporal.server.asyncmatch.latency.count
(count)
Count of latencies from creation to delivery for async matched tasks.
temporal.server.asyncmatch.latency.sum
(count)
Sum of latencies from creation to delivery for async matched tasks.
Shown as millisecond
temporal.server.auto_reset_point.corruption.count
(count)
temporal.server.auto_reset_points.exceed_limit.count
(count)
temporal.server.batcher.operation_errors.count
(count)
temporal.server.batcher.processor_errors.count
(count)
temporal.server.batcher.processor_requests.count
(count)
temporal.server.buffer_replication_tasks.bucket
(count)
temporal.server.buffer_replication_tasks.count
(count)
temporal.server.buffer_replication_tasks.sum
(count)

Shown as millisecond
temporal.server.buffer_throttle.count
(count)
temporal.server.buffered_events.bucket
(count)
temporal.server.buffered_events.count
(count)
temporal.server.buffered_events.size.bucket
(count)
temporal.server.buffered_events.size.count
(count)
temporal.server.buffered_events.size.sum
(count)

Shown as byte
temporal.server.buffered_events.sum
(count)
temporal.server.cache.errors.count
(count)
temporal.server.cache.latency.bucket
(count)
temporal.server.cache.latency.count
(count)
temporal.server.cache.latency.sum
(count)

Shown as millisecond
temporal.server.cache.miss.count
(count)
temporal.server.cache.requests.count
(count)
temporal.server.cancel_activity_command.count
(count)
temporal.server.cancel_external_workflow_command.count
(count)
temporal.server.cancel_timer_command.count
(count)
temporal.server.cancel_workflow_command.count
(count)
temporal.server.catchup.ready_shard_count
(gauge)
temporal.server.certificates.expired
(gauge)
temporal.server.certificates.expiring
(gauge)
temporal.server.child_info.bucket
(count)
temporal.server.child_info.count
(count)
temporal.server.child_info.size.bucket
(count)
temporal.server.child_info.size.count
(count)
temporal.server.child_info.size.sum
(count)

Shown as byte
temporal.server.child_info.sum
(count)
temporal.server.child_workflow_command.count
(count)
temporal.server.client.errors.count
(count)
An indicator for connection issues between different Server roles.
temporal.server.client.latency.bucket
(count)
temporal.server.client.latency.count
(count)
temporal.server.client.latency.sum
(count)

Shown as millisecond
temporal.server.client.redirection.errors.count
(count)
temporal.server.client.redirection.latency.bucket
(count)
temporal.server.client.redirection.latency.count
(count)
temporal.server.client.redirection.latency.sum
(count)

Shown as millisecond
temporal.server.client.redirection.requests.count
(count)
temporal.server.client.requests.count
(count)
temporal.server.closed_workflow_buffer_event_counter.count
(count)
temporal.server.cluster_metadata.callback.lock_latency.bucket
(count)
temporal.server.cluster_metadata.callback.lock_latency.count
(count)
temporal.server.cluster_metadata.callback.lock_latency.sum
(count)

Shown as millisecond
temporal.server.cluster_metadata.lock_latency.bucket
(count)
temporal.server.cluster_metadata.lock_latency.count
(count)
temporal.server.cluster_metadata.lock_latency.sum
(count)

Shown as millisecond
temporal.server.complete_task_fail.count
(count)
temporal.server.complete_workflow_command.count
(count)
temporal.server.complete_workflow_task_sticky.disabled.count
(count)
temporal.server.complete_workflow_task_sticky.enabled.count
(count)
temporal.server.complete_workflow_update_message.count
(count)
temporal.server.concurrency_update_failure.count
(count)
temporal.server.condition_failed_errors.count
(count)
temporal.server.consistent_query_timeout.count
(count)
temporal.server.continue_as_new_command.count
(count)
temporal.server.count_executions.failures.count
(count)
temporal.server.delete_execution.failures.count
(count)
temporal.server.delete_execution.not_found.count
(count)
temporal.server.delete_executions.success.count
(count)
temporal.server.delete_namespace.failures.count
(count)
temporal.server.delete_namespace.success.count
(count)
temporal.server.delete_namespace.workflow_failure.count
(count)
temporal.server.delete_namespace.workflow_success.count
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.bucket
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.count
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.clear_stickiness.success.count
(count)
temporal.server.direct_query_dispatch.latency.bucket
(count)
temporal.server.direct_query_dispatch.latency.count
(count)
temporal.server.direct_query_dispatch.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.non_sticky.latency.bucket
(count)
temporal.server.direct_query_dispatch.non_sticky.latency.count
(count)
temporal.server.direct_query_dispatch.non_sticky.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.non_sticky.success.count
(count)
temporal.server.direct_query_dispatch.sticky.latency.bucket
(count)
temporal.server.direct_query_dispatch.sticky.latency.count
(count)
temporal.server.direct_query_dispatch.sticky.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.sticky.success.count
(count)
temporal.server.direct_query_dispatch.timeout_before_non_sticky.count
(count)
temporal.server.duplicate_replication_events.count
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.bucket
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.count
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.sum
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.corrupted_data.count
(count)
temporal.server.elasticsearch.bulk_processor.duplicate_request.count
(count)
temporal.server.elasticsearch.bulk_processor.errors.count
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.bucket
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.count
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.sum
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.requests.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.wait_start.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.wait_start.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_start.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.document.generate_failures_counter.count
(count)
temporal.server.elasticsearch.document.parse_failures_counter.count
(count)
temporal.server.empty_completion_commands.count
(count)
temporal.server.empty_replication_events.count
(count)
temporal.server.event.blob_size.bucket
(count)
temporal.server.event.blob_size.count
(count)
temporal.server.event.blob_size.sum
(count)

Shown as byte
temporal.server.event.reapply_skipped.count
(count)
temporal.server.execution_info.size.bucket
(count)
temporal.server.execution_info.size.count
(count)
temporal.server.execution_info.size.sum
(count)

Shown as byte
temporal.server.execution_state.size.bucket
(count)
temporal.server.execution_state.size.count
(count)
temporal.server.execution_state.size.sum
(count)

Shown as byte
temporal.server.executions_outstanding
(gauge)
temporal.server.executor.deferred.count
(count)
temporal.server.executor.done.count
(count)
temporal.server.executor.dropped.count
(count)
temporal.server.executor.err.count
(count)
temporal.server.fail_workflow_command.count
(count)
temporal.server.failed_workflow_tasks.count
(count)
temporal.server.forward_poll.calls.count
(count)
temporal.server.forward_poll.errors.count
(count)
temporal.server.forward_poll.latency.bucket
(count)
temporal.server.forward_poll.latency.count
(count)
temporal.server.forward_poll.latency.sum
(count)

Shown as millisecond
temporal.server.forward_query.calls.count
(count)
temporal.server.forward_query.errors.count
(count)
temporal.server.forward_query.latency.bucket
(count)
temporal.server.forward_query.latency.count
(count)
temporal.server.forward_query.latency.sum
(count)

Shown as millisecond
temporal.server.forward_task.calls.count
(count)
temporal.server.forward_task.errors.count
(count)
temporal.server.forward_task.latency.bucket
(count)
temporal.server.forward_task.latency.count
(count)
temporal.server.forward_task.latency.sum
(count)

Shown as millisecond
temporal.server.forwarded.count
(count)
temporal.server.forwarded_per_tl.count
(count)
temporal.server.get_dlq_replication_messages.bucket
(count)
temporal.server.get_dlq_replication_messages.count
(count)
temporal.server.get_dlq_replication_messages.sum
(count)

Shown as millisecond
temporal.server.get_engine_for_shard.errors.count
(count)
temporal.server.get_engine_for_shard.latency.bucket
(count)
temporal.server.get_engine_for_shard.latency.count
(count)
temporal.server.get_engine_for_shard.latency.sum
(count)

Shown as millisecond
temporal.server.get_replication_messages_for_shard.bucket
(count)
temporal.server.get_replication_messages_for_shard.count
(count)
temporal.server.get_replication_messages_for_shard.sum
(count)

Shown as millisecond
temporal.server.handover.ready_shard_count
(gauge)
temporal.server.heartbeat.timeout.count
(count)
temporal.server.history.archiver.archive.non_retryable_error.count
(count)
temporal.server.history.archiver.archive.success.count
(count)
temporal.server.history.archiver.archive.transient_error.count
(count)
temporal.server.history.archiver.blob_exists.count
(count)
temporal.server.history.archiver.blob_integrity_check_failed.count
(count)
temporal.server.history.archiver.blob_size.bucket
(count)
temporal.server.history.archiver.blob_size.count
(count)
temporal.server.history.archiver.blob_size.sum
(count)

Shown as byte
temporal.server.history.archiver.deterministic_construction_check_failed.count
(count)
temporal.server.history.archiver.duplicate_archivals.count
(count)
temporal.server.history.archiver.history_mutated.count
(count)
temporal.server.history.archiver.history_size.bucket
(count)
temporal.server.history.archiver.history_size.count
(count)
temporal.server.history.archiver.history_size.sum
(count)

Shown as byte
temporal.server.history.archiver.running_blob_integrity_check.count
(count)
temporal.server.history.archiver.running_deterministic_construction_check.count
(count)
temporal.server.history.archiver.total_upload_size.bucket
(count)
temporal.server.history.archiver.total_upload_size.count
(count)
temporal.server.history.archiver.total_upload_size.sum
(count)

Shown as byte
temporal.server.history.bucket
(count)
temporal.server.history.conflicts.count
(count)
temporal.server.history.count
(count)
temporal.server.history.event_notification.fail_delivery.count
(count)
temporal.server.history.event_notification.fanout_latency.bucket
(count)
temporal.server.history.event_notification.fanout_latency.count
(count)
temporal.server.history.event_notification.fanout_latency.sum
(count)

Shown as millisecond
temporal.server.history.event_notification.inflight_message
(gauge)
temporal.server.history.event_notification.queueing_latency.bucket
(count)
temporal.server.history.event_notification.queueing_latency.count
(count)
temporal.server.history.event_notification.queueing_latency.sum
(count)

Shown as millisecond
temporal.server.history.size.bucket
(count)
temporal.server.history.size.count
(count)
temporal.server.history.size.sum
(count)

Shown as byte
temporal.server.history.sum
(count)
temporal.server.history.workflow_execution_cache_latency.bucket
(count)
temporal.server.history.workflow_execution_cache_latency.count
(count)
temporal.server.history.workflow_execution_cache_latency.sum
(count)

Shown as millisecond
temporal.server.inordered_buffered_events.count
(count)
temporal.server.invalid_task_queue_name.count
(count)
temporal.server.last_processed_message_id
(gauge)
temporal.server.last_retrieved_message_id
(gauge)
temporal.server.lease.failures.count
(count)
temporal.server.lease.requests.count
(count)
temporal.server.list_executions.failures.count
(count)
temporal.server.loaded_task_queue_count
(gauge)
temporal.server.local_to_local.matches.count
(count)
temporal.server.local_to_remote.matches.count
(count)
temporal.server.lock.failures.count
(count)
temporal.server.lock.latency.bucket
(count)
temporal.server.lock.latency.count
(count)
temporal.server.lock.latency.sum
(count)

Shown as millisecond
temporal.server.lock.requests.count
(count)
temporal.server.membership_changed.count
(count)
temporal.server.memo_size.bucket
(count)
temporal.server.memo_size.count
(count)
temporal.server.memo_size.sum
(count)

Shown as byte
temporal.server.modify_workflow_properties_command.count
(count)
temporal.server.multiple_completion_commands.count
(count)
temporal.server.mutable_state.size.bucket
(count)
temporal.server.mutable_state.size.count
(count)
temporal.server.mutable_state.size.sum
(count)

Shown as byte
temporal.server.mutable_state_checksum.invalidated.count
(count)
temporal.server.mutable_state_checksum.mismatch.count
(count)
temporal.server.namespace_cache.callbacks_latency.bucket
(count)
temporal.server.namespace_cache.callbacks_latency.count
(count)
temporal.server.namespace_cache.callbacks_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_cache.prepare_callbacks_latency.bucket
(count)
temporal.server.namespace_cache.prepare_callbacks_latency.count
(count)
temporal.server.namespace_cache.prepare_callbacks_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_dlq.ack_level
(gauge)
temporal.server.namespace_dlq.max_level
(gauge)
temporal.server.namespace_registry.lock_latency.bucket
(count)
temporal.server.namespace_registry.lock_latency.count
(count)
temporal.server.namespace_registry.lock_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_replication.dlq_enqueue_requests.count
(count)
temporal.server.namespace_replication.task_ack_level
(gauge)
temporal.server.new_timer_notifications.count
(count)
temporal.server.no_poller_tasks.count
(count)
Tasks added to a task queue that has no poller.
temporal.server.numshards
(gauge)
temporal.server.parent_close_policy_processor.errors.count
(count)
temporal.server.parent_close_policy_processor.requests.count
(count)
temporal.server.pending_tasks.bucket
(count)
temporal.server.pending_tasks.count
(count)
temporal.server.pending_tasks.sum
(count)
temporal.server.persistence.error_with_type.count
(count)
Number of persistence errors, tagged by error type.
temporal.server.persistence.errors.count
(count)
Number of persistence errors.
temporal.server.persistence.errors.resource_exhausted.count
(count)
temporal.server.persistence.latency.bucket
(count)
Distribution of latencies on persistence operations.
temporal.server.persistence.latency.count
(count)
Count of latencies on persistence operations.
temporal.server.persistence.latency.sum
(count)
Sum of latencies on persistence operations.
Shown as millisecond
temporal.server.persistence.requests.count
(count)
Number of persistence requests.
temporal.server.poll.success.count
(count)
Number of tasks successfully matched by the poller.
temporal.server.poll.success_sync.count
(count)
Number of tasks successfully matched synchronously.
temporal.server.poll.timeouts.count
(count)
Number of times a poller timed out due to no tasks being available.
temporal.server.query_before_first_workflow_task.count
(count)
temporal.server.query_buffer_exceeded.count
(count)
temporal.server.query_registry_invalid_state.count
(count)
temporal.server.queue.actions.count
(count)
temporal.server.queue.latency_schedule.bucket
(count)
temporal.server.queue.latency_schedule.count
(count)
temporal.server.queue.latency_schedule.sum
(count)

Shown as millisecond
temporal.server.queue.reader.bucket
(count)
temporal.server.queue.reader.count
(count)
temporal.server.queue.reader.sum
(count)
temporal.server.queue.slice.bucket
(count)
temporal.server.queue.slice.count
(count)
temporal.server.queue.slice.sum
(count)
temporal.server.rate_limiter.failures.count
(count)
temporal.server.read_namespace.failures.count
(count)
temporal.server.record_marker_command.count
(count)
temporal.server.reject_workflow_update_message.count
(count)
temporal.server.remote_to_local.matches.count
(count)
temporal.server.remote_to_remote.matches.count
(count)
temporal.server.remove_engine_for_shard.latency.bucket
(count)
temporal.server.remove_engine_for_shard.latency.count
(count)
temporal.server.remove_engine_for_shard.latency.sum
(count)

Shown as millisecond
temporal.server.rename_namespace.failures.count
(count)
temporal.server.rename_namespace.success.count
(count)
temporal.server.replication.dlq.ack_level
(gauge)
temporal.server.replication.dlq.enqueue_failed.count
(count)
temporal.server.replication.dlq.max_level
(gauge)
temporal.server.replication.latency.bucket
(count)
temporal.server.replication.latency.count
(count)
temporal.server.replication.latency.sum
(count)

Shown as millisecond
temporal.server.replication.task_cleanup.count
(count)
temporal.server.replication.task_cleanup.failed.count
(count)
temporal.server.replication.tasks.applied.count
(count)
temporal.server.replication.tasks.applied_latency.bucket
(count)
temporal.server.replication.tasks.applied_latency.count
(count)
temporal.server.replication.tasks.applied_latency.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.failed.count
(count)
temporal.server.replication.tasks.fetched.bucket
(count)
temporal.server.replication.tasks.fetched.count
(count)
temporal.server.replication.tasks.fetched.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.lag.bucket
(count)
temporal.server.replication.tasks.lag.count
(count)
temporal.server.replication.tasks.lag.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.returned.bucket
(count)
temporal.server.replication.tasks.returned.count
(count)
temporal.server.replication.tasks.returned.sum
(count)

Shown as millisecond
temporal.server.replication_events_size.bucket
(count)
temporal.server.replication_events_size.count
(count)
temporal.server.replication_events_size.sum
(count)

Shown as millisecond
temporal.server.replicator.dlq_enqueue_fails.count
(count)
temporal.server.replicator.errors.count
(count)
temporal.server.replicator.latency.bucket
(count)
temporal.server.replicator.latency.count
(count)
temporal.server.replicator.latency.sum
(count)

Shown as millisecond
temporal.server.replicator.messages.count
(count)
temporal.server.request_cancel_info.bucket
(count)
temporal.server.request_cancel_info.count
(count)
temporal.server.request_cancel_info.size.bucket
(count)
temporal.server.request_cancel_info.size.count
(count)
temporal.server.request_cancel_info.size.sum
(count)

Shown as byte
temporal.server.request_cancel_info.sum
(count)
temporal.server.respond_query_failed.count
(count)
temporal.server.scan_duration.bucket
(count)
temporal.server.scan_duration.count
(count)
temporal.server.scan_duration.sum
(count)

Shown as millisecond
temporal.server.scavenger.errors.count
(count)
temporal.server.scavenger.skips.count
(count)
temporal.server.scavenger.success.count
(count)
temporal.server.scavenger.validation.failures.count
(count)
temporal.server.scavenger.validation.requests.count
(count)
temporal.server.scavenger.validation.skips.count
(count)
temporal.server.schedule.action.errors.count
(count)
temporal.server.schedule.action.success.count
(count)
temporal.server.schedule.buffer_overruns.count
(count)
temporal.server.schedule.cancel_workflow.errors.count
(count)
temporal.server.schedule.missed_catchup_window.count
(count)
temporal.server.schedule.rate_limited.count
(count)
temporal.server.schedule.terminate_workflow.errors.count
(count)
temporal.server.schedule_activity_command.count
(count)
temporal.server.schedule_to_close.timeout.count
(count)
temporal.server.schedule_to_start.timeout.count
(count)
temporal.server.search_attributes_size.bucket
(count)
temporal.server.search_attributes_size.count
(count)
temporal.server.search_attributes_size.sum
(count)

Shown as byte
temporal.server.service.authorization_latency.bucket
(count)
temporal.server.service.authorization_latency.count
(count)
temporal.server.service.authorization_latency.sum
(count)

Shown as millisecond
temporal.server.service.error_with_type.count
(count)
Errors encountered by the service, tagged by type.
temporal.server.service.errors.count
(count)
temporal.server.service.errors.critical.count
(count)
temporal.server.service.errors.resource_exhausted.count
(count)
temporal.server.service.errors.shard_ownership_lost.count
(count)
temporal.server.service.errors.task_already_started.count
(count)
temporal.server.service.latency.bucket
(count)
Distribution of latencies for all client request operations.
temporal.server.service.latency.count
(count)
Count of latencies for all client request operations.
temporal.server.service.latency.nouserlatency.bucket
(count)
temporal.server.service.latency.nouserlatency.count
(count)
temporal.server.service.latency.nouserlatency.sum
(count)

Shown as millisecond
temporal.server.service.latency.sum
(count)
Sum of latencies for all client request operations.
Shown as millisecond
temporal.server.service.latency.userlatency.bucket
(count)
temporal.server.service.latency.userlatency.count
(count)
temporal.server.service.latency.userlatency.sum
(count)

Shown as millisecond
temporal.server.service.pending_requests
(gauge)
temporal.server.service.requests.count
(count)
Service requests received per Task Queue.
temporal.server.shard.lock_latency.bucket
(count)
temporal.server.shard.lock_latency.count
(count)
temporal.server.shard.lock_latency.sum
(count)

Shown as millisecond
temporal.server.shard_closed.count
(count)
temporal.server.shard_controller.lock_latency.bucket
(count)
temporal.server.shard_controller.lock_latency.count
(count)
temporal.server.shard_controller.lock_latency.sum
(count)

Shown as millisecond
temporal.server.shardinfo.immediate_queue.lag.bucket
(count)
temporal.server.shardinfo.immediate_queue.lag.count
(count)
temporal.server.shardinfo.immediate_queue.lag.sum
(count)
temporal.server.shardinfo.replication.lag.bucket
(count)
temporal.server.shardinfo.replication.lag.count
(count)
temporal.server.shardinfo.replication.lag.sum
(count)
temporal.server.shardinfo.replication.pending_task.bucket
(count)
temporal.server.shardinfo.replication.pending_task.count
(count)
temporal.server.shardinfo.replication.pending_task.sum
(count)
temporal.server.shardinfo.scheduled_queue.lag.bucket
(count)
temporal.server.shardinfo.scheduled_queue.lag.count
(count)
temporal.server.shardinfo.scheduled_queue.lag.sum
(count)

Shown as millisecond
temporal.server.shardinfo.timer.active.pending_task.bucket
(count)
temporal.server.shardinfo.timer.active.pending_task.count
(count)
temporal.server.shardinfo.timer.active.pending_task.sum
(count)
temporal.server.shardinfo.timer.lag.bucket
(count)
temporal.server.shardinfo.timer.lag.count
(count)
temporal.server.shardinfo.timer.lag.sum
(count)

Shown as millisecond
temporal.server.shardinfo.timer.standby.pending_task.bucket
(count)
temporal.server.shardinfo.timer.standby.pending_task.count
(count)
temporal.server.shardinfo.timer.standby.pending_task.sum
(count)
temporal.server.shardinfo.transfer.active.pending_task.bucket
(count)
temporal.server.shardinfo.transfer.active.pending_task.count
(count)
temporal.server.shardinfo.transfer.active.pending_task.sum
(count)
temporal.server.shardinfo.transfer.lag.bucket
(count)
temporal.server.shardinfo.transfer.lag.count
(count)
temporal.server.shardinfo.transfer.lag.sum
(count)
temporal.server.shardinfo.transfer.standby.pending_task.bucket
(count)
temporal.server.shardinfo.transfer.standby.pending_task.count
(count)
temporal.server.shardinfo.transfer.standby.pending_task.sum
(count)
temporal.server.shardinfo.visibility.lag.bucket
(count)
temporal.server.shardinfo.visibility.lag.count
(count)
temporal.server.shardinfo.visibility.lag.sum
(count)
temporal.server.shardinfo.visibility.pending_task.bucket
(count)
temporal.server.shardinfo.visibility.pending_task.count
(count)
temporal.server.shardinfo.visibility.pending_task.sum
(count)
temporal.server.sharditem.acquisition_latency.bucket
(count)
temporal.server.sharditem.acquisition_latency.count
(count)
temporal.server.sharditem.acquisition_latency.sum
(count)

Shown as millisecond
temporal.server.sharditem.created.count
(count)
temporal.server.sharditem.removed.count
(count)
temporal.server.signal_external_workflow_command.count
(count)
temporal.server.signal_info.bucket
(count)
temporal.server.signal_info.count
(count)
temporal.server.signal_info.size.bucket
(count)
temporal.server.signal_info.size.count
(count)
temporal.server.signal_info.size.sum
(count)

Shown as byte
temporal.server.signal_info.sum
(count)
temporal.server.stale_mutable_state.count
(count)
temporal.server.stale_replication_events.count
(count)
temporal.server.start_timer_command.count
(count)
temporal.server.start_to_close.timeout.count
(count)
temporal.server.started.count
(count)
temporal.server.state_transition.bucket
(count)
temporal.server.state_transition.count
(count)
temporal.server.state_transition.sum
(count)
temporal.server.stopped.count
(count)
temporal.server.sync_throttle.count
(count)
temporal.server.syncmatch.latency.bucket
(count)
temporal.server.syncmatch.latency.count
(count)
temporal.server.syncmatch.latency.sum
(count)

Shown as millisecond
temporal.server.syncshard.remote.count
(count)
temporal.server.syncshard.remote.failed.count
(count)
temporal.server.task.attempt.bucket
(count)
Distribution of the number of attempts on each task execution.
temporal.server.task.attempt.count
(count)
Count of the number of attempts on each task execution.
temporal.server.task.attempt.sum
(count)
Sum of the number of attempts on each task execution.
temporal.server.task.batch_complete_counter.count
(count)
temporal.server.task.bucket
(count)
temporal.server.task.count
(count)
temporal.server.task.deleted
(gauge)
temporal.server.task.dependency_task_not_completed.count
(count)
temporal.server.task.errors.corruption.count
(count)
temporal.server.task.errors.count
(count)
Number of task process errors.
temporal.server.task.errors.discarded.count
(count)
temporal.server.task.errors.limit_exceeded_counter.count
(count)
temporal.server.task.errors.namespace_handover.count
(count)
temporal.server.task.errors.not_active_counter.count
(count)
temporal.server.task.errors.standby_retry_counter.count
(count)
temporal.server.task.errors.throttled.count
(count)
temporal.server.task.errors.version_mismatch.count
(count)
temporal.server.task.errors.workflow_busy.count
(count)
temporal.server.task.lag_per_tl
(gauge)
temporal.server.task.latency.bucket
(count)
Distribution of in-memory latencies across multiple attempts.
temporal.server.task.latency.count
(count)
Count of in-memory latencies across multiple attempts.
temporal.server.task.latency.load.bucket
(count)
Distribution of durations from task generation to task loading.
temporal.server.task.latency.load.count
(count)
Count of durations from task generation to task loading.
temporal.server.task.latency.load.sum
(count)
Sum of durations from task generation to task loading.
Shown as millisecond
temporal.server.task.latency.processing.bucket
(count)
Distribution of task processing latencies per attempt.
temporal.server.task.latency.processing.count
(count)
Count of task processing latencies per attempt.
temporal.server.task.latency.processing.sum
(count)
Sum of task processing latencies per attempt.
Shown as millisecond
temporal.server.task.latency.queue.bucket
(count)
Distribution of durations from when a task is fired to when the task is done.
temporal.server.task.latency.queue.count
(count)
Count of durations from when a task is fired to when the task is done.
temporal.server.task.latency.queue.sum
(count)
Sum of durations from when a task is fired to when the task is done.
Shown as millisecond
temporal.server.task.latency.schedule.bucket
(count)
Distribution of durations from task submission to processing.
temporal.server.task.latency.schedule.count
(count)
Count of durations from task submission to processing.
temporal.server.task.latency.schedule.sum
(count)
Sum of durations from task submission to processing.
Shown as millisecond
temporal.server.task.latency.sum
(count)
Sum of in-memory latencies across multiple attempts.
Shown as millisecond
temporal.server.task.latency.user.bucket
(count)
temporal.server.task.latency.user.count
(count)
temporal.server.task.latency.user.sum
(count)

Shown as millisecond
temporal.server.task.processed
(gauge)
temporal.server.task.requests.count
(count)
Number of task process requests.
temporal.server.task.schedule_to_start_latency.bucket
(count)
temporal.server.task.schedule_to_start_latency.count
(count)
temporal.server.task.schedule_to_start_latency.sum
(count)

Shown as millisecond
temporal.server.task.skipped.count
(count)
temporal.server.task.sum
(count)
temporal.server.task.write.latency.bucket
(count)
temporal.server.task.write.latency.count
(count)
temporal.server.task.write.latency.sum
(count)

Shown as millisecond
temporal.server.task.write.throttle.count
(count)
temporal.server.task_queue.started.count
(count)
temporal.server.task_queue.stopped.count
(count)
temporal.server.task_rescheduler.pending_tasks.bucket
(count)
temporal.server.task_rescheduler.pending_tasks.count
(count)
temporal.server.task_rescheduler.pending_tasks.sum
(count)
temporal.server.task_scheduler.throttled.count
(count)
temporal.server.taskqueue.deleted
(gauge)
temporal.server.taskqueue.outstanding
(gauge)
temporal.server.taskqueue.processed
(gauge)
temporal.server.tasks_expired.count
(count)
temporal.server.timer_info.bucket
(count)
temporal.server.timer_info.count
(count)
temporal.server.timer_info.size.bucket
(count)
temporal.server.timer_info.size.count
(count)
temporal.server.timer_info.size.sum
(count)

Shown as byte
temporal.server.timer_info.sum
(count)
temporal.server.transfer_task.missing_event_counter.count
(count)
temporal.server.unbuffer_replication_tasks.bucket
(count)
temporal.server.unbuffer_replication_tasks.count
(count)
temporal.server.unbuffer_replication_tasks.sum
(count)

Shown as millisecond
temporal.server.update_namespace.failures.count
(count)
temporal.server.upsert_workflow_search_attributes_command.count
(count)
temporal.server.version_check.failed.count
(count)
temporal.server.version_check.latency.bucket
(count)
temporal.server.version_check.latency.count
(count)
temporal.server.version_check.latency.sum
(count)

Shown as millisecond
temporal.server.version_check.request_failed.count
(count)
temporal.server.version_check.success.count
(count)
temporal.server.visibility.archiver.archive.non_retryable_error.count
(count)
temporal.server.visibility.archiver.archive.success.count
(count)
temporal.server.visibility.archiver.archive.transient_error.count
(count)
temporal.server.visibility.persistence.error_with_type.count
(count)
temporal.server.visibility.persistence.errors.count
(count)
temporal.server.visibility.persistence.latency.bucket
(count)
temporal.server.visibility.persistence.latency.count
(count)
temporal.server.visibility.persistence.latency.sum
(count)

Shown as millisecond
temporal.server.visibility.persistence.requests.count
(count)
temporal.server.visibility.persistence.resource_exhausted.count
(count)
temporal.server.wf_too_many_pending.activities.count
(count)
temporal.server.wf_too_many_pending.cancel_requests.count
(count)
temporal.server.wf_too_many_pending.child_workflows.count
(count)
temporal.server.wf_too_many_pending.external_workflow_signals.count
(count)
temporal.server.worker_not_supports_consistent_query.count
(count)
temporal.server.workflow.cancel.count
(count)
temporal.server.workflow.cleanup.archive.count
(count)
temporal.server.workflow.cleanup.delete.count
(count)
temporal.server.workflow.cleanup.delete_history_inline.count
(count)
temporal.server.workflow.cleanup.nop.count
(count)
temporal.server.workflow.context_cleared.count
(count)
temporal.server.workflow.continued_as_new.count
(count)
temporal.server.workflow.cron_backoff.timer.count
(count)
temporal.server.workflow.eager_execution.count
(count)
temporal.server.workflow.eager_execution.denied.count
(count)
temporal.server.workflow.failed.count
(count)
temporal.server.workflow.retry_backoff.timer.count
(count)
temporal.server.workflow.run_timeout_overrides.count
(count)
temporal.server.workflow.success.count
(count)
temporal.server.workflow.task.attempt.bucket
(count)
temporal.server.workflow.task.attempt.count
(count)
temporal.server.workflow.task.attempt.sum
(count)
temporal.server.workflow.task.heartbeat_timeout.count
(count)
temporal.server.workflow.task.query_latency.bucket
(count)
temporal.server.workflow.task.query_latency.count
(count)
temporal.server.workflow.task.query_latency.sum
(count)

Shown as millisecond
temporal.server.workflow.task.timeout_overrides.count
(count)
temporal.server.workflow.terminate.count
(count)
temporal.server.workflow.timeout.count
(count)
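
Many of the metrics above come in `.bucket`, `.count`, and `.sum` triplets. As a rough illustration of how a `.sum` and `.count` pair can be combined, the sketch below derives a mean value for one time window. It assumes both values cover the same window; the function name and the sample numbers are hypothetical and are not part of the integration.

```python
from typing import Optional


def mean_from_histogram(sum_delta: float, count_delta: float) -> Optional[float]:
    """Return the mean observation for one time window.

    sum_delta and count_delta are the values of a `.sum` and `.count`
    metric over the same window. Returns None when nothing was observed,
    to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta


if __name__ == "__main__":
    # Hypothetical values for temporal.server.task.latency.sum (milliseconds)
    # and temporal.server.task.latency.count over the same window.
    window_sum_ms = 1500.0
    window_count = 3
    print(mean_from_histogram(window_sum_ms, window_count))  # 500.0
```

A mean hides outliers, so the `.bucket` series is likely the better source when the shape of the distribution matters.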

Events

The Temporal integration does not include any events.

Service Checks

temporal.server.openmetrics.health
Returns CRITICAL if the check cannot access the Prometheus metrics endpoint of the Temporal instance.
Statuses: ok, critical
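
To reproduce this condition by hand, a reachability probe along the following lines can help. This is a minimal sketch of the behavior described above, not the Agent's own implementation; the function name and the endpoint address in the usage example are assumptions and should be replaced with the `listenAddress` and `handlerPath` values from your Temporal configuration.

```python
import http.client
import urllib.request


def endpoint_status(url: str, timeout: float = 5.0) -> str:
    """Return "ok" if the metrics endpoint answers with a 2xx response, else "critical"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "ok" if 200 <= response.status < 300 else "critical"
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures, and HTTP
        # error responses (urllib raises HTTPError, a subclass of OSError).
        return "critical"


if __name__ == "__main__":
    # Hypothetical endpoint; use the address your Temporal service exposes.
    print(endpoint_status("http://127.0.0.1:8000/metrics"))
```

If the probe reports critical while the Temporal service is running, the address or path configured in `openmetrics_endpoint` is a reasonable first thing to compare against the server configuration.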

Logs

The Temporal integration can collect logs from the Temporal Cluster and forward them to Datadog.

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles: