Supported OS: Linux, Windows

Integration version: 3.2.0

Overview

This check monitors Temporal through the Datadog Agent.

Note: This check can only be installed if you self-host Temporal. To monitor your Temporal Cloud instance, see the Datadog Temporal Cloud integration documentation.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Temporal check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Host

Metric collection
  1. Configure your Temporal services to expose metrics through a Prometheus endpoint by following the official Temporal documentation.

  2. Edit the temporal.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your Temporal performance data.

    Set the openmetrics_endpoint option to match the listenAddress and handlerPath options in your Temporal server configuration.

    init_config:
    instances:
      - openmetrics_endpoint: <LISTEN_ADDRESS>/<HANDLER_PATH>
    

    Note that when the Temporal services in a cluster are deployed independently, each service exposes its own metrics. As a result, you need to configure the Prometheus endpoint for each service that you want to monitor and define a separate instance in the integration's configuration for each of them.
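
    As an illustration of the point about independently deployed services, a minimal sketch of a temporal.d/conf.yaml with one instance per service might look like the following. The host names, ports, the /metrics path, and the temporal_service tag are placeholders invented for this example, not values defined by Temporal or by the integration; substitute the listenAddress and handlerPath from each service's own configuration.

```yaml
init_config:

instances:
  # One instance per independently deployed Temporal service.
  # Every host, port, path, and tag value below is a hypothetical placeholder.
  - openmetrics_endpoint: http://temporal-frontend.example.internal:8000/metrics
    tags:
      - temporal_service:frontend
  - openmetrics_endpoint: http://temporal-history.example.internal:8000/metrics
    tags:
      - temporal_service:history
  - openmetrics_endpoint: http://temporal-matching.example.internal:8000/metrics
    tags:
      - temporal_service:matching
  - openmetrics_endpoint: http://temporal-worker.example.internal:8000/metrics
    tags:
      - temporal_service:worker
```

    Tagging each instance is optional. It is shown here only because it makes it easier to tell the services apart when the same metric name is reported by several of them.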

Log collection
  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Configure your Temporal cluster to output logs to a file by following the official documentation.

  3. Uncomment and edit the logs configuration block in your temporal.d/conf.yaml file, and set the path to point to the file you configured on your Temporal cluster:

    logs:
      - type: file
        path: /var/log/temporal/temporal-server.log
        source: temporal

  4. Restart the Agent.

Containerized

Metric collection

For containerized environments, see Configure integrations with Autodiscovery on Kubernetes or Configure integrations with Autodiscovery on Docker for guidance on applying the parameters below. See the sample temporal.d/conf.yaml for all available configuration options.

Parameter            Value
<INTEGRATION_NAME>   temporal
<INIT_CONFIG>        blank or {}
<INSTANCES_CONFIG>   {"openmetrics_endpoint": "<LISTEN_ADDRESS>/<HANDLER_PATH>"}, where <LISTEN_ADDRESS> and <HANDLER_PATH> are replaced by the listenAddress and handlerPath values from your Temporal server configuration.

Note that when the Temporal services in a cluster are deployed independently, each service exposes its own metrics. As a result, you need to configure the Prometheus endpoint for each service that you want to monitor and define a separate instance in the integration's configuration for each of them.

Example

The following Kubernetes annotation is applied to a pod under metadata, where <CONTAINER_NAME> is the name of your Temporal container (or a custom identifier):

ad.datadoghq.com/<CONTAINER_NAME>.checks: |
  {
    "temporal": {
      "init_config": {},
      "instances": [{"openmetrics_endpoint": "<LISTEN_ADDRESS>/<HANDLER_PATH>"}]
    }
  }

Log collection

Log collection is disabled by default in the Datadog Agent. To enable it, see Docker Log Collection or Kubernetes Log Collection.

Apply the following configuration parameter to logs:

Parameter      Value
<LOG_CONFIG>   {"source": "temporal", "type": "file", "path": "/var/log/temporal/temporal-server.log"}

Example

The following Kubernetes annotation is applied to a pod under metadata, where <CONTAINER_NAME> is the name of your Temporal container (or a custom identifier):

ad.datadoghq.com/<CONTAINER_NAME>.logs: |
  [
    {
      "source": "temporal",
      "type": "file",
      "path": "/var/log/temporal/temporal-server.log"
    } 
  ]
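
To show where such annotations sit, here is a minimal sketch of a pod manifest that carries both a checks annotation and a logs annotation for the same container. The pod name, container name, image, port 8000, and the /metrics path are assumptions made for this example. The %%host%% Autodiscovery template variable stands in for <LISTEN_ADDRESS> and resolves to the container's IP.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: temporal-server              # hypothetical pod name
  annotations:
    # The segment after the slash ("temporal") must match the container name below.
    ad.datadoghq.com/temporal.checks: |
      {
        "temporal": {
          "init_config": {},
          "instances": [{"openmetrics_endpoint": "http://%%host%%:8000/metrics"}]
        }
      }
    ad.datadoghq.com/temporal.logs: |
      [
        {
          "source": "temporal",
          "type": "file",
          "path": "/var/log/temporal/temporal-server.log"
        }
      ]
spec:
  containers:
    - name: temporal                 # hypothetical container name
      image: temporalio/server       # assumed image; use the one you actually deploy
```

When the Temporal services run as separate pods, each pod would carry its own annotations with the endpoint for that service.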

Validation

Run the Agent's status subcommand and look for temporal under the Checks section.

Data Collected

Metrics

temporal.server.accept_workflow_update_message.count
(count)
temporal.server.ack_level.update.count
(count)
temporal.server.ack_level.update.failed.count
(count)
temporal.server.acquire_lock_failed.count
(count)
temporal.server.acquire_shards.count
(count)
temporal.server.acquire_shards.latency.bucket
(count)
temporal.server.acquire_shards.latency.count
(count)
temporal.server.acquire_shards.latency.sum
(count)

Shown as millisecond
temporal.server.action.count
(count)
temporal.server.activity.eager_execution.count
(count)
temporal.server.activity.end_to_end_latency.bucket
(count)
temporal.server.activity.end_to_end_latency.count
(count)
temporal.server.activity.end_to_end_latency.sum
(count)

Shown as millisecond
temporal.server.activity_info.bucket
(count)
temporal.server.activity_info.count
(count)
temporal.server.activity_info.size.bucket
(count)
temporal.server.activity_info.size.count
(count)
temporal.server.activity_info.size.sum
(count)

Shown as byte
temporal.server.activity_info.sum
(count)
temporal.server.add_search_attributes.failures.count
(count)
temporal.server.add_search_attributes.workflow_failure
(gauge)
temporal.server.add_search_attributes.workflow_failure.count
(count)
temporal.server.add_search_attributes.workflow_success.count
(count)
temporal.server.archival.task_invalid_uri.count
(count)
temporal.server.archiver.archive.latency.bucket
(count)
temporal.server.archiver.archive.latency.count
(count)
temporal.server.archiver.archive.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.archive.target_latency.bucket
(count)
temporal.server.archiver.archive.target_latency.count
(count)
temporal.server.archiver.archive.target_latency.sum
(count)

Shown as millisecond
temporal.server.archiver.backlog.size.count
(count)
temporal.server.archiver.client.history.inline_archive.attempt.count
(count)
temporal.server.archiver.client.history.inline_archive.failure.count
(count)
temporal.server.archiver.client.history.request.count
(count)
temporal.server.archiver.client.send_signal_error.count
(count)
temporal.server.archiver.client.sent_signal.count
(count)
temporal.server.archiver.client.visibility.inline_archive_attempt.count
(count)
temporal.server.archiver.client.visibility.inline_archive_failure.count
(count)
temporal.server.archiver.client.visibility.request.count
(count)
temporal.server.archiver.coroutine.started.count
(count)
temporal.server.archiver.coroutine.stopped.count
(count)
temporal.server.archiver.delete.failed_all_retries.count
(count)
temporal.server.archiver.delete.success.count
(count)
temporal.server.archiver.delete.with_retries.latency.bucket
(count)
temporal.server.archiver.delete.with_retries.latency.count
(count)
temporal.server.archiver.delete.with_retries.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_all_requests_latency.bucket
(count)
temporal.server.archiver.handle_all_requests_latency.count
(count)
temporal.server.archiver.handle_all_requests_latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_history.request.latency.bucket
(count)
temporal.server.archiver.handle_history.request.latency.count
(count)
temporal.server.archiver.handle_history.request.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_visibility.failed_all_retries.count
(count)
temporal.server.archiver.handle_visibility.request.latency.bucket
(count)
temporal.server.archiver.handle_visibility.request.latency.count
(count)
temporal.server.archiver.handle_visibility.request.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.handle_visibility.success.count
(count)
temporal.server.archiver.non_retryable_error.count
(count)
temporal.server.archiver.num_handled_requests.count
(count)
temporal.server.archiver.num_pumped_requests.count
(count)
temporal.server.archiver.pump.signal_channel_closed.count
(count)
temporal.server.archiver.pump.signal_threshold.count
(count)
temporal.server.archiver.pump.timeout.count
(count)
temporal.server.archiver.pump.timeout_without_signals.count
(count)
temporal.server.archiver.pumped_not_equal_handled.count
(count)
temporal.server.archiver.started.count
(count)
temporal.server.archiver.stopped.count
(count)
temporal.server.archiver.upload.failed_all_retries.count
(count)
temporal.server.archiver.upload.success.count
(count)
temporal.server.archiver.upload.with_retries.latency.bucket
(count)
temporal.server.archiver.upload.with_retries.latency.count
(count)
temporal.server.archiver.upload.with_retries.latency.sum
(count)

Shown as millisecond
temporal.server.archiver.workflow.started.count
(count)
temporal.server.archiver.workflow.stopping.count
(count)
temporal.server.asyncmatch.latency.bucket
(count)
Distribution of latencies from creation to delivery for async matched tasks.
temporal.server.asyncmatch.latency.count
(count)
Count of latencies from creation to delivery for async matched tasks.
temporal.server.asyncmatch.latency.sum
(count)
Sum of latencies from creation to delivery for async matched tasks.
Shown as millisecond
temporal.server.auto_reset_point.corruption.count
(count)
temporal.server.auto_reset_points.exceed_limit.count
(count)
temporal.server.batcher.operation_errors.count
(count)
temporal.server.batcher.processor_errors.count
(count)
temporal.server.batcher.processor_requests.count
(count)
temporal.server.buffer_replication_tasks.bucket
(count)
temporal.server.buffer_replication_tasks.count
(count)
temporal.server.buffer_replication_tasks.sum
(count)

Shown as millisecond
temporal.server.buffer_throttle.count
(count)
temporal.server.buffered_events.bucket
(count)
temporal.server.buffered_events.count
(count)
temporal.server.buffered_events.size.bucket
(count)
temporal.server.buffered_events.size.count
(count)
temporal.server.buffered_events.size.sum
(count)

Shown as byte
temporal.server.buffered_events.sum
(count)
temporal.server.cache.errors.count
(count)
temporal.server.cache.latency.bucket
(count)
temporal.server.cache.latency.count
(count)
temporal.server.cache.latency.sum
(count)

Shown as millisecond
temporal.server.cache.miss.count
(count)
temporal.server.cache.requests.count
(count)
temporal.server.cancel_activity_command.count
(count)
temporal.server.cancel_external_workflow_command.count
(count)
temporal.server.cancel_timer_command.count
(count)
temporal.server.cancel_workflow_command.count
(count)
temporal.server.catchup.ready_shard_count
(gauge)
temporal.server.certificates.expired
(gauge)
temporal.server.certificates.expiring
(gauge)
temporal.server.child_info.bucket
(count)
temporal.server.child_info.count
(count)
temporal.server.child_info.size.bucket
(count)
temporal.server.child_info.size.count
(count)
temporal.server.child_info.size.sum
(count)

Shown as byte
temporal.server.child_info.sum
(count)
temporal.server.child_workflow_command.count
(count)
temporal.server.client.errors.count
(count)
An indicator for connection issues between different Server roles.
temporal.server.client.latency.bucket
(count)
temporal.server.client.latency.count
(count)
temporal.server.client.latency.sum
(count)

Shown as millisecond
temporal.server.client.redirection.errors.count
(count)
temporal.server.client.redirection.latency.bucket
(count)
temporal.server.client.redirection.latency.count
(count)
temporal.server.client.redirection.latency.sum
(count)

Shown as millisecond
temporal.server.client.redirection.requests.count
(count)
temporal.server.client.requests.count
(count)
temporal.server.closed_workflow_buffer_event_counter.count
(count)
temporal.server.cluster_metadata.callback.lock_latency.bucket
(count)
temporal.server.cluster_metadata.callback.lock_latency.count
(count)
temporal.server.cluster_metadata.callback.lock_latency.sum
(count)

Shown as millisecond
temporal.server.cluster_metadata.lock_latency.bucket
(count)
temporal.server.cluster_metadata.lock_latency.count
(count)
temporal.server.cluster_metadata.lock_latency.sum
(count)

Shown as millisecond
temporal.server.complete_task_fail.count
(count)
temporal.server.complete_workflow_command.count
(count)
temporal.server.complete_workflow_task_sticky.disabled.count
(count)
temporal.server.complete_workflow_task_sticky.enabled.count
(count)
temporal.server.complete_workflow_update_message.count
(count)
temporal.server.concurrency_update_failure.count
(count)
temporal.server.condition_failed_errors.count
(count)
temporal.server.consistent_query_timeout.count
(count)
temporal.server.continue_as_new_command.count
(count)
temporal.server.count_executions.failures.count
(count)
temporal.server.delete_execution.failures.count
(count)
temporal.server.delete_execution.not_found.count
(count)
temporal.server.delete_executions.success.count
(count)
temporal.server.delete_namespace.failures.count
(count)
temporal.server.delete_namespace.success.count
(count)
temporal.server.delete_namespace.workflow_failure.count
(count)
temporal.server.delete_namespace.workflow_success.count
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.bucket
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.count
(count)
temporal.server.direct_query_dispatch.clear_stickiness.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.clear_stickiness.success.count
(count)
temporal.server.direct_query_dispatch.latency.bucket
(count)
temporal.server.direct_query_dispatch.latency.count
(count)
temporal.server.direct_query_dispatch.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.non_sticky.latency.bucket
(count)
temporal.server.direct_query_dispatch.non_sticky.latency.count
(count)
temporal.server.direct_query_dispatch.non_sticky.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.non_sticky.success.count
(count)
temporal.server.direct_query_dispatch.sticky.latency.bucket
(count)
temporal.server.direct_query_dispatch.sticky.latency.count
(count)
temporal.server.direct_query_dispatch.sticky.latency.sum
(count)

Shown as millisecond
temporal.server.direct_query_dispatch.sticky.success.count
(count)
temporal.server.direct_query_dispatch.timeout_before_non_sticky.count
(count)
temporal.server.duplicate_replication_events.count
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.bucket
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.count
(count)
temporal.server.elasticsearch.bulk_processor.bulk_size.sum
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.commit.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.corrupted_data.count
(count)
temporal.server.elasticsearch.bulk_processor.duplicate_request.count
(count)
temporal.server.elasticsearch.bulk_processor.errors.count
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.bucket
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.count
(count)
temporal.server.elasticsearch.bulk_processor.queued_requests.sum
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.request.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.requests.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_add.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.bulk_processor.wait_start.latency.bucket
(count)
temporal.server.elasticsearch.bulk_processor.wait_start.latency.count
(count)
temporal.server.elasticsearch.bulk_processor.wait_start.latency.sum
(count)

Shown as millisecond
temporal.server.elasticsearch.document.generate_failures_counter.count
(count)
temporal.server.elasticsearch.document.parse_failures_counter.count
(count)
temporal.server.empty_completion_commands.count
(count)
temporal.server.empty_replication_events.count
(count)
temporal.server.event.blob_size.bucket
(count)
temporal.server.event.blob_size.count
(count)
temporal.server.event.blob_size.sum
(count)

Shown as byte
temporal.server.event.reapply_skipped.count
(count)
temporal.server.execution_info.size.bucket
(count)
temporal.server.execution_info.size.count
(count)
temporal.server.execution_info.size.sum
(count)

Shown as byte
temporal.server.execution_state.size.bucket
(count)
temporal.server.execution_state.size.count
(count)
temporal.server.execution_state.size.sum
(count)

Shown as byte
temporal.server.executions_outstanding
(gauge)
temporal.server.executor.deferred.count
(count)
temporal.server.executor.done.count
(count)
temporal.server.executor.dropped.count
(count)
temporal.server.executor.err.count
(count)
temporal.server.fail_workflow_command.count
(count)
temporal.server.failed_workflow_tasks.count
(count)
temporal.server.forward_poll.calls.count
(count)
temporal.server.forward_poll.errors.count
(count)
temporal.server.forward_poll.latency.bucket
(count)
temporal.server.forward_poll.latency.count
(count)
temporal.server.forward_poll.latency.sum
(count)

Shown as millisecond
temporal.server.forward_query.calls.count
(count)
temporal.server.forward_query.errors.count
(count)
temporal.server.forward_query.latency.bucket
(count)
temporal.server.forward_query.latency.count
(count)
temporal.server.forward_query.latency.sum
(count)

Shown as millisecond
temporal.server.forward_task.calls.count
(count)
temporal.server.forward_task.errors.count
(count)
temporal.server.forward_task.latency.bucket
(count)
temporal.server.forward_task.latency.count
(count)
temporal.server.forward_task.latency.sum
(count)

Shown as millisecond
temporal.server.forwarded.count
(count)
temporal.server.forwarded_per_tl.count
(count)
temporal.server.get_dlq_replication_messages.bucket
(count)
temporal.server.get_dlq_replication_messages.count
(count)
temporal.server.get_dlq_replication_messages.sum
(count)

Shown as millisecond
temporal.server.get_engine_for_shard.errors.count
(count)
temporal.server.get_engine_for_shard.latency.bucket
(count)
temporal.server.get_engine_for_shard.latency.count
(count)
temporal.server.get_engine_for_shard.latency.sum
(count)

Shown as millisecond
temporal.server.get_replication_messages_for_shard.bucket
(count)
temporal.server.get_replication_messages_for_shard.count
(count)
temporal.server.get_replication_messages_for_shard.sum
(count)

Shown as millisecond
temporal.server.handover.ready_shard_count
(gauge)
temporal.server.heartbeat.timeout.count
(count)
temporal.server.history.archiver.archive.non_retryable_error.count
(count)
temporal.server.history.archiver.archive.success.count
(count)
temporal.server.history.archiver.archive.transient_error.count
(count)
temporal.server.history.archiver.blob_exists.count
(count)
temporal.server.history.archiver.blob_integrity_check_failed.count
(count)
temporal.server.history.archiver.blob_size.bucket
(count)
temporal.server.history.archiver.blob_size.count
(count)
temporal.server.history.archiver.blob_size.sum
(count)

Shown as byte
temporal.server.history.archiver.deterministic_construction_check_failed.count
(count)
temporal.server.history.archiver.duplicate_archivals.count
(count)
temporal.server.history.archiver.history_mutated.count
(count)
temporal.server.history.archiver.history_size.bucket
(count)
temporal.server.history.archiver.history_size.count
(count)
temporal.server.history.archiver.history_size.sum
(count)

Shown as byte
temporal.server.history.archiver.running_blob_integrity_check.count
(count)
temporal.server.history.archiver.running_deterministic_construction_check.count
(count)
temporal.server.history.archiver.total_upload_size.bucket
(count)
temporal.server.history.archiver.total_upload_size.count
(count)
temporal.server.history.archiver.total_upload_size.sum
(count)

Shown as byte
temporal.server.history.bucket
(count)
temporal.server.history.conflicts.count
(count)
temporal.server.history.count
(count)
temporal.server.history.event_notification.fail_delivery.count
(count)
temporal.server.history.event_notification.fanout_latency.bucket
(count)
temporal.server.history.event_notification.fanout_latency.count
(count)
temporal.server.history.event_notification.fanout_latency.sum
(count)

Shown as millisecond
temporal.server.history.event_notification.inflight_message
(gauge)
temporal.server.history.event_notification.queueing_latency.bucket
(count)
temporal.server.history.event_notification.queueing_latency.count
(count)
temporal.server.history.event_notification.queueing_latency.sum
(count)

Shown as millisecond
temporal.server.history.size.bucket
(count)
temporal.server.history.size.count
(count)
temporal.server.history.size.sum
(count)

Shown as byte
temporal.server.history.sum
(count)
temporal.server.history.workflow_execution_cache_latency.bucket
(count)
temporal.server.history.workflow_execution_cache_latency.count
(count)
temporal.server.history.workflow_execution_cache_latency.sum
(count)

Shown as millisecond
temporal.server.inordered_buffered_events.count
(count)
temporal.server.invalid_task_queue_name.count
(count)
temporal.server.last_processed_message_id
(gauge)
temporal.server.last_retrieved_message_id
(gauge)
temporal.server.lease.failures.count
(count)
temporal.server.lease.requests.count
(count)
temporal.server.list_executions.failures.count
(count)
temporal.server.loaded_task_queue_count
(gauge)
temporal.server.local_to_local.matches.count
(count)
temporal.server.local_to_remote.matches.count
(count)
temporal.server.lock.failures.count
(count)
temporal.server.lock.latency.bucket
(count)
temporal.server.lock.latency.count
(count)
temporal.server.lock.latency.sum
(count)

Shown as millisecond
temporal.server.lock.requests.count
(count)
temporal.server.membership_changed.count
(count)
temporal.server.memo_size.bucket
(count)
temporal.server.memo_size.count
(count)
temporal.server.memo_size.sum
(count)

Shown as byte
temporal.server.modify_workflow_properties_command.count
(count)
temporal.server.multiple_completion_commands.count
(count)
temporal.server.mutable_state.size.bucket
(count)
temporal.server.mutable_state.size.count
(count)
temporal.server.mutable_state.size.sum
(count)

Shown as byte
temporal.server.mutable_state_checksum.invalidated.count
(count)
temporal.server.mutable_state_checksum.mismatch.count
(count)
temporal.server.namespace_cache.callbacks_latency.bucket
(count)
temporal.server.namespace_cache.callbacks_latency.count
(count)
temporal.server.namespace_cache.callbacks_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_cache.prepare_callbacks_latency.bucket
(count)
temporal.server.namespace_cache.prepare_callbacks_latency.count
(count)
temporal.server.namespace_cache.prepare_callbacks_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_dlq.ack_level
(gauge)
temporal.server.namespace_dlq.max_level
(gauge)
temporal.server.namespace_registry.lock_latency.bucket
(count)
temporal.server.namespace_registry.lock_latency.count
(count)
temporal.server.namespace_registry.lock_latency.sum
(count)

Shown as millisecond
temporal.server.namespace_replication.dlq_enqueue_requests.count
(count)
temporal.server.namespace_replication.task_ack_level
(gauge)
temporal.server.new_timer_notifications.count
(count)
temporal.server.no_poller_tasks.count
(count)
Tasks added to a task queue that has no poller.
temporal.server.numshards
(gauge)
temporal.server.parent_close_policy_processor.errors.count
(count)
temporal.server.parent_close_policy_processor.requests.count
(count)
temporal.server.pending_tasks.bucket
(count)
temporal.server.pending_tasks.count
(count)
temporal.server.pending_tasks.sum
(count)
temporal.server.persistence.error_with_type.count
(count)
Number of persistence errors, tagged by error type.
temporal.server.persistence.errors.count
(count)
Number of persistence errors.
temporal.server.persistence.errors.resource_exhausted.count
(count)
temporal.server.persistence.latency.bucket
(count)
Distribution of latencies on persistence operations.
temporal.server.persistence.latency.count
(count)
Count of latencies on persistence operations.
temporal.server.persistence.latency.sum
(count)
Sum of latencies on persistence operations.
Shown as millisecond
temporal.server.persistence.requests.count
(count)
Number of persistence requests.
temporal.server.poll.success.count
(count)
Number of tasks successfully matched by the poller.
temporal.server.poll.success_sync.count
(count)
Number of tasks successfully matched synchronously.
temporal.server.poll.timeouts.count
(count)
Number of times a poller timed out due to no tasks being available.
temporal.server.query_before_first_workflow_task.count
(count)
temporal.server.query_buffer_exceeded.count
(count)
temporal.server.query_registry_invalid_state.count
(count)
temporal.server.queue.actions.count
(count)
temporal.server.queue.latency_schedule.bucket
(count)
temporal.server.queue.latency_schedule.count
(count)
temporal.server.queue.latency_schedule.sum
(count)

Shown as millisecond
temporal.server.queue.reader.bucket
(count)
temporal.server.queue.reader.count
(count)
temporal.server.queue.reader.sum
(count)
temporal.server.queue.slice.bucket
(count)
temporal.server.queue.slice.count
(count)
temporal.server.queue.slice.sum
(count)
temporal.server.rate_limiter.failures.count
(count)
temporal.server.read_namespace.failures.count
(count)
temporal.server.record_marker_command.count
(count)
temporal.server.reject_workflow_update_message.count
(count)
temporal.server.remote_to_local.matches.count
(count)
temporal.server.remote_to_remote.matches.count
(count)
temporal.server.remove_engine_for_shard.latency.bucket
(count)
temporal.server.remove_engine_for_shard.latency.count
(count)
temporal.server.remove_engine_for_shard.latency.sum
(count)

Shown as millisecond
temporal.server.rename_namespace.failures.count
(count)
temporal.server.rename_namespace.success.count
(count)
temporal.server.replication.dlq.ack_level
(gauge)
temporal.server.replication.dlq.enqueue_failed.count
(count)
temporal.server.replication.dlq.max_level
(gauge)
temporal.server.replication.latency.bucket
(count)
temporal.server.replication.latency.count
(count)
temporal.server.replication.latency.sum
(count)

Shown as millisecond
temporal.server.replication.task_cleanup.count
(count)
temporal.server.replication.task_cleanup.failed.count
(count)
temporal.server.replication.tasks.applied.count
(count)
temporal.server.replication.tasks.applied_latency.bucket
(count)
temporal.server.replication.tasks.applied_latency.count
(count)
temporal.server.replication.tasks.applied_latency.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.failed.count
(count)
temporal.server.replication.tasks.fetched.bucket
(count)
temporal.server.replication.tasks.fetched.count
(count)
temporal.server.replication.tasks.fetched.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.lag.bucket
(count)
temporal.server.replication.tasks.lag.count
(count)
temporal.server.replication.tasks.lag.sum
(count)

Shown as millisecond
temporal.server.replication.tasks.returned.bucket
(count)
temporal.server.replication.tasks.returned.count
(count)
temporal.server.replication.tasks.returned.sum
(count)

Shown as millisecond
temporal.server.replication_events_size.bucket
(count)
temporal.server.replication_events_size.count
(count)
temporal.server.replication_events_size.sum
(count)

Shown as millisecond
temporal.server.replicator.dlq_enqueue_fails.count
(count)
temporal.server.replicator.errors.count
(count)
temporal.server.replicator.latency.bucket
(count)
temporal.server.replicator.latency.count
(count)
temporal.server.replicator.latency.sum
(count)

Shown as millisecond
temporal.server.replicator.messages.count
(count)
temporal.server.request_cancel_info.bucket
(count)
temporal.server.request_cancel_info.count
(count)
temporal.server.request_cancel_info.size.bucket
(count)
temporal.server.request_cancel_info.size.count
(count)
temporal.server.request_cancel_info.size.sum
(count)

Shown as byte
temporal.server.request_cancel_info.sum
(count)
temporal.server.respond_query_failed.count
(count)
temporal.server.scan_duration.bucket
(count)
temporal.server.scan_duration.count
(count)
temporal.server.scan_duration.sum
(count)

Shown as millisecond
temporal.server.scavenger.errors.count
(count)
temporal.server.scavenger.skips.count
(count)
temporal.server.scavenger.success.count
(count)
temporal.server.scavenger.validation.failures.count
(count)
temporal.server.scavenger.validation.requests.count
(count)
temporal.server.scavenger.validation.skips.count
(count)
temporal.server.schedule.action.errors.count
(count)
temporal.server.schedule.action.success.count
(count)
temporal.server.schedule.buffer_overruns.count
(count)
temporal.server.schedule.cancel_workflow.errors.count
(count)
temporal.server.schedule.missed_catchup_window.count
(count)
temporal.server.schedule.rate_limited.count
(count)
temporal.server.schedule.terminate_workflow.errors.count
(count)
temporal.server.schedule_activity_command.count
(count)
temporal.server.schedule_to_close.timeout.count
(count)
temporal.server.schedule_to_start.timeout.count
(count)
temporal.server.search_attributes_size.bucket
(count)
temporal.server.search_attributes_size.count
(count)
temporal.server.search_attributes_size.sum
(count)

Shown as byte
temporal.server.service.authorization_latency.bucket
(count)
temporal.server.service.authorization_latency.count
(count)
temporal.server.service.authorization_latency.sum
(count)

Shown as millisecond
temporal.server.service.error_with_type.count
(count)
Errors encountered by the service, tagged by type.
temporal.server.service.errors.count
(count)
temporal.server.service.errors.critical.count
(count)
temporal.server.service.errors.resource_exhausted.count
(count)
temporal.server.service.errors.shard_ownership_lost.count
(count)
temporal.server.service.errors.task_already_started.count
(count)
temporal.server.service.latency.bucket
(count)
Distribution of latencies for all client request operations.
temporal.server.service.latency.count
(count)
Count of latencies for all client request operations.
temporal.server.service.latency.nouserlatency.bucket
(count)
temporal.server.service.latency.nouserlatency.count
(count)
temporal.server.service.latency.nouserlatency.sum
(count)

Shown as millisecond
temporal.server.service.latency.sum
(count)
Sum of latencies for all client request operations.
Shown as millisecond
temporal.server.service.latency.userlatency.bucket
(count)
temporal.server.service.latency.userlatency.count
(count)
temporal.server.service.latency.userlatency.sum
(count)

Shown as millisecond
temporal.server.service.pending_requests
(gauge)
temporal.server.service.requests.count
(count)
Service requests received per Task Queue.
temporal.server.shard.lock_latency.bucket
(count)
temporal.server.shard.lock_latency.count
(count)
temporal.server.shard.lock_latency.sum
(count)

Shown as millisecond
temporal.server.shard_closed.count
(count)
temporal.server.shard_controller.lock_latency.bucket
(count)
temporal.server.shard_controller.lock_latency.count
(count)
temporal.server.shard_controller.lock_latency.sum
(count)

Shown as millisecond
temporal.server.shardinfo.immediate_queue.lag.bucket
(count)
temporal.server.shardinfo.immediate_queue.lag.count
(count)
temporal.server.shardinfo.immediate_queue.lag.sum
(count)
temporal.server.shardinfo.replication.lag.bucket
(count)
temporal.server.shardinfo.replication.lag.count
(count)
temporal.server.shardinfo.replication.lag.sum
(count)
temporal.server.shardinfo.replication.pending_task.bucket
(count)
temporal.server.shardinfo.replication.pending_task.count
(count)
temporal.server.shardinfo.replication.pending_task.sum
(count)
temporal.server.shardinfo.scheduled_queue.lag.bucket
(count)
temporal.server.shardinfo.scheduled_queue.lag.count
(count)
temporal.server.shardinfo.scheduled_queue.lag.sum
(count)

Shown as millisecond
temporal.server.shardinfo.timer.active.pending_task.bucket
(count)
temporal.server.shardinfo.timer.active.pending_task.count
(count)
temporal.server.shardinfo.timer.active.pending_task.sum
(count)
temporal.server.shardinfo.timer.lag.bucket
(count)
temporal.server.shardinfo.timer.lag.count
(count)
temporal.server.shardinfo.timer.lag.sum
(count)

Shown as millisecond
temporal.server.shardinfo.timer.standby.pending_task.bucket
(count)
temporal.server.shardinfo.timer.standby.pending_task.count
(count)
temporal.server.shardinfo.timer.standby.pending_task.sum
(count)
temporal.server.shardinfo.transfer.active.pending_task.bucket
(count)
temporal.server.shardinfo.transfer.active.pending_task.count
(count)
temporal.server.shardinfo.transfer.active.pending_task.sum
(count)
temporal.server.shardinfo.transfer.lag.bucket
(count)
temporal.server.shardinfo.transfer.lag.count
(count)
temporal.server.shardinfo.transfer.lag.sum
(count)
temporal.server.shardinfo.transfer.standby.pending_task.bucket
(count)
temporal.server.shardinfo.transfer.standby.pending_task.count
(count)
temporal.server.shardinfo.transfer.standby.pending_task.sum
(count)
temporal.server.shardinfo.visibility.lag.bucket
(count)
temporal.server.shardinfo.visibility.lag.count
(count)
temporal.server.shardinfo.visibility.lag.sum
(count)
temporal.server.shardinfo.visibility.pending_task.bucket
(count)
temporal.server.shardinfo.visibility.pending_task.count
(count)
temporal.server.shardinfo.visibility.pending_task.sum
(count)
temporal.server.sharditem.acquisition_latency.bucket
(count)
temporal.server.sharditem.acquisition_latency.count
(count)
temporal.server.sharditem.acquisition_latency.sum
(count)

Shown as millisecond
temporal.server.sharditem.created.count
(count)
temporal.server.sharditem.removed.count
(count)
temporal.server.signal_external_workflow_command.count
(count)
temporal.server.signal_info.bucket
(count)
temporal.server.signal_info.count
(count)
temporal.server.signal_info.size.bucket
(count)
temporal.server.signal_info.size.count
(count)
temporal.server.signal_info.size.sum
(count)

Shown as byte
temporal.server.signal_info.sum
(count)
temporal.server.stale_mutable_state.count
(count)
temporal.server.stale_replication_events.count
(count)
temporal.server.start_timer_command.count
(count)
temporal.server.start_to_close.timeout.count
(count)
temporal.server.started.count
(count)
temporal.server.state_transition.bucket
(count)
temporal.server.state_transition.count
(count)
temporal.server.state_transition.sum
(count)
temporal.server.stopped.count
(count)
temporal.server.sync_throttle.count
(count)
temporal.server.syncmatch.latency.bucket
(count)
temporal.server.syncmatch.latency.count
(count)
temporal.server.syncmatch.latency.sum
(count)

Shown as millisecond
temporal.server.syncshard.remote.count
(count)
temporal.server.syncshard.remote.failed.count
(count)
temporal.server.task.attempt.bucket
(count)
Distribution of the number of attempts on each task execution.
temporal.server.task.attempt.count
(count)
Count of the number of attempts on each task execution.
temporal.server.task.attempt.sum
(count)
Sum of the number of attempts on each task execution.
temporal.server.task.batch_complete_counter.count
(count)
temporal.server.task.bucket
(count)
temporal.server.task.count
(count)
temporal.server.task.deleted
(gauge)
temporal.server.task.dependency_task_not_completed.count
(count)
temporal.server.task.errors.corruption.count
(count)
temporal.server.task.errors.count
(count)
Number of task process errors.
temporal.server.task.errors.discarded.count
(count)
temporal.server.task.errors.limit_exceeded_counter.count
(count)
temporal.server.task.errors.namespace_handover.count
(count)
temporal.server.task.errors.not_active_counter.count
(count)
temporal.server.task.errors.standby_retry_counter.count
(count)
temporal.server.task.errors.throttled.count
(count)
temporal.server.task.errors.version_mismatch.count
(count)
temporal.server.task.errors.workflow_busy.count
(count)
temporal.server.task.lag_per_tl
(gauge)
temporal.server.task.latency.bucket
(count)
Distribution of in-memory latencies across multiple attempts.
temporal.server.task.latency.count
(count)
Count of in-memory latencies across multiple attempts.
temporal.server.task.latency.load.bucket
(count)
Distribution of durations from task generation to task loading.
temporal.server.task.latency.load.count
(count)
Count of durations from task generation to task loading.
temporal.server.task.latency.load.sum
(count)
Sum of durations from task generation to task loading.
Shown as millisecond
temporal.server.task.latency.processing.bucket
(count)
Distribution of task processing latencies per attempt.
temporal.server.task.latency.processing.count
(count)
Count of task processing latencies per attempt.
temporal.server.task.latency.processing.sum
(count)
Sum of task processing latencies per attempt.
Shown as millisecond
temporal.server.task.latency.queue.bucket
(count)
Distribution of durations from when a task is fired to when the task is done.
temporal.server.task.latency.queue.count
(count)
Count of durations from when a task is fired to when the task is done.
temporal.server.task.latency.queue.sum
(count)
Sum of durations from when a task is fired to when the task is done.
Shown as millisecond
temporal.server.task.latency.schedule.bucket
(count)
Distribution of durations from task submission to processing.
temporal.server.task.latency.schedule.count
(count)
Count of durations from task submission to processing.
temporal.server.task.latency.schedule.sum
(count)
Sum of durations from task submission to processing.
Shown as millisecond
temporal.server.task.latency.sum
(count)
Sum of in-memory latencies across multiple attempts.
Shown as millisecond
temporal.server.task.latency.user.bucket
(count)
temporal.server.task.latency.user.count
(count)
temporal.server.task.latency.user.sum
(count)

Shown as millisecond
temporal.server.task.processed
(gauge)
temporal.server.task.requests.count
(count)
Number of task process requests.
temporal.server.task.schedule_to_start_latency.bucket
(count)
temporal.server.task.schedule_to_start_latency.count
(count)
temporal.server.task.schedule_to_start_latency.sum
(count)

Shown as millisecond
temporal.server.task.skipped.count
(count)
temporal.server.task.sum
(count)
temporal.server.task.write.latency.bucket
(count)
temporal.server.task.write.latency.count
(count)
temporal.server.task.write.latency.sum
(count)

Shown as millisecond
temporal.server.task.write.throttle.count
(count)
temporal.server.task_queue.started.count
(count)
temporal.server.task_queue.stopped.count
(count)
temporal.server.task_rescheduler.pending_tasks.bucket
(count)
temporal.server.task_rescheduler.pending_tasks.count
(count)
temporal.server.task_rescheduler.pending_tasks.sum
(count)
temporal.server.task_scheduler.throttled.count
(count)
temporal.server.taskqueue.deleted
(gauge)
temporal.server.taskqueue.outstanding
(gauge)
temporal.server.taskqueue.processed
(gauge)
temporal.server.tasks_expired.count
(count)
temporal.server.timer_info.bucket
(count)
temporal.server.timer_info.count
(count)
temporal.server.timer_info.size.bucket
(count)
temporal.server.timer_info.size.count
(count)
temporal.server.timer_info.size.sum
(count)

Shown as byte
temporal.server.timer_info.sum
(count)
temporal.server.transfer_task.missing_event_counter.count
(count)
temporal.server.unbuffer_replication_tasks.bucket
(count)
temporal.server.unbuffer_replication_tasks.count
(count)
temporal.server.unbuffer_replication_tasks.sum
(count)

Shown as millisecond
temporal.server.update_namespace.failures.count
(count)
temporal.server.upsert_workflow_search_attributes_command.count
(count)
temporal.server.version_check.failed.count
(count)
temporal.server.version_check.latency.bucket
(count)
temporal.server.version_check.latency.count
(count)
temporal.server.version_check.latency.sum
(count)

Shown as millisecond
temporal.server.version_check.request_failed.count
(count)
temporal.server.version_check.success.count
(count)
temporal.server.visibility.archiver.archive.non_retryable_error.count
(count)
temporal.server.visibility.archiver.archive.success.count
(count)
temporal.server.visibility.archiver.archive.transient_error.count
(count)
temporal.server.visibility.persistence.error_with_type.count
(count)
temporal.server.visibility.persistence.errors.count
(count)
temporal.server.visibility.persistence.latency.bucket
(count)
temporal.server.visibility.persistence.latency.count
(count)
temporal.server.visibility.persistence.latency.sum
(count)

Shown as millisecond
temporal.server.visibility.persistence.requests.count
(count)
temporal.server.visibility.persistence.resource_exhausted.count
(count)
temporal.server.wf_too_many_pending.activities.count
(count)
temporal.server.wf_too_many_pending.cancel_requests.count
(count)
temporal.server.wf_too_many_pending.child_workflows.count
(count)
temporal.server.wf_too_many_pending.external_workflow_signals.count
(count)
temporal.server.worker_not_supports_consistent_query.count
(count)
temporal.server.workflow.cancel.count
(count)
temporal.server.workflow.cleanup.archive.count
(count)
temporal.server.workflow.cleanup.delete.count
(count)
temporal.server.workflow.cleanup.delete_history_inline.count
(count)
temporal.server.workflow.cleanup.nop.count
(count)
temporal.server.workflow.context_cleared.count
(count)
temporal.server.workflow.continued_as_new.count
(count)
temporal.server.workflow.cron_backoff.timer.count
(count)
temporal.server.workflow.eager_execution.count
(count)
temporal.server.workflow.eager_execution.denied.count
(count)
temporal.server.workflow.failed.count
(count)
temporal.server.workflow.retry_backoff.timer.count
(count)
temporal.server.workflow.run_timeout_overrides.count
(count)
temporal.server.workflow.success.count
(count)
temporal.server.workflow.task.attempt.bucket
(count)
temporal.server.workflow.task.attempt.count
(count)
temporal.server.workflow.task.attempt.sum
(count)
temporal.server.workflow.task.heartbeat_timeout.count
(count)
temporal.server.workflow.task.query_latency.bucket
(count)
temporal.server.workflow.task.query_latency.count
(count)
temporal.server.workflow.task.query_latency.sum
(count)

Shown as millisecond
temporal.server.workflow.task.timeout_overrides.count
(count)
temporal.server.workflow.terminate.count
(count)
temporal.server.workflow.timeout.count
(count)
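
Many of the latency metrics listed above are exposed as a `.bucket`, `.count`, and `.sum` triplet. As a minimal sketch, assuming you have already queried the increase of `temporal.server.service.latency.sum` and `temporal.server.service.latency.count` over the same time window, the mean latency for that window is the ratio of the two. The function name and the sample values below are hypothetical and only illustrate the arithmetic.

```python
from typing import Optional


def mean_latency_ms(sum_delta_ms: float, count_delta: float) -> Optional[float]:
    """Return the mean latency in milliseconds for one time window.

    sum_delta_ms: increase of a `.sum` metric over the window (milliseconds).
    count_delta:  increase of the matching `.count` metric over the same window.
    Returns None when no observations were recorded, to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta_ms / count_delta


# Hypothetical window: 30 requests that took 1500 ms in total.
window_mean = mean_latency_ms(1500.0, 30)
```

The same ratio applies to any other `.sum` and `.count` pair in the list, as long as both values cover the same window and carry the same tags.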

Events

The Temporal integration does not include any events.

Service Checks

temporal.server.openmetrics.health

Returns CRITICAL if the check cannot access the Prometheus metrics endpoint of the Temporal instance.

Statuses: ok, critical
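
To reproduce the condition behind this service check by hand, you can probe the metrics endpoint yourself. The following is a rough sketch, not the Agent's actual implementation: `probe_openmetrics_endpoint` is a hypothetical helper, and the URL you pass must be whatever you set as `openmetrics_endpoint` in your instance configuration.

```python
import http.client
import urllib.error
import urllib.request


def probe_openmetrics_endpoint(url: str, timeout: float = 2.0) -> str:
    """Return "ok" if the endpoint answers with a 2xx status, else "critical".

    This only mirrors the documented meaning of the service check
    (endpoint reachable or not); it does not parse the metrics payload.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "ok" if 200 <= response.status < 300 else "critical"
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP error statuses,
        # malformed responses, and malformed URLs.
        return "critical"
```

If the helper returns "critical" from the host where the Agent runs, check the `listenAddress` and `handlerPath` values in your Temporal server configuration before looking at the Agent itself.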

Logs

The Temporal integration can collect logs from the Temporal cluster and forward them to Datadog.

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles: