Gitlab

Agent Check

Supported OS: Linux, Mac OS, Windows

Overview

This integration lets you:

  • Visualize and monitor metrics collected from Gitlab through Prometheus

See the Gitlab documentation for more information about Gitlab and its integration with Prometheus.

Setup

Installation

The Gitlab check is included in the Datadog Agent package, so you don’t need to install anything else on your Gitlab servers.

Configuration

Host

Follow the instructions below to configure this check for an Agent running on a host. For containerized environments, see the Containerized section.

Metric collection
  1. Edit the gitlab.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to point to Gitlab’s metrics endpoint. See the sample gitlab.d/conf.yaml for all available configuration options.

    Note: The metrics in metrics.py are collected by default. The allowed_metrics configuration option in the init_config collects specific legacy metrics. Some metrics may not be collected depending on your Gitlab instance version and configuration. See Gitlab’s documentation for further information about its metric collection.

  2. Allow access to monitoring endpoints by updating your /etc/gitlab/gitlab.rb to include the following line:

    gitlab_rails['monitoring_whitelist'] = ['127.0.0.0/8', '192.168.0.1']

    Note: Save the file and reconfigure Gitlab for the changes to take effect.

  3. Restart the Agent.
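
If the Agent still cannot reach the monitoring endpoints after step 2, it can help to confirm that the address the Agent connects from is covered by the whitelist. The sketch below is a minimal illustration using Python's standard ipaddress module; the function name is_whitelisted and the sample addresses are hypothetical, and Gitlab's own matching logic may differ in detail.

```python
import ipaddress

# Entries copied from the gitlab.rb example above.
MONITORING_WHITELIST = ["127.0.0.0/8", "192.168.0.1"]


def is_whitelisted(agent_ip, whitelist=MONITORING_WHITELIST):
    """Return True if agent_ip falls inside any whitelist entry.

    A bare address such as 192.168.0.1 is treated as a single-host network.
    """
    address = ipaddress.ip_address(agent_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


if __name__ == "__main__":
    # Hypothetical source addresses for an Agent.
    for ip in ("127.0.0.1", "192.168.0.1", "192.168.0.2"):
        print(ip, is_whitelisted(ip))
```

An Agent running on the same host as Gitlab normally connects from a loopback address, which the 127.0.0.0/8 entry covers. An Agent on another host needs its own address or network added to the list.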

Log collection
  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
  2. Next, edit gitlab.d/conf.yaml by uncommenting the logs lines at the bottom. Update the logs path with the correct path to your Gitlab log files.

     logs:
       - type: file
         path: /var/log/gitlab/gitlab-rails/production_json.log
         service: '<SERVICE_NAME>'
         source: gitlab
       - type: file
         path: /var/log/gitlab/gitlab-rails/production.log
         service: '<SERVICE_NAME>'
         source: gitlab
       - type: file
         path: /var/log/gitlab/gitlab-rails/api_json.log
         service: '<SERVICE_NAME>'
         source: gitlab
  3. Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Metric collection
Parameter             Value
<INTEGRATION_NAME>    gitlab
<INIT_CONFIG>         blank or {}
<INSTANCE_CONFIG>     {"gitlab_url":"http://%%host%%/", "prometheus_endpoint":"http://%%host%%:10055/-/metrics"}

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker log collection.

Parameter       Value
<LOG_CONFIG>    {"source": "gitlab", "service": "gitlab"}
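
The parameters listed above map onto Autodiscovery templates. As one illustration, the sketch below renders them as Docker container labels. The com.datadoghq.ad.* label names follow Datadog's Docker Autodiscovery convention, while the helper name build_autodiscovery_labels is hypothetical. Treat it as a starting point and confirm the exact form for your platform against the Autodiscovery Integration Templates documentation.

```python
import json


def build_autodiscovery_labels():
    """Render the Gitlab Autodiscovery parameters as Docker label strings."""
    # <INSTANCE_CONFIG> from the metric collection parameters.
    instance = {
        "gitlab_url": "http://%%host%%/",
        "prometheus_endpoint": "http://%%host%%:10055/-/metrics",
    }
    # <LOG_CONFIG> from the log collection parameters.
    log_config = {"source": "gitlab", "service": "gitlab"}
    return {
        "com.datadoghq.ad.check_names": json.dumps(["gitlab"]),
        "com.datadoghq.ad.init_configs": json.dumps([{}]),
        "com.datadoghq.ad.instances": json.dumps([instance]),
        "com.datadoghq.ad.logs": json.dumps([log_config]),
    }


if __name__ == "__main__":
    for name, value in build_autodiscovery_labels().items():
        print(f"{name}='{value}'")
```

Each label value is a JSON list, so the check name, init config, and instance config stay aligned by position.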

Validation

Run the Agent’s status subcommand and look for gitlab under the Checks section.

Data Collected

Metrics

gitlab.banzai.cached_render_real_duration_seconds.count
(count)
Count of duration of rendering Markdown into HTML when cached output exists
Shown as second
gitlab.banzai.cached_render_real_duration_seconds.sum
(gauge)
Sum of duration of rendering Markdown into HTML when cached output exists
Shown as second
gitlab.banzai.cacheless_render_real_duration_seconds.count
(count)
Count of duration of rendering Markdown into HTML when cached output does not exist
Shown as second
gitlab.banzai.cacheless_render_real_duration_seconds.sum
(gauge)
Sum of duration of rendering Markdown into HTML when cached output does not exist
Shown as second
gitlab.cache.misses_total
(count)
Cache read miss count
Shown as second
gitlab.cache.operation_duration_seconds_count
(count)
Count of cache access time
Shown as second
gitlab.cache.operation_duration_seconds_sum
(gauge)
Sum of cache access time
Shown as second
gitlab.cache.operations_total
(count)
Count of cache operations by controller/action
gitlab.job.waiter_started_total
(count)
Number of batches of jobs started where a web request is waiting for the jobs to complete
Shown as job
gitlab.job.waiter_timeouts_total
(count)
Number of batches of jobs that timed out where a web request is waiting for the jobs to complete
Shown as job
gitlab.database.transaction_seconds.count
(count)
Count of time spent in database transactions
Shown as second
gitlab.database_transaction_seconds.sum
(gauge)
Sum of time spent in database transactions
Shown as second
gitlab.method_call_duration_seconds.count
(count)
Count of method calls real duration
Shown as second
gitlab.method_call_duration_seconds.sum
(gauge)
Sum of method calls real duration
Shown as second
gitlab.page_out_of_bounds
(count)
Counter for the PageLimiter pagination limit being hit
gitlab.rails_queue_duration_seconds.count
(count)
Counter for latency between GitLab Workhorse forwarding a request to Rails
Shown as second
gitlab.rails_queue_duration_seconds.sum
(gauge)
Sum of latency between GitLab Workhorse forwarding a request to Rails
Shown as second
gitlab.sql_duration_seconds.count
(count)
Count of SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT
Shown as second
gitlab.sql_duration_seconds.sum
(gauge)
Sum of SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT
Shown as second
gitlab.transaction.allocated_memory_bytes.count
(count)
Count of allocated memory for all transactions (gitlab_transaction_* metrics)
Shown as byte
gitlab.transaction.allocated_memory_bytes.sum
(gauge)
Sum of allocated memory for all transactions (gitlab_transaction_* metrics)
Shown as byte
gitlab.transaction.cache_count_total
(count)
Counter for total Rails cache calls (aggregate)
gitlab.transaction.cache_duration_total
(count)
Counter for total time (seconds) spent in Rails cache calls (aggregate)
Shown as second
gitlab.transaction.cache_read_hit_count_total
(count)
Counter for cache hits for Rails cache calls
Shown as hit
gitlab.transaction.cache_read_miss_count_total
(count)
Counter for cache misses for Rails cache calls
Shown as miss
gitlab.transaction.duration_seconds.count
(count)
Count of duration for all transactions (gitlab_transaction_* metrics)
Shown as second
gitlab.transaction.duration_seconds.sum
(gauge)
Sum of duration for all transactions (gitlab_transaction_* metrics)
Shown as second
gitlab.transaction.event_build_found_total
(count)
Counter for build found for API /jobs/request
gitlab.transaction.event_build_invalid_total
(count)
Counter for build invalid due to concurrency conflict for API /jobs/request
gitlab.transaction.event_build_not_found_cached_total
(count)
Counter for cached response of build not found for API /jobs/request
gitlab.transaction.event_build_not_found_total
(count)
Counter for build not found for API /jobs/request
gitlab.transaction.event_change_default_branch_total
(count)
Counter when default branch is changed for any repository
gitlab.transaction.event_create_repository_total
(count)
Counter when any repository is created
gitlab.transaction.event_etag_caching_cache_hit_total
(count)
Counter for etag cache hit.
Shown as hit
gitlab.transaction.event_etag_caching_header_missing_total
(count)
Counter for etag cache miss - header missing
Shown as miss
gitlab.transaction.event_etag_caching_key_not_found_total
(count)
Counter for etag cache miss - key not found
Shown as miss
gitlab.transaction.event_etag_caching_middleware_used_total
(count)
Counter for etag middleware accessed
gitlab.transaction.event_etag_caching_resource_changed_total
(count)
Counter for etag cache miss - resource changed
gitlab.transaction.event_fork_repository_total
(count)
Counter for repository forks (RepositoryForkWorker). Only incremented when source repository exists
gitlab.transaction.event_import_repository_total
(count)
Counter for repository imports (RepositoryImportWorker)
gitlab.transaction.event_push_branch_total
(count)
Counter for all branch pushes
gitlab.transaction.event_push_commit_total
(count)
Counter for commits
gitlab.transaction.event_push_tag_total
(count)
Counter for tag pushes
gitlab.transaction.event_rails_exception_total
(count)
Counter for number of rails exceptions
gitlab.transaction.event_receive_email_total
(count)
Counter for received emails
Shown as email
gitlab.transaction.event_remote_mirrors_failed_total
(count)
Counter for failed remote mirrors
gitlab.transaction.event_remote_mirrors_finished_total
(count)
Counter for finished remote mirrors
gitlab.transaction.event_remote_mirrors_running_total
(count)
Counter for running remote mirrors
gitlab.transaction.event_remove_branch_total
(count)
Counter when a branch is removed for any repository
gitlab.transaction.event_remove_repository_total
(count)
Counter when a repository is removed
gitlab.transaction.event_remove_tag_total
(count)
Counter when a tag is removed for any repository
gitlab.transaction.event_sidekiq_exception_total
(count)
Counter of Sidekiq exceptions
gitlab.transaction.event_stuck_import_jobs_total
(count)
Count of stuck import jobs
gitlab.transaction.event_update_build_total
(count)
Counter for update build for API /jobs/request/:id
gitlab.transaction.new_redis_connections_total
(count)
Counter for new Redis connections
Shown as connection
gitlab.transaction.queue_duration_total
(count)
Duration jobs were enqueued before processing
gitlab.transaction.rails_queue_duration_total
(count)
Measures latency between GitLab Workhorse forwarding a request to Rails
gitlab.transaction.view_duration_total
(count)
Duration for views
gitlab.view_rendering_duration_seconds.count
(count)
Count of duration for views (histogram)
Shown as second
gitlab.view_rendering_duration_seconds.sum
(count)
Sum of duration for views (histogram)
Shown as second
gitlab.rack.http_requests_total.count
(count)
Rack request count
Shown as request
gitlab.rack.http_requests_total.sum
(gauge)
Sum of rack requests
Shown as request
gitlab.rack.http_request_duration_seconds.sum
(gauge)
Sum of HTTP response time from rack middleware
Shown as second
gitlab.rack.http_request_duration_seconds.count
(count)
Count of HTTP response time from rack middleware
Shown as second
gitlab.rack.uncaught_errors_total
(count)
Rack connections handling uncaught errors count
Shown as connection
gitlab.pipelines_created_total
(count)
Counter of pipelines created
gitlab.user_session_logins_total
(count)
Counter of how many users have logged in
gitlab.upload_file_does_not_exist
(count)
Number of times an upload record could not find its file
gitlab.failed_login_captcha_total
(gauge)
Counter of failed CAPTCHA attempts during login
gitlab.successful_login_captcha_total
(gauge)
Counter of successful CAPTCHA attempts during login
gitlab.auto_devops_pipelines_completed_total
(count)
Counter of completed Auto DevOps pipelines, labeled by status
gitlab.sidekiq.jobs_cpu_seconds.count
(count)
Count of seconds of CPU time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_cpu_seconds.sum
(gauge)
Sum of seconds of CPU time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_completion_seconds.count
(count)
Count of seconds to complete Sidekiq job
Shown as second
gitlab.sidekiq.jobs_db_second.count
(count)
Count of seconds of DB time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_db_second.sum
(gauge)
Sum of seconds of DB time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_gitaly_seconds.count
(count)
Count of seconds of Gitaly time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_gitaly_seconds.sum
(gauge)
Sum of seconds of Gitaly time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_completion_seconds.sum
(gauge)
Sum of seconds to complete Sidekiq job
Shown as second
gitlab.sidekiq.jobs_queue_duration_seconds.count
(count)
Count of duration in seconds that a Sidekiq job was queued before being executed
Shown as second
gitlab.sidekiq.jobs_queue_duration_seconds.sum
(gauge)
Sum of duration in seconds that a Sidekiq job was queued before being executed
Shown as second
gitlab.sidekiq.jobs_failed_total
(count)
Sidekiq jobs failed
Shown as job
gitlab.sidekiq.jobs_retried_total
(count)
Sidekiq jobs retried
Shown as job
gitlab.sidekiq.running_jobs
(gauge)
Number of Sidekiq jobs running
Shown as job
gitlab.sidekiq.concurrency
(gauge)
Maximum number of Sidekiq jobs
Shown as job
gitlab.ruby.gc_duration_seconds
(gauge)
Time spent by Ruby in GC
Shown as second
gitlab.ruby.file_descriptors
(gauge)
File descriptors per process
gitlab.ruby.memory_bytes
(gauge)
Memory usage by process
Shown as byte
gitlab.ruby.sampler_duration_seconds_total
(count)
Time spent collecting stats
Shown as second
gitlab.ruby.process_cpu_seconds_total
(gauge)
Total amount of CPU time per process
Shown as second
gitlab.ruby.process_max_fds
(gauge)
Maximum number of open file descriptors per process
gitlab.ruby.process_resident_memory_bytes
(gauge)
Memory usage by process
Shown as byte
gitlab.ruby.process_start_time_seconds
(gauge)
UNIX timestamp of process start time
Shown as second
gitlab.ruby.gc_stat.count
(gauge)
Number of Ruby garbage collections
gitlab.ruby.gc_stat.heap_allocated_pages
(gauge)
Number of currently allocated heap pages
Shown as page
gitlab.ruby.gc_stat.heap_sorted_length
(gauge)
Length of the heap in memory
gitlab.ruby.gc_stat.heap_allocatable_pages
(gauge)
Number of malloced pages that can be used
Shown as page
gitlab.ruby.gc_stat.heap_available_slots
(gauge)
Number of slots in heap pages
gitlab.ruby.gc_stat.heap_live_slots
(gauge)
Number of live slots in heap
gitlab.ruby.gc_stat.heap_free_slots
(gauge)
Number of empty slots in heap
gitlab.ruby.gc_stat.heap_final_slots
(gauge)
Number of slots in heap with finalizers
gitlab.ruby.gc_stat.heap_marked_slots
(gauge)
Number of slots that are marked, or old
Shown as page
gitlab.ruby.gc_stat.heap_eden_pages
(gauge)
Number of heap pages that contain a live object
Shown as page
gitlab.ruby.gc_stat.heap_tomb_pages
(gauge)
Number of heap pages that do not contain a live object
Shown as page
gitlab.ruby.gc_stat.total_allocated_pages
(gauge)
Number of pages allocated
Shown as page
gitlab.ruby.gc_stat.total_freed_pages
(gauge)
Number of pages freed
Shown as page
gitlab.ruby.gc_stat.total_allocated_objects
(gauge)
Number of allocated objects
gitlab.ruby.gc_stat.total_freed_objects
(gauge)
Number of freed objects
gitlab.ruby.gc_stat.malloc_increase_bytes
(gauge)
Number of bytes allocated outside of the heap
Shown as byte
gitlab.ruby.gc_stat.malloc_increase_bytes_limit
(gauge)
The limit to how many bytes can be allocated outside of the heap
Shown as byte
gitlab.ruby.gc_stat.minor_gc_count
(gauge)
Number of minor garbage collections
Shown as garbage collection
gitlab.ruby.gc_stat.major_gc_count
(gauge)
Number of major garbage collections
Shown as garbage collection
gitlab.ruby.gc_stat.remembered_wb_unprotected_objects
(gauge)
Number of old objects that reference new objects
gitlab.ruby.gc_stat.remembered_wb_unprotected_objects_limit
(gauge)
The limit of wb unprotected objects
gitlab.ruby.gc_stat.old_objects
(gauge)
The number of old objects
gitlab.ruby.gc_stat.old_objects_limit
(gauge)
The limit of number of old objects
gitlab.ruby.gc_stat.oldmalloc_increase_bytes
(gauge)
Number of bytes allocated outside of the heap for old objects
Shown as byte
gitlab.ruby.gc_stat.oldmalloc_increase_bytes_limit
(gauge)
The limit of how many bytes can be allocated outside of the heap for old objects
Shown as byte
gitlab.geo.db_replication_lag_seconds
(gauge)
Database replication lag (seconds)
Shown as second
gitlab.geo.repositories
(gauge)
Total number of repositories available on primary
gitlab.geo.repositories_synced
(gauge)
Number of repositories synced on secondary
gitlab.geo.repositories_failed
(gauge)
Number of repositories failed to sync on secondary
gitlab.geo.lfs_objects
(gauge)
Total number of LFS objects available on primary
gitlab.geo.lfs_objects_synced
(gauge)
Number of LFS objects synced on secondary
gitlab.geo.lfs_objects_failed
(gauge)
Number of LFS objects failed to sync on secondary
gitlab.geo.attachments
(gauge)
Total number of file attachments available on primary
gitlab.geo.attachments_synced
(gauge)
Number of attachments synced on secondary
gitlab.geo.attachments_failed
(gauge)
Number of attachments failed to sync on secondary
gitlab.geo.last_event_id
(gauge)
Database ID of the latest event log entry on the primary
gitlab.geo.last_event_timestamp
(gauge)
UNIX timestamp of the latest event log entry on the primary
gitlab.geo.cursor_last_event_id
(gauge)
Last database ID of the event log processed by the secondary
gitlab.geo.cursor_last_event_timestamp
(gauge)
Last UNIX timestamp of the event log processed by the secondary
gitlab.geo.status_failed_total
(count)
Number of times retrieving the status from the Geo Node failed
gitlab.geo.last_successful_status_check_timestamp
(gauge)
Last timestamp when the status was successfully updated
gitlab.geo.lfs_objects_synced_missing_on_primary
(gauge)
Number of LFS objects marked as synced due to the file missing on the primary
gitlab.geo.job_artifacts_synced_missing_on_primary
(gauge)
Number of job artifacts marked as synced due to the file missing on the primary
gitlab.geo.attachments_synced_missing_on_primary
(gauge)
Number of attachments marked as synced due to the file missing on the primary
gitlab.geo.repositories_checksummed_count
(gauge)
Number of repositories checksummed on primary
gitlab.geo.repositories_checksum_failed_count
(gauge)
Number of repositories failed to calculate the checksum on primary
gitlab.geo.wikis_checksummed_count
(gauge)
Number of wikis checksummed on primary
gitlab.geo.wikis_checksum_failed_count
(gauge)
Number of wikis failed to calculate the checksum on primary
gitlab.geo.repositories_verified_count
(gauge)
Number of repositories verified on secondary
gitlab.geo.repositories_verification_failed_count
(gauge)
Number of repositories failed to verify on secondary
gitlab.geo.repositories_checksum_mismatch_count
(gauge)
Number of repositories with a checksum mismatch on secondary
gitlab.geo.wikis_verified_count
(gauge)
Number of wikis verified on secondary
gitlab.geo.wikis_verification_failed_count
(gauge)
Number of wikis failed to verify on secondary
gitlab.geo.wikis_checksum_mismatch_count
(gauge)
Number of wikis with a checksum mismatch on secondary
gitlab.geo.repositories_checked_count
(gauge)
Number of repositories that have been checked via git fsck
gitlab.geo.repositories_checked_failed_count
(gauge)
Number of repositories that have a failure from git fsck
gitlab.geo.repositories_retrying_verification_count
(gauge)
Number of repository verification failures that Geo is actively trying to correct on secondary
gitlab.geo.wikis_retrying_verification_count
(gauge)
Number of wiki verification failures that Geo is actively trying to correct on secondary
gitlab.db_load_balancing_hosts
(gauge)
Current number of load balancing hosts
Shown as host
gitlab.unicorn.active_connections
(gauge)
The number of active Unicorn connections (workers)
Shown as connection
gitlab.unicorn.queued_connections
(gauge)
The number of queued Unicorn connections
Shown as connection
gitlab.unicorn.workers
(gauge)
The number of Unicorn workers
Shown as worker
gitlab.puma.workers
(gauge)
Total number of puma workers
Shown as worker
gitlab.puma.running_workers
(gauge)
Number of booted puma workers
Shown as worker
gitlab.puma.stale_workers
(gauge)
Number of old puma workers
Shown as worker
gitlab.puma.running
(gauge)
Number of running puma threads
Shown as thread
gitlab.puma.queued_connections
(gauge)
Number of connections in that puma worker’s “todo” set waiting for a worker thread
Shown as connection
gitlab.puma.active_connections
(gauge)
Number of puma threads processing a request
Shown as thread
gitlab.puma.pool_capacity
(gauge)
Number of requests the puma worker is capable of taking right now
Shown as request
gitlab.puma.max_threads
(gauge)
Maximum number of puma worker threads
Shown as thread
gitlab.puma.idle_threads
(gauge)
Number of spawned puma threads which are not processing a request
Shown as thread
gitlab.puma.killer_terminations_total
(gauge)
Number of workers terminated by PumaWorkerKiller
Shown as worker
gitlab.go_gc_duration_seconds
(gauge)
A summary of the GC invocation durations
Shown as request
gitlab.go_gc_duration_seconds_sum
(gauge)
Sum of the GC invocation durations
Shown as request
gitlab.go_gc_duration_seconds_count
(gauge)
Count of the GC invocation durations
Shown as request
gitlab.go_goroutines
(gauge)
Number of goroutines that currently exist
Shown as request
gitlab.go_memstats_alloc_bytes
(gauge)
Number of bytes allocated and still in use
Shown as byte
gitlab.go_memstats_alloc_bytes_total
(count)
Total number of bytes allocated
Shown as byte
gitlab.go_memstats_buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table
Shown as byte
gitlab.go_memstats_frees_total
(count)
Total number of frees
Shown as request
gitlab.go_memstats_gc_cpu_fraction
(gauge)
The fraction of this program's available CPU time used by the GC since the program started
Shown as request
gitlab.go_memstats_gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata
Shown as byte
gitlab.go_memstats_heap_alloc_bytes
(gauge)
Number of heap bytes allocated and still in use
Shown as byte
gitlab.go_memstats_heap_idle_bytes
(gauge)
Number of heap bytes waiting to be used
Shown as byte
gitlab.go_memstats_heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
gitlab.go_memstats_heap_objects
(gauge)
Number of allocated objects
Shown as request
gitlab.go_memstats_heap_released_bytes_total
(count)
Total number of heap bytes released to OS
Shown as byte
gitlab.go_memstats_heap_sys_bytes
(gauge)
Number of heap bytes obtained from system
Shown as byte
gitlab.go_memstats_last_gc_time_seconds
(gauge)
Number of seconds since 1970 of last garbage collection
Shown as request
gitlab.go_memstats_lookups_total
(count)
Total number of pointer lookups
Shown as request
gitlab.go_memstats_mallocs_total
(count)
Total number of mallocs
Shown as request
gitlab.go_memstats_mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures
Shown as byte
gitlab.go_memstats_mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system
Shown as byte
gitlab.go_memstats_mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures
Shown as byte
gitlab.go_memstats_mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system
Shown as byte
gitlab.go_memstats_next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
gitlab.go_memstats_other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
gitlab.go_memstats_stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
gitlab.go_memstats_stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
gitlab.go_memstats_sys_bytes
(gauge)
Number of bytes obtained by system. Sum of all system allocations
Shown as byte
gitlab.go_threads
(gauge)
Number of OS threads created
Shown as request
gitlab.http_request_duration_microseconds
(gauge)
The HTTP request latencies in microseconds
Shown as request
gitlab.http_request_size_bytes
(gauge)
The HTTP request sizes in bytes
Shown as byte
gitlab.http_requests_total
(count)
Total number of HTTP requests made
Shown as request
gitlab.http_response_size_bytes
(gauge)
The HTTP response sizes in bytes
Shown as byte
gitlab.process_cpu_seconds_total
(count)
Total user and system CPU time spent in seconds
Shown as request
gitlab.process_max_fds
(gauge)
Maximum number of open file descriptors
Shown as request
gitlab.process_open_fds
(gauge)
Number of open file descriptors
Shown as request
gitlab.process_resident_memory_bytes
(gauge)
Resident memory size in bytes
Shown as byte
gitlab.process_start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds
Shown as request
gitlab.process_virtual_memory_bytes
(gauge)
Virtual memory size in bytes
Shown as byte
gitlab.prometheus_build_info
(gauge)
A metric with a constant '1' value labeled by version, revision, branch, and goversion from which prometheus was built
Shown as request
gitlab.prometheus_config_last_reload_success_timestamp_seconds
(gauge)
Timestamp of the last successful configuration reload
Shown as request
gitlab.prometheus_config_last_reload_successful
(gauge)
Whether the last configuration reload attempt was successful
Shown as request
gitlab.prometheus_engine_queries
(gauge)
The current number of queries being executed or waiting
Shown as request
gitlab.prometheus_engine_queries_concurrent_max
(gauge)
The max number of concurrent queries
Shown as request
gitlab.prometheus_engine_query_duration_seconds
(gauge)
Query timing
Shown as request
gitlab.prometheus_evaluator_duration_seconds
(gauge)
The duration of rule group evaluations
Shown as request
gitlab.prometheus_evaluator_iterations_missed_total
(count)
The total number of rule group evaluations missed due to slow rule group evaluation
Shown as request
gitlab.prometheus_evaluator_iterations_skipped_total
(count)
The total number of rule group evaluations skipped due to throttled metric storage
Shown as request
gitlab.prometheus_evaluator_iterations_total
(count)
The total number of scheduled rule group evaluations, whether executed, missed, or skipped
Shown as request
gitlab.prometheus_local_storage_checkpoint_duration_seconds
(gauge)
The duration in seconds taken for checkpointing open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpoint_last_duration_seconds
(gauge)
The duration in seconds it took to last checkpoint open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpoint_last_size_bytes
(gauge)
The size of the last checkpoint of open chunks and chunks yet to be persisted
Shown as byte
gitlab.prometheus_local_storage_checkpoint_series_chunks_written
(gauge)
The number of chunks written per series while checkpointing open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpointing
(gauge)
1 if the storage is checkpointing and 0 otherwise
Shown as request
gitlab.prometheus_local_storage_chunk_ops_total
(count)
The total number of chunk operations by their type
Shown as request
gitlab.prometheus_local_storage_chunks_to_persist
(count)
The current number of chunks waiting for persistence
Shown as request
gitlab.prometheus_local_storage_fingerprint_mappings_total
(count)
The total number of fingerprints being mapped to avoid collisions
Shown as request
gitlab.prometheus_local_storage_inconsistencies_total
(count)
A counter incremented each time an inconsistency in the local storage is detected. If this is greater than zero, restart the server as soon as possible
Shown as request
gitlab.prometheus_local_storage_indexing_batch_duration_seconds
(gauge)
Quantiles for batch indexing duration in seconds
Shown as request
gitlab.prometheus_local_storage_indexing_batch_sizes
(gauge)
Quantiles for indexing batch sizes (number of metrics per batch)
Shown as request
gitlab.prometheus_local_storage_indexing_queue_capacity
(gauge)
The capacity of the indexing queue
Shown as request
gitlab.prometheus_local_storage_indexing_queue_length
(gauge)
The number of metrics waiting to be indexed
Shown as request
gitlab.prometheus_local_storage_ingested_samples_total
(count)
The total number of samples ingested
Shown as request
gitlab.prometheus_local_storage_maintain_series_duration_seconds
(gauge)
The duration in seconds it took to perform maintenance on a series
Shown as request
gitlab.prometheus_local_storage_memory_chunkdescs
(gauge)
The current number of chunk descriptors in memory
Shown as request
gitlab.prometheus_local_storage_memory_chunks
(gauge)
The current number of chunks in memory. The number does not include cloned chunks (i.e. chunks without a descriptor)
Shown as request
gitlab.prometheus_local_storage_memory_dirty_series
(gauge)
The current number of series that would require a disk seek during crash recovery
Shown as request
gitlab.prometheus_local_storage_memory_series
(gauge)
The current number of series in memory
Shown as request
gitlab.prometheus_local_storage_non_existent_series_matches_total
(count)
How often a non-existent series was referred to during label matching or chunk preloading. This is an indication of outdated label indexes
Shown as request
gitlab.prometheus_local_storage_open_head_chunks
(gauge)
The current number of open head chunks
Shown as request
gitlab.prometheus_local_storage_out_of_order_samples_total
(count)
The total number of samples that were discarded because their timestamps were at or before the last received sample for a series
Shown as request
gitlab.prometheus_local_storage_persist_errors_total
(count)
The total number of errors while writing to the persistence layer
Shown as request
gitlab.prometheus_local_storage_persistence_urgency_score
(gauge)
A score of urgency to persist chunks. 0 is least urgent and 1 most
Shown as request
gitlab.prometheus_local_storage_queued_chunks_to_persist_total
(count)
The total number of chunks queued for persistence
Shown as request
gitlab.prometheus_local_storage_rushed_mode
(gauge)
1 if the storage is in rushed mode and 0 otherwise
Shown as request
gitlab.prometheus_local_storage_series_chunks_persisted
(gauge)
The number of chunks persisted per series
Shown as request
gitlab.prometheus_local_storage_series_ops_total
(count)
The total number of series operations by their type
Shown as request
gitlab.prometheus_local_storage_started_dirty
(gauge)
Whether the local storage was found to be dirty (and crash recovery occurred) during Prometheus startup
Shown as request
gitlab.prometheus_local_storage_target_heap_size_bytes
(gauge)
The configured target heap size in bytes
Shown as byte
gitlab.prometheus_notifications_alertmanagers_discovered
(gauge)
The number of alertmanagers discovered and active
Shown as request
gitlab.prometheus_notifications_dropped_total
(count)
Total number of alerts dropped due to errors when sending to Alertmanager
Shown as request
gitlab.prometheus_notifications_queue_capacity
(gauge)
The capacity of the alert notifications queue
Shown as request
gitlab.prometheus_notifications_queue_length
(gauge)
The number of alert notifications in the queue
Shown as request
gitlab.prometheus_rule_evaluation_failures_total
(gauge)
The total number of rule evaluation failures
Shown as request
gitlab.prometheus_sd_azure_refresh_duration_seconds
(gauge)
The duration of an Azure-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_azure_refresh_failures_total
(count)
Number of Azure-SD refresh failures
Shown as request
gitlab.prometheus_sd_consul_rpc_duration_seconds
(gauge)
The duration of a Consul RPC call in seconds
Shown as request
gitlab.prometheus_sd_consul_rpc_failures_total
(count)
The number of Consul RPC call failures
Shown as request
gitlab.prometheus_sd_dns_lookup_failures_total
(count)
The number of DNS-SD lookup failures
Shown as request
gitlab.prometheus_sd_dns_lookups_total
(count)
The number of DNS-SD lookups
Shown as request
gitlab.prometheus_sd_ec2_refresh_duration_seconds
(gauge)
The duration of an EC2-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_ec2_refresh_failures_total
(count)
The number of EC2-SD scrape failures
Shown as request
gitlab.prometheus_sd_file_read_errors_total
(count)
The number of File-SD read errors
Shown as request
gitlab.prometheus_sd_file_scan_duration_seconds
(gauge)
The duration of the File-SD scan in seconds
Shown as request
gitlab.prometheus_sd_gce_refresh_duration
(gauge)
The duration of a GCE-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_gce_refresh_failures_total
(count)
The number of GCE-SD refresh failures
Shown as request
gitlab.prometheus_sd_kubernetes_events_total
(count)
The number of Kubernetes events handled
Shown as request
gitlab.prometheus_sd_marathon_refresh_duration_seconds
(gauge)
The duration of a Marathon-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_marathon_refresh_failures_total
(count)
The number of Marathon-SD refresh failures
Shown as request
gitlab.prometheus_sd_openstack_refresh_duration_seconds
(gauge)
The duration of an OpenStack-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_openstack_refresh_failures_total
(count)
The number of OpenStack-SD scrape failures
Shown as request
gitlab.prometheus_sd_triton_refresh_duration_seconds
(gauge)
The duration of a Triton-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_triton_refresh_failures_total
(count)
The number of Triton-SD scrape failures
Shown as request
gitlab.prometheus_target_interval_length_seconds
(gauge)
Actual intervals between scrapes
Shown as request
gitlab.prometheus_target_scrape_pool_sync_total
(count)
Total number of syncs that were executed on a scrape pool
Shown as request
gitlab.prometheus_target_scrapes_exceeded_sample_limit_total
(gauge)
Total number of scrapes that hit the sample limit and were rejected
Shown as request
gitlab.prometheus_target_skipped_scrapes_total
(count)
Total number of scrapes that were skipped because the metric storage was throttled
Shown as request
gitlab.prometheus_target_sync_length_seconds
(gauge)
Actual interval to sync the scrape pool
Shown as request
gitlab.prometheus_treecache_watcher_goroutines
(gauge)
The current number of watcher goroutines
Shown as request
gitlab.prometheus_treecache_zookeeper_failures_total
(count)
The total number of ZooKeeper failures
Shown as request

Events

The Gitlab check does not include any events.

Service Checks

The Gitlab check includes health, readiness, and liveness service checks.

gitlab.prometheus_endpoint_up: Returns CRITICAL if the check cannot access the Prometheus metrics endpoint of the Gitlab instance.

gitlab.health: Returns CRITICAL if the check cannot access the Gitlab instance.

gitlab.liveness: Returns CRITICAL if the check cannot access the Gitlab instance due to deadlock with Rails Controllers.

gitlab.readiness: Returns CRITICAL if the Gitlab instance is unable to accept traffic via Rails Controllers.

Troubleshooting

Need help? Contact Datadog support.

Gitlab Runner Integration

Overview

This integration allows you to:

  • Visualize and monitor metrics collected via Gitlab Runners through Prometheus
  • Validate that the Gitlab Runner can connect to Gitlab

See the Gitlab Runner documentation for more information about Gitlab Runner and its integration with Prometheus.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Gitlab Runner check is included in the Datadog Agent package, so you don’t need to install anything else on your Gitlab servers.

Configuration

Edit the gitlab_runner.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to point to the Runner’s Prometheus metrics endpoint and to the Gitlab master. The Gitlab master URL is what enables the connectivity service check. See the sample gitlab_runner.d/conf.yaml for all available configuration options.

Note: The allowed_metrics item in the init_config section lets you specify which metrics are extracted.

Note: Some metrics should be reported as a rate (for example, ci_runner_errors).
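
As an illustration of the configuration described above, a minimal gitlab_runner.d/conf.yaml might look like the following sketch. The key names gitlab_url and prometheus_endpoint are assumed from the sample file, the angle-bracket values are placeholders to replace with your own, and the metric names under allowed_metrics are assumed to be written without the gitlab_runner. prefix. Check the sample gitlab_runner.d/conf.yaml before relying on any of these.

```yaml
init_config:
  # Metrics to extract from the Runner's Prometheus endpoint.
  # Assumption: names are listed without the "gitlab_runner." namespace.
  allowed_metrics:
    - ci_runner_builds
    - ci_runner_errors
    - gitlab_runner_jobs
    - gitlab_runner_errors_total

instances:
    # Assumed key: URL of the Gitlab master, used for the connectivity service check.
  - gitlab_url: http://<GITLAB_HOST>/ci
    # Assumed key: the Runner's local Prometheus metrics endpoint.
    prometheus_endpoint: http://<RUNNER_HOST>:<PORT>/metrics
```

Restart the Agent after editing the file so that the new configuration is picked up.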

Validation

Run the Agent’s status subcommand and look for gitlab_runner under the Checks section.

Data Collected

Metrics

gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_bucket
(gauge)
Histogram of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_sum
(gauge)
Sum of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_count
(gauge)
Count of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_states
(gauge)
The current number of machines per state in this provider. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_runner_builds
(gauge)
The current number of running builds. Applies to GitLab Runner < 1.11.0
gitlab_runner.ci_runner_errors
(count)
The number of caught errors. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_runner_version_info
(gauge)
A metric with a constant '1' value labeled by different build stats fields. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_bucket
(gauge)
Histogram of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_sum
(gauge)
Sum of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_count
(gauge)
Count of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_states
(gauge)
The current number of machines per state in this provider. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.gitlab_runner_autoscaling_machine_creation_duration_seconds
(gauge)
Histogram of Docker machine creation time. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_autoscaling_machine_states
(gauge)
The current number of machines per state in this provider. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_jobs
(gauge)
The current number of running builds. Applies to GitLab Runner 1.11.0+
gitlab_runner.gitlab_runner_errors_total
(count)
The number of caught errors. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_version_info
(gauge)
A metric with a constant '1' value labeled by different build stats fields. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.go_gc_duration_seconds
(gauge)
A summary of the GC invocation durations
Shown as request
gitlab_runner.go_gc_duration_seconds_sum
(gauge)
Sum of the GC invocation durations
Shown as request
gitlab_runner.go_gc_duration_seconds_count
(gauge)
Count of the GC invocation durations
Shown as request
gitlab_runner.go_goroutines
(gauge)
Number of goroutines that currently exist
Shown as request
gitlab_runner.go_memstats_alloc_bytes
(gauge)
Number of bytes allocated and still in use
Shown as byte
gitlab_runner.go_memstats_alloc_bytes_total
(count)
Total number of bytes allocated
Shown as byte
gitlab_runner.go_memstats_buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table
Shown as byte
gitlab_runner.go_memstats_frees_total
(count)
Total number of frees
Shown as request
gitlab_runner.go_memstats_gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata
Shown as byte
gitlab_runner.go_memstats_heap_alloc_bytes
(gauge)
Number of heap bytes allocated and still in use
Shown as byte
gitlab_runner.go_memstats_heap_idle_bytes
(gauge)
Number of heap bytes waiting to be used
Shown as byte
gitlab_runner.go_memstats_heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
gitlab_runner.go_memstats_heap_objects
(gauge)
Number of allocated objects
Shown as request
gitlab_runner.go_memstats_heap_released_bytes_total
(count)
Total number of heap bytes released to OS
Shown as byte
gitlab_runner.go_memstats_heap_sys_bytes
(gauge)
Number of heap bytes obtained from system
Shown as byte
gitlab_runner.go_memstats_last_gc_time_seconds
(gauge)
Number of seconds since 1970 of last garbage collection
Shown as request
gitlab_runner.go_memstats_lookups_total
(count)
Total number of pointer lookups
Shown as request
gitlab_runner.go_memstats_mallocs_total
(count)
Total number of mallocs
Shown as request
gitlab_runner.go_memstats_mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures
Shown as byte
gitlab_runner.go_memstats_mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system
Shown as byte
gitlab_runner.go_memstats_mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures
Shown as byte
gitlab_runner.go_memstats_mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system
Shown as byte
gitlab_runner.go_memstats_next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
gitlab_runner.go_memstats_other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
gitlab_runner.go_memstats_stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
gitlab_runner.go_memstats_stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
gitlab_runner.go_memstats_sys_bytes
(gauge)
Number of bytes obtained from the system. Sum of all system allocations
Shown as byte
gitlab_runner.process_cpu_seconds_total
(count)
Total user and system CPU time spent in seconds
Shown as request
gitlab_runner.process_max_fds
(gauge)
Maximum number of open file descriptors
Shown as request
gitlab_runner.process_open_fds
(gauge)
Number of open file descriptors
Shown as request
gitlab_runner.process_resident_memory_bytes
(gauge)
Resident memory size in bytes
Shown as byte
gitlab_runner.process_start_time_seconds
(gauge)
Start time of the process since the Unix epoch in seconds
Shown as request
gitlab_runner.process_virtual_memory_bytes
(gauge)
Virtual memory size in bytes
Shown as byte

Events

The Gitlab Runner check does not include any events.

Service Checks

The Gitlab Runner check provides a service check to ensure that the Runner can talk to the Gitlab master and another one to ensure that the local Prometheus endpoint is available.

Troubleshooting

Need help? Contact Datadog support.