Gitlab

Agent Check

Supported OS: Linux, Mac OS, Windows

Overview

This integration allows you to:

  • Visualize and monitor metrics collected via Gitlab through Prometheus

See the Gitlab documentation for more information about Gitlab and its integration with Prometheus.

Setup

Installation

The Gitlab check is included in the Datadog Agent package, so you don’t need to install anything else on your Gitlab servers.

Configuration

Host

Follow the instructions below to configure this check for an Agent running on a host. For containerized environments, see the Containerized section.

Metric collection
  1. Edit the gitlab.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to point to Gitlab’s metrics endpoint. See the sample gitlab.d/conf.yaml for all available configuration options.

  2. In the Gitlab settings page, ensure that the option Enable Prometheus Metrics is enabled. You will need to have administrator access. For more information on how to enable metric collection, see the Gitlab documentation.

  3. Allow access to monitoring endpoints by updating your /etc/gitlab/gitlab.rb to include the following line:

    gitlab_rails['monitoring_whitelist'] = ['127.0.0.0/8', '192.168.0.1']

    Note: Save and restart Gitlab to see the changes.

  4. Restart the Agent.

Note: The metrics in gitlab/metrics.py are collected by default. The allowed_metrics configuration option in the init_config collects specific legacy metrics. Some metrics may not be collected depending on your Gitlab instance version and configuration. See Gitlab’s documentation for further information about its metric collection.
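
As a rough illustration of step 1, a minimal gitlab.d/conf.yaml might look like the sketch below. The gitlab_url and prometheus_endpoint keys are the ones shown in the Containerized section; the localhost host and port 10055 are assumptions and should be replaced with the values for your own Gitlab instance.

```yaml
# Minimal sketch of gitlab.d/conf.yaml (host and port are assumed values).
init_config:

instances:
    # Base URL of the Gitlab instance to monitor.
  - gitlab_url: http://localhost/
    # Prometheus metrics endpoint exposed by Gitlab.
    prometheus_endpoint: http://localhost:10055/-/metrics
```

The sample gitlab.d/conf.yaml shipped with the Agent remains the authoritative list of options.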

Log collection
  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
  2. Next, edit gitlab.d/conf.yaml by uncommenting the logs lines at the bottom. Update the logs path with the correct path to your Gitlab log files.

     logs:
       - type: file
         path: /var/log/gitlab/gitlab-rails/production_json.log
         service: '<SERVICE_NAME>'
         source: gitlab
       - type: file
         path: /var/log/gitlab/gitlab-rails/production.log
         service: '<SERVICE_NAME>'
         source: gitlab
       - type: file
         path: /var/log/gitlab/gitlab-rails/api_json.log
         service: '<SERVICE_NAME>'
         source: gitlab
  3. Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Metric collection
Parameter            Value
<INTEGRATION_NAME>   gitlab
<INIT_CONFIG>        blank or {}
<INSTANCE_CONFIG>    {"gitlab_url":"http://%%host%%/", "prometheus_endpoint":"http://%%host%%:10055/-/metrics"}
Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see the Kubernetes log collection documentation.

Parameter      Value
<LOG_CONFIG>   {"source": "gitlab", "service": "gitlab"}
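
One way these parameters could be applied is through Kubernetes pod annotations, sketched below. The pod name, the container name gitlab (which must match the identifier in the annotation keys), and the gitlab/gitlab-ce image are assumptions for illustration; see the Autodiscovery Integration Templates documentation for the method that matches your environment.

```yaml
# Sketch of a pod manifest carrying the Autodiscovery parameters above.
# Pod name, container name, and image are assumed values.
apiVersion: v1
kind: Pod
metadata:
  name: gitlab
  annotations:
    # <INTEGRATION_NAME>
    ad.datadoghq.com/gitlab.check_names: '["gitlab"]'
    # <INIT_CONFIG>
    ad.datadoghq.com/gitlab.init_configs: '[{}]'
    # <INSTANCE_CONFIG>
    ad.datadoghq.com/gitlab.instances: '[{"gitlab_url":"http://%%host%%/", "prometheus_endpoint":"http://%%host%%:10055/-/metrics"}]'
    # <LOG_CONFIG>
    ad.datadoghq.com/gitlab.logs: '[{"source": "gitlab", "service": "gitlab"}]'
spec:
  containers:
    - name: gitlab
      image: gitlab/gitlab-ce
```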

Validation

Run the Agent’s status subcommand and look for gitlab under the Checks section.

Data Collected

Metrics

gitlab.banzai.cached_render_real_duration_seconds.count
(count)
The count of duration of rendering Markdown into HTML when cached output exists
Shown as second
gitlab.banzai.cached_render_real_duration_seconds.sum
(gauge)
The sum of duration of rendering Markdown into HTML when cached output exists
Shown as second
gitlab.banzai.cacheless_render_real_duration_seconds.count
(count)
The count of duration of rendering Markdown into HTML when cached output does not exist
Shown as second
gitlab.banzai.cacheless_render_real_duration_seconds.sum
(gauge)
The sum of duration of rendering Markdown into HTML when cached output does not exist
Shown as second
gitlab.cache.misses_total
(count)
The cache read miss count
Shown as second
gitlab.cache.operation_duration_seconds.count
(count)
The count of cache access time
Shown as second
gitlab.cache.operation_duration_seconds.sum
(gauge)
The sum of cache access time
Shown as second
gitlab.cache_operations_total
(count)
The count of cache operations by controller/action
gitlab.job.waiter_started_total
(count)
The number of batches of jobs started where a web request is waiting for the jobs to complete
Shown as job
gitlab.job.waiter_timeouts_total
(count)
The number of batches of jobs that timed out where a web request is waiting for the jobs to complete
Shown as job
gitlab.database.transaction_seconds.count
(count)
The count of time spent in database transactions in seconds
Shown as second
gitlab.database.transaction_seconds.sum
(gauge)
The sum of time spent in database transactions in seconds
Shown as second
gitlab.method_call_duration_seconds.count
(count)
The count of method calls real duration
Shown as second
gitlab.method_call_duration_seconds.sum
(gauge)
The sum of method calls real duration
Shown as second
gitlab.page_out_of_bounds
(count)
The counter for the PageLimiter pagination limit being hit
gitlab.rails_queue_duration_seconds.count
(count)
The counter for latency between GitLab Workhorse forwarding a request to Rails
Shown as second
gitlab.rails_queue_duration_seconds.sum
(gauge)
The sum of latency between GitLab Workhorse forwarding a request to Rails
Shown as second
gitlab.sql_duration_seconds.count
(count)
The total SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT
Shown as second
gitlab.sql_duration_seconds.sum
(gauge)
The sum of SQL execution time, excluding SCHEMA operations and BEGIN / COMMIT
Shown as second
gitlab.transaction.allocated_memory_bytes.count
(count)
The count of allocated memory for all transactions (gitlab_transaction_* metrics)
Shown as byte
gitlab.transaction.allocated_memory_bytes.sum
(gauge)
The sum of allocated memory for all transactions (gitlab_transaction_* metrics)
Shown as byte
gitlab.transaction.cache_count_total
(count)
The counter for total Rails cache calls (aggregate)
gitlab.transaction.cache_duration_total
(count)
The counter for total time (seconds) spent in Rails cache calls (aggregate)
Shown as second
gitlab.transaction.cache_read_hit_count_total
(count)
The counter for cache hits for Rails cache calls
Shown as hit
gitlab.transaction.cache_read_miss_count_total
(count)
The counter for cache misses for Rails cache calls
Shown as miss
gitlab.transaction.duration_seconds.count
(count)
The count of duration for all transactions (gitlab_transaction_* metrics)
Shown as second
gitlab.transaction.duration_seconds.sum
(gauge)
The sum of duration for all transactions (gitlab_transaction_* metrics)
Shown as second
gitlab.transaction.event_build_found_total
(count)
The counter for build found for API /jobs/request
gitlab.transaction.event_build_invalid_total
(count)
The counter for build invalid due to concurrency conflict for API /jobs/request
gitlab.transaction.event_build_not_found_cached_total
(count)
The counter for cached response of build not found for API /jobs/request
gitlab.transaction.event_build_not_found_total
(count)
The counter for build not found for API /jobs/request
gitlab.transaction.event_change_default_branch_total
(count)
The counter when default branch is changed for any repository
gitlab.transaction.event_create_repository_total
(count)
The counter when any repository is created
gitlab.transaction.event_etag_caching_cache_hit_total
(count)
The counter for etag cache hit.
Shown as hit
gitlab.transaction.event_etag_caching_header_missing_total
(count)
The counter for etag cache miss - header missing
Shown as miss
gitlab.transaction.event_etag_caching_key_not_found_total
(count)
The counter for etag cache miss - key not found
Shown as miss
gitlab.transaction.event_etag_caching_middleware_used_total
(count)
The counter for etag middleware accessed
gitlab.transaction.event_etag_caching_resource_changed_total
(count)
The counter for etag cache miss - resource changed
gitlab.transaction.event_fork_repository_total
(count)
The counter for repository forks (RepositoryForkWorker). Only incremented when source repository exists
gitlab.transaction.event_import_repository_total
(count)
The counter for repository imports (RepositoryImportWorker)
gitlab.transaction.event_push_branch_total
(count)
The counter for all branch pushes
gitlab.transaction.event_push_commit_total
(count)
The counter for commits
gitlab.transaction.event_push_tag_total
(count)
The counter for tag pushes
gitlab.transaction.event_rails_exception_total
(count)
The counter for number of rails exceptions
gitlab.transaction.event_receive_email_total
(count)
The counter for received emails
Shown as email
gitlab.transaction.event_remote_mirrors_failed_total
(count)
The counter for failed remote mirrors
gitlab.transaction.event_remote_mirrors_finished_total
(count)
The counter for finished remote mirrors
gitlab.transaction.event_remote_mirrors_running_total
(count)
The counter for running remote mirrors
gitlab.transaction.event_remove_branch_total
(count)
The counter when a branch is removed for any repository
gitlab.transaction.event_remove_repository_total
(count)
The counter when a repository is removed
gitlab.transaction.event_remove_tag_total
(count)
The counter when a tag is removed for any repository
gitlab.transaction.event_sidekiq_exception_total
(count)
The counter of Sidekiq exceptions
gitlab.transaction.event_stuck_import_jobs_total
(count)
The count of stuck import jobs
gitlab.transaction.event_update_build_total
(count)
The counter for update build for API /jobs/request/:id
gitlab.transaction.new_redis_connections_total
(count)
The counter for new Redis connections
Shown as connection
gitlab.transaction.queue_duration_total
(count)
The duration jobs were enqueued before processing
gitlab.transaction.rails_queue_duration_total
(gauge)
The latency between GitLab Workhorse forwarding a request to Rails
gitlab.transaction.view_duration_total
(count)
The duration for views
gitlab.view_rendering_duration_seconds.count
(count)
The count of duration for views (histogram)
Shown as second
gitlab.view_rendering_duration_seconds.sum
(count)
The sum of duration for views (histogram)
Shown as second
gitlab.rack.http_requests_total.count
(count)
The rack request count
Shown as request
gitlab.rack.http_requests_total.sum
(gauge)
The sum of rack requests
Shown as request
gitlab.rack.http_request_duration_seconds.sum
(gauge)
The sum of HTTP response time from rack middleware
Shown as second
gitlab.rack.http_request_duration_seconds.count
(count)
The count of HTTP response time from rack middleware
Shown as second
gitlab.rack.uncaught_errors_total
(count)
The count of rack connections handling uncaught errors
Shown as connection
gitlab.pipelines_created_total
(count)
The counter of pipelines created
gitlab.user_session_logins_total
(count)
The counter of how many users have logged in
gitlab.upload_file_does_not_exist
(count)
The number of times an upload record could not find its file
gitlab.failed_login_captcha_total
(gauge)
The counter of failed CAPTCHA attempts during login
gitlab.successful_login_captcha_total
(gauge)
The counter of successful CAPTCHA attempts during login
gitlab.auto_devops_pipelines_completed_total
(count)
The counter of completed Auto DevOps pipelines, labeled by status
gitlab.sidekiq.jobs_cpu_seconds.count
(count)
The count of seconds of cpu time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_cpu_seconds.sum
(gauge)
The sum of seconds of cpu time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_completion_seconds.count
(count)
The count of seconds to complete Sidekiq job
Shown as second
gitlab.sidekiq.jobs_db_second.count
(count)
The count of seconds of DB time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_db_second.sum
(gauge)
The sum of seconds of DB time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_gitaly_seconds.count
(count)
The count of seconds of Gitaly time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_gitaly_seconds.sum
(gauge)
The sum of seconds of Gitaly time to run Sidekiq job
Shown as second
gitlab.sidekiq.jobs_completion_seconds.sum
(gauge)
The sum of seconds to complete Sidekiq job
Shown as second
gitlab.sidekiq.jobs_queue_duration_seconds.count
(count)
The count of duration in seconds that a Sidekiq job was queued before being executed
Shown as second
gitlab.sidekiq.jobs_queue_duration_seconds.sum
(gauge)
The sum of duration in seconds that a Sidekiq job was queued before being executed
Shown as second
gitlab.sidekiq.jobs_failed_total
(count)
The number of failed sidekiq jobs
Shown as job
gitlab.sidekiq.jobs_retried_total
(count)
The number of retried sidekiq jobs
Shown as job
gitlab.sidekiq.running_jobs
(gauge)
The number of running sidekiq jobs
Shown as job
gitlab.sidekiq.concurrency
(gauge)
The maximum number of Sidekiq jobs
Shown as job
gitlab.ruby.gc_duration_seconds.count
(count)
The count of time spent by Ruby in GC
Shown as second
gitlab.ruby.gc_duration_seconds.sum
(gauge)
The sum of time spent by Ruby in GC
Shown as second
gitlab.ruby.file_descriptors
(gauge)
The number of file descriptors per process
gitlab.ruby.memory_bytes
(gauge)
The memory usage
Shown as byte
gitlab.ruby.sampler_duration_seconds_total
(count)
The time spent collecting stats
Shown as second
gitlab.ruby.process_cpu_seconds_total
(gauge)
The total amount of CPU time per process
Shown as second
gitlab.ruby.process_max_fds
(gauge)
The maximum number of open file descriptors per process
gitlab.ruby.process_resident_memory_bytes
(gauge)
The memory usage by process
Shown as byte
gitlab.ruby.process_start_time_seconds
(gauge)
The UNIX timestamp of process start time
Shown as second
gitlab.ruby.gc_stat.count
(gauge)
The number of Ruby garbage collections
gitlab.ruby.gc_stat.heap_allocated_pages
(gauge)
The number of currently allocated heap pages
Shown as page
gitlab.ruby.gc_stat.heap_sorted_length
(gauge)
The length of the heap in memory
gitlab.ruby.gc_stat.heap_allocatable_pages
(gauge)
The number of malloced pages that can be used
Shown as page
gitlab.ruby.gc_stat.heap_available_slots
(gauge)
The number of slots in heap pages
gitlab.ruby.gc_stat.heap_live_slots
(gauge)
The number of live slots in heap
gitlab.ruby.gc_stat.heap_free_slots
(gauge)
The number of empty slots in heap
gitlab.ruby.gc_stat.heap_final_slots
(gauge)
The number of slots in heap with finalizers
gitlab.ruby.gc_stat.heap_marked_slots
(gauge)
The number of slots that are marked, or old
Shown as page
gitlab.ruby.gc_stat.heap_eden_pages
(gauge)
The number of heap pages that contain a live object
Shown as page
gitlab.ruby.gc_stat.heap_tomb_pages
(gauge)
The number of heap pages that do not contain a live object
Shown as page
gitlab.ruby.gc_stat.total_allocated_pages
(gauge)
The number of pages allocated
Shown as page
gitlab.ruby.gc_stat.total_freed_pages
(gauge)
The number of pages freed
Shown as page
gitlab.ruby.gc_stat.total_allocated_objects
(gauge)
The total number of allocated objects
gitlab.ruby.gc_stat.total_freed_objects
(gauge)
The number of freed objects
gitlab.ruby.gc_stat.malloc_increase_bytes
(gauge)
The number of bytes allocated outside of the heap
Shown as byte
gitlab.ruby.gc_stat.malloc_increase_bytes_limit
(gauge)
The limit to how many bytes can be allocated outside of the heap
Shown as byte
gitlab.ruby.gc_stat.minor_gc_count
(gauge)
The number of minor garbage collections
Shown as garbage collection
gitlab.ruby.gc_stat.major_gc_count
(gauge)
The number of major garbage collections
Shown as garbage collection
gitlab.ruby.gc_stat.remembered_wb_unprotected_objects
(gauge)
The number of old objects that reference new objects
gitlab.ruby.gc_stat.remembered_wb_unprotected_objects_limit
(gauge)
The limit of wb unprotected objects
gitlab.ruby.gc_stat.old_objects
(gauge)
The number of old objects
gitlab.ruby.gc_stat.old_objects_limit
(gauge)
The limit of number of old objects
gitlab.ruby.gc_stat.oldmalloc_increase_bytes
(gauge)
The number of bytes allocated outside of the heap for old objects
Shown as byte
gitlab.ruby.gc_stat.oldmalloc_increase_bytes_limit
(gauge)
The limit of how many bytes can be allocated outside of the heap for old objects
Shown as byte
gitlab.geo.db_replication_lag_seconds
(gauge)
The database replication lag (seconds)
Shown as second
gitlab.geo.repositories
(gauge)
The total number of repositories available on primary
gitlab.geo.repositories_synced
(gauge)
The number of repositories synced on secondary
gitlab.geo.repositories_failed
(gauge)
The number of repositories that failed to sync on secondary
gitlab.geo.lfs_objects
(gauge)
The total number of LFS objects available on primary
gitlab.geo.lfs_objects_synced
(gauge)
The number of LFS objects synced on secondary
gitlab.geo.lfs_objects_failed
(gauge)
The number of LFS objects that failed to sync on secondary
gitlab.geo.attachments
(gauge)
The total number of file attachments available on primary
gitlab.geo.attachments_synced
(gauge)
The number of attachments synced on secondary
gitlab.geo.attachments_failed
(gauge)
The number of attachments that failed to sync on secondary
gitlab.geo.last_event_id
(gauge)
The database ID of the latest event log entry on the primary
gitlab.geo.last_event_timestamp
(gauge)
The UNIX timestamp of the latest event log entry on the primary
gitlab.geo.cursor_last_event_id
(gauge)
The last database ID of the event log processed by the secondary
gitlab.geo.cursor_last_event_timestamp
(gauge)
The last UNIX timestamp of the event log processed by the secondary
gitlab.geo.status_failed_total
(count)
The number of times retrieving the status from the Geo Node failed
gitlab.geo.last_successful_status_check_timestamp
(gauge)
The last timestamp when the status was successfully updated
gitlab.geo.lfs_objects_synced_missing_on_primary
(gauge)
The number of LFS objects marked as synced due to the file missing on the primary
gitlab.geo.job_artifacts_synced_missing_on_primary
(gauge)
The number of job artifacts marked as synced due to the file missing on the primary
gitlab.geo.attachments_synced_missing_on_primary
(gauge)
The number of attachments marked as synced due to the file missing on the primary
gitlab.geo.repositories_checksummed_count
(gauge)
The number of repositories checksummed on primary
gitlab.geo.repositories_checksum_failed_count
(gauge)
The number of repositories that failed to calculate the checksum on primary
gitlab.geo.wikis_checksummed_count
(gauge)
The number of wikis checksummed on primary
gitlab.geo.wikis_checksum_failed_count
(gauge)
The number of wikis that failed to calculate the checksum on primary
gitlab.geo.repositories_verified_count
(gauge)
The number of repositories verified on secondary
gitlab.geo.repositories_verification_failed_count
(gauge)
The number of repositories that failed to verify on secondary
gitlab.geo.repositories_checksum_mismatch_count
(gauge)
The number of repositories with a checksum mismatch on secondary
gitlab.geo.wikis_verified_count
(gauge)
The number of wikis verified on secondary
gitlab.geo.wikis_verification_failed_count
(gauge)
The number of wikis that failed to verify on secondary
gitlab.geo.wikis_checksum_mismatch_count
(gauge)
The number of wikis with a checksum mismatch on secondary
gitlab.geo.repositories_checked_count
(gauge)
The number of repositories that have been checked via git fsck
gitlab.geo.repositories_checked_failed_count
(gauge)
The number of repositories that have a failure from git fsck
gitlab.geo.repositories_retrying_verification_count
(gauge)
The number of repository verification failures that Geo is actively trying to correct on secondary
gitlab.geo.wikis_retrying_verification_count
(gauge)
The number of wiki verification failures that Geo is actively trying to correct on secondary
gitlab.db_load_balancing_hosts
(gauge)
The current number of load balancing hosts
Shown as host
gitlab.unicorn.active_connections
(gauge)
The number of active Unicorn connections (workers)
Shown as connection
gitlab.unicorn.queued_connections
(gauge)
The number of queued Unicorn connections
Shown as connection
gitlab.unicorn.workers
(gauge)
The number of Unicorn workers
Shown as worker
gitlab.puma.workers
(gauge)
Total number of puma workers
Shown as worker
gitlab.puma.running_workers
(gauge)
The number of booted puma workers
Shown as worker
gitlab.puma.stale_workers
(gauge)
The number of old puma workers
Shown as worker
gitlab.puma.running
(gauge)
The number of running puma threads
Shown as thread
gitlab.puma.queued_connections
(gauge)
The number of connections in that puma worker's "todo" set waiting for a worker thread
Shown as connection
gitlab.puma.active_connections
(gauge)
The number of puma threads processing a request
Shown as thread
gitlab.puma.pool_capacity
(gauge)
The number of requests the puma worker is capable of taking right now
Shown as request
gitlab.puma.max_threads
(gauge)
The maximum number of puma worker threads
Shown as thread
gitlab.puma.idle_threads
(gauge)
The number of spawned puma threads which are not processing a request
Shown as thread
gitlab.puma.killer_terminations_total
(gauge)
The number of workers terminated by PumaWorkerKiller
Shown as worker
gitlab.go_gc_duration_seconds
(gauge)
A summary of the GC invocation durations
Shown as request
gitlab.go_gc_duration_seconds_sum
(gauge)
The sum of the GC invocation durations
Shown as request
gitlab.go_gc_duration_seconds_count
(gauge)
The count of the GC invocation durations
Shown as request
gitlab.go_goroutines
(gauge)
The number of goroutines that currently exist
Shown as request
gitlab.go_memstats_alloc_bytes
(gauge)
The number of bytes allocated and still in use
Shown as byte
gitlab.go_memstats_alloc_bytes_total
(count)
The total number of bytes allocated
Shown as byte
gitlab.go_memstats_buck_hash_sys_bytes
(gauge)
The number of bytes used by the profiling bucket hash table
Shown as byte
gitlab.go_memstats_frees_total
(count)
The total number of frees
Shown as request
gitlab.go_memstats_gc_cpu_fraction
(gauge)
The fraction of this program's available CPU time used by the GC since the program started
Shown as request
gitlab.go_memstats_gc_sys_bytes
(gauge)
The number of bytes used for garbage collection system metadata
Shown as byte
gitlab.go_memstats_heap_alloc_bytes
(gauge)
The number of heap bytes allocated and still in use
Shown as byte
gitlab.go_memstats_heap_idle_bytes
(gauge)
The number of heap bytes waiting to be used
Shown as byte
gitlab.go_memstats_heap_inuse_bytes
(gauge)
The number of heap bytes that are in use
Shown as byte
gitlab.go_memstats_heap_objects
(gauge)
The number of allocated objects
Shown as request
gitlab.go_memstats_heap_released_bytes_total
(count)
The total number of heap bytes released to OS
Shown as byte
gitlab.go_memstats_heap_sys_bytes
(gauge)
The number of heap bytes obtained from system
Shown as byte
gitlab.go_memstats_last_gc_time_seconds
(gauge)
The number of seconds since 1970 of last garbage collection
Shown as request
gitlab.go_memstats_lookups_total
(count)
The total number of pointer lookups
Shown as request
gitlab.go_memstats_mallocs_total
(count)
The total number of mallocs
Shown as request
gitlab.go_memstats_mcache_inuse_bytes
(gauge)
The number of bytes in use by mcache structures
Shown as byte
gitlab.go_memstats_mcache_sys_bytes
(gauge)
The number of bytes used for mcache structures obtained from system
Shown as byte
gitlab.go_memstats_mspan_inuse_bytes
(gauge)
The number of bytes in use by mspan structures
Shown as byte
gitlab.go_memstats_mspan_sys_bytes
(gauge)
The number of bytes used for mspan structures obtained from system
Shown as byte
gitlab.go_memstats_next_gc_bytes
(gauge)
The number of heap bytes when next garbage collection will take place
Shown as byte
gitlab.go_memstats_other_sys_bytes
(gauge)
The number of bytes used for other system allocations
Shown as byte
gitlab.go_memstats_stack_inuse_bytes
(gauge)
The number of bytes in use by the stack allocator
Shown as byte
gitlab.go_memstats_stack_sys_bytes
(gauge)
The number of bytes obtained from system for stack allocator
Shown as byte
gitlab.go_memstats_sys_bytes
(gauge)
The number of bytes obtained by system. Sum of all system allocations
Shown as byte
gitlab.go_threads
(gauge)
The number of OS threads created
Shown as request
gitlab.http_request_duration_microseconds
(gauge)
The HTTP request latencies in microseconds
Shown as request
gitlab.http_request_size_bytes
(gauge)
The HTTP request sizes in bytes
Shown as byte
gitlab.http_requests_total
(count)
The total number of HTTP requests made
Shown as request
gitlab.http_response_size_bytes
(gauge)
The HTTP response sizes in bytes
Shown as byte
gitlab.process_cpu_seconds_total
(count)
The total user and system CPU time spent in seconds
Shown as request
gitlab.process_max_fds
(gauge)
The maximum number of open file descriptors
Shown as request
gitlab.process_open_fds
(gauge)
The number of open file descriptors
Shown as request
gitlab.process_resident_memory_bytes
(gauge)
The resident memory size in bytes
Shown as byte
gitlab.process_start_time_seconds
(gauge)
The start time of the process since unix epoch in seconds
Shown as request
gitlab.process_virtual_memory_bytes
(gauge)
The virtual memory size in bytes
Shown as byte
gitlab.prometheus_build_info
(gauge)
A metric with a constant '1' value labeled by version, revision, branch, and goversion from which prometheus was built
Shown as request
gitlab.prometheus_config_last_reload_success_timestamp_seconds
(gauge)
The timestamp of the last successful configuration reload
Shown as request
gitlab.prometheus_config_last_reload_successful
(gauge)
Whether the last configuration reload attempt was successful
Shown as request
gitlab.prometheus_engine_queries
(gauge)
The current number of queries being executed or waiting
Shown as request
gitlab.prometheus_engine_queries_concurrent_max
(gauge)
The max number of concurrent queries
Shown as request
gitlab.prometheus_engine_query_duration_seconds
(gauge)
The query timing
Shown as request
gitlab.prometheus_evaluator_duration_seconds
(gauge)
The duration of rule group evaluations
Shown as request
gitlab.prometheus_evaluator_iterations_missed_total
(count)
The total number of rule group evaluations missed due to slow rule group evaluation
Shown as request
gitlab.prometheus_evaluator_iterations_skipped_total
(count)
The total number of rule group evaluations skipped due to throttled metric storage
Shown as request
gitlab.prometheus_evaluator_iterations_total
(count)
The total number of scheduled rule group evaluations, whether executed, missed, or skipped
Shown as request
gitlab.prometheus_local_storage_checkpoint_duration_seconds
(gauge)
The duration in seconds taken for checkpointing open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpoint_last_duration_seconds
(gauge)
The duration in seconds it took to last checkpoint open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpoint_last_size_bytes
(gauge)
The size of the last checkpoint of open chunks and chunks yet to be persisted
Shown as byte
gitlab.prometheus_local_storage_checkpoint_series_chunks_written
(gauge)
The number of chunk written per series while checkpointing open chunks and chunks yet to be persisted
Shown as request
gitlab.prometheus_local_storage_checkpointing
(gauge)
1 if the storage is checkpointing and 0 otherwise
Shown as request
gitlab.prometheus_local_storage_chunk_ops_total
(count)
The total number of chunk operations by their type
Shown as request
gitlab.prometheus_local_storage_chunks_to_persist
(count)
The current number of chunks waiting for persistence
Shown as request
gitlab.prometheus_local_storage_fingerprint_mappings_total
(count)
The total number of fingerprints being mapped to avoid collisions
Shown as request
gitlab.prometheus_local_storage_inconsistencies_total
(count)
A counter incremented each time an inconsistency in the local storage is detected. If this is greater than zero, restart the server as soon as possible
Shown as request
gitlab.prometheus_local_storage_indexing_batch_duration_seconds
(gauge)
The quantiles for batch indexing duration in seconds
Shown as request
gitlab.prometheus_local_storage_indexing_batch_sizes
(gauge)
The quantiles for indexing batch sizes (number of metrics per batch)
Shown as request
gitlab.prometheus_local_storage_indexing_queue_capacity
(gauge)
The capacity of the indexing queue
Shown as request
gitlab.prometheus_local_storage_indexing_queue_length
(gauge)
The number of metrics waiting to be indexed
Shown as request
gitlab.prometheus_local_storage_ingested_samples_total
(count)
The total number of samples ingested
Shown as request
gitlab.prometheus_local_storage_maintain_series_duration_seconds
(gauge)
The duration in seconds it took to perform maintenance on a series
Shown as request
gitlab.prometheus_local_storage_memory_chunkdescs
(gauge)
The current number of chunk descriptors in memory
Shown as request
gitlab.prometheus_local_storage_memory_chunks
(gauge)
The current number of chunks in memory. The number does not include cloned chunks (i.e. chunks without a descriptor)
Shown as request
gitlab.prometheus_local_storage_memory_dirty_series
(gauge)
The current number of series that would require a disk seek during crash recovery
Shown as request
gitlab.prometheus_local_storage_memory_series
(gauge)
The current number of series in memory
Shown as request
gitlab.prometheus_local_storage_non_existent_series_matches_total
(count)
How often a non-existent series was referred to during label matching or chunk preloading. This is an indication of outdated label indexes
Shown as request
gitlab.prometheus_local_storage_open_head_chunks
(gauge)
The current number of open head chunks
Shown as request
gitlab.prometheus_local_storage_out_of_order_samples_total
(count)
The total number of samples that were discarded because their timestamps were at or before the last received sample for a series
Shown as request
gitlab.prometheus_local_storage_persist_errors_total
(count)
The total number of errors while writing to the persistence layer
Shown as request
gitlab.prometheus_local_storage_persistence_urgency_score
(gauge)
A score of urgency to persist chunks. 0 is least urgent and 1 most
Shown as request
gitlab.prometheus_local_storage_queued_chunks_to_persist_total
(count)
The total number of chunks queued for persistence
Shown as request
gitlab.prometheus_local_storage_rushed_mode
(gauge)
1 if the storage is in rushed mode and 0 otherwise
Shown as request
gitlab.prometheus_local_storage_series_chunks_persisted
(gauge)
The number of chunks persisted per series
Shown as request
gitlab.prometheus_local_storage_series_ops_total
(count)
The total number of series operations by their type
Shown as request
gitlab.prometheus_local_storage_started_dirty
(gauge)
Whether the local storage was found to be dirty (and crash recovery occurred) during Prometheus startup
Shown as request
gitlab.prometheus_local_storage_target_heap_size_bytes
(gauge)
The configured target heap size in bytes
Shown as byte
gitlab.prometheus_notifications_alertmanagers_discovered
(gauge)
The number of alertmanagers discovered and active
Shown as request
gitlab.prometheus_notifications_dropped_total
(count)
Total number of alerts dropped due to errors when sending to Alertmanager
Shown as request
gitlab.prometheus_notifications_queue_capacity
(gauge)
The capacity of the alert notifications queue
Shown as request
gitlab.prometheus_notifications_queue_length
(gauge)
The number of alert notifications in the queue
Shown as request
gitlab.prometheus_rule_evaluation_failures_total
(gauge)
The total number of rule evaluation failures
Shown as request
gitlab.prometheus_sd_azure_refresh_duration_seconds
(gauge)
The duration of an Azure-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_azure_refresh_failures_total
(count)
The number of Azure-SD refresh failures
Shown as request
gitlab.prometheus_sd_consul_rpc_duration_seconds
(gauge)
The duration of a Consul RPC call in seconds
Shown as request
gitlab.prometheus_sd_consul_rpc_failures_total
(count)
The number of Consul RPC call failures
Shown as request
gitlab.prometheus_sd_dns_lookup_failures_total
(count)
The number of DNS-SD lookup failures
Shown as request
gitlab.prometheus_sd_dns_lookups_total
(count)
The number of DNS-SD lookups
Shown as request
gitlab.prometheus_sd_ec2_refresh_duration_seconds
(gauge)
The duration of an EC2-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_ec2_refresh_failures_total
(count)
The number of EC2-SD scrape failures
Shown as request
gitlab.prometheus_sd_file_read_errors_total
(count)
The number of File-SD read errors
Shown as request
gitlab.prometheus_sd_file_scan_duration_seconds
(gauge)
The duration of the File-SD scan in seconds
Shown as request
gitlab.prometheus_sd_gce_refresh_duration
(gauge)
The duration of a GCE-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_gce_refresh_failures_total
(count)
The number of GCE-SD refresh failures
Shown as request
gitlab.prometheus_sd_kubernetes_events_total
(count)
The number of Kubernetes events handled
Shown as request
gitlab.prometheus_sd_marathon_refresh_duration_seconds
(gauge)
The duration of a Marathon-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_marathon_refresh_failures_total
(count)
The number of Marathon-SD refresh failures
Shown as request
gitlab.prometheus_sd_openstack_refresh_duration_seconds
(gauge)
The duration of an OpenStack-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_openstack_refresh_failures_total
(count)
The number of OpenStack-SD scrape failures
Shown as request
gitlab.prometheus_sd_triton_refresh_duration_seconds
(gauge)
The duration of a Triton-SD refresh in seconds
Shown as request
gitlab.prometheus_sd_triton_refresh_failures_total
(count)
The number of Triton-SD scrape failures
Shown as request
gitlab.prometheus_target_interval_length_seconds
(gauge)
The actual intervals between scrapes
Shown as request
gitlab.prometheus_target_scrape_pool_sync_total
(count)
The total number of syncs that were executed on a scrape pool
Shown as request
gitlab.prometheus_target_scrapes_exceeded_sample_limit_total
(gauge)
Total number of scrapes that hit the sample limit and were rejected
Shown as request
gitlab.prometheus_target_skipped_scrapes_total
(count)
The total number of scrapes that were skipped because the metric storage was throttled
Shown as request
gitlab.prometheus_target_sync_length_seconds
(gauge)
The actual interval to sync the scrape pool
Shown as request
gitlab.prometheus_treecache_watcher_goroutines
(gauge)
The current number of watcher goroutines
Shown as request
gitlab.prometheus_treecache_zookeeper_failures_total
(count)
The total number of ZooKeeper failures
Shown as request

Events

The Gitlab check does not include any events.

Service Checks

The Gitlab check includes health, readiness, and liveness service checks.

  • gitlab.prometheus_endpoint_up: Returns CRITICAL if the check cannot access the Prometheus metrics endpoint of the Gitlab instance.
  • gitlab.health: Returns CRITICAL if the check cannot access the Gitlab instance.
  • gitlab.liveness: Returns CRITICAL if the check cannot access the Gitlab instance due to deadlock with Rails Controllers.
  • gitlab.readiness: Returns CRITICAL if the Gitlab instance is unable to accept traffic via Rails Controllers.

Troubleshooting

Need help? Contact Datadog support.

Gitlab Runner Integration

Overview

This integration allows you to:

  • Visualize and monitor metrics collected via Gitlab Runners through Prometheus
  • Validate that the Gitlab Runner can connect to Gitlab

See the Gitlab Runner documentation for more information about Gitlab Runner and its integration with Prometheus.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Gitlab Runner check is included in the Datadog Agent package, so you don’t need to install anything else on your Gitlab servers.

Configuration

Edit the gitlab_runner.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to point to the Runner’s Prometheus metrics endpoint and to the Gitlab master, which the check needs in order to report a service check. See the sample gitlab_runner.d/conf.yaml for all available configuration options.

Note: The allowed_metrics item in the init_config section lets you specify the metrics to extract.

Remark: Some metrics should be reported as rates (for example, ci_runner_errors).
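To illustrate what reporting a counter such as ci_runner_errors as a rate means, here is a minimal Python sketch. It is not the Agent's implementation: the function name counter_rate and the reset handling are assumptions made for illustration only. It takes two samples of a monotonically increasing counter and returns the per-second increase between them.

```python
from typing import Optional


def counter_rate(prev_value: float, prev_ts: float,
                 cur_value: float, cur_ts: float) -> Optional[float]:
    """Return the per-second rate between two counter samples.

    Hypothetical helper for illustration; timestamps are in seconds.
    """
    elapsed = cur_ts - prev_ts
    if elapsed <= 0:
        # Samples are out of order or share a timestamp: no rate can be computed.
        return None
    delta = cur_value - prev_value
    if delta < 0:
        # The counter went backwards. Assumption: the process restarted and
        # the counter started again from zero, so count only the new value.
        delta = cur_value
    return delta / elapsed


# 15 new errors over 15 seconds is one error per second.
print(counter_rate(10, 100.0, 25, 115.0))
```

A raw counter only ever grows, so its absolute value says little about current behavior. The rate shows how quickly errors are accumulating during each collection interval.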

Validation

Run the Agent’s status subcommand and look for gitlab_runner under the Checks section.

Data Collected

Metrics

gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_bucket
(gauge)
A histogram of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_sum
(gauge)
The sum of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_creation_duration_seconds_count
(gauge)
The count of Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_docker_machines_provider_machine_states
(gauge)
The current number of CI machines per state in this provider. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_runner_builds
(gauge)
The current number of running builds. Applies to GitLab Runner < 1.11.0
gitlab_runner.ci_runner_errors
(count)
The number of caught errors. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_runner_version_info
(gauge)
A metric with a constant '1' value labeled by different build stats fields. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_bucket
(gauge)
A histogram of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_sum
(gauge)
The sum of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_creation_duration_seconds_count
(gauge)
The count of SSH Docker machine creation time. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.ci_ssh_docker_machines_provider_machine_states
(gauge)
The current number of SSH machines per state in this ssh provider. Applies to GitLab Runner < 1.11.0
Shown as request
gitlab_runner.gitlab_runner_autoscaling_machine_creation_duration_seconds
(gauge)
A histogram of Docker machine creation time. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_autoscaling_machine_states
(gauge)
The current number of machines per state in this provider. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_jobs
(gauge)
The current number of running builds. Applies to GitLab Runner 1.11.0+
gitlab_runner.gitlab_runner_errors_total
(count)
The number of caught errors. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.gitlab_runner_version_info
(gauge)
A metric with a constant '1' value labeled by different build stats fields. Applies to GitLab Runner 1.11.0+
Shown as request
gitlab_runner.go_gc_duration_seconds
(gauge)
A summary of the GC invocation durations
Shown as request
gitlab_runner.go_gc_duration_seconds_sum
(gauge)
The sum of the GC invocation durations
Shown as request
gitlab_runner.go_gc_duration_seconds_count
(gauge)
The count of the GC invocation durations
Shown as request
gitlab_runner.go_goroutines
(gauge)
The number of goroutines that currently exist
Shown as request
gitlab_runner.go_memstats_alloc_bytes
(gauge)
The number of bytes allocated and still in use
Shown as byte
gitlab_runner.go_memstats_alloc_bytes_total
(count)
The total number of bytes allocated
Shown as byte
gitlab_runner.go_memstats_buck_hash_sys_bytes
(gauge)
The number of bytes used by the profiling bucket hash table
Shown as byte
gitlab_runner.go_memstats_frees_total
(count)
The total number of frees
Shown as request
gitlab_runner.go_memstats_gc_sys_bytes
(gauge)
The number of bytes used for garbage collection system metadata
Shown as byte
gitlab_runner.go_memstats_heap_alloc_bytes
(gauge)
The number of heap bytes allocated and still in use
Shown as byte
gitlab_runner.go_memstats_heap_idle_bytes
(gauge)
The number of heap bytes waiting to be used
Shown as byte
gitlab_runner.go_memstats_heap_inuse_bytes
(gauge)
The number of heap bytes that are in use
Shown as byte
gitlab_runner.go_memstats_heap_objects
(gauge)
The number of allocated objects
Shown as request
gitlab_runner.go_memstats_heap_released_bytes_total
(count)
The total number of heap bytes released to OS
Shown as byte
gitlab_runner.go_memstats_heap_sys_bytes
(gauge)
The number of heap bytes obtained from system
Shown as byte
gitlab_runner.go_memstats_last_gc_time_seconds
(gauge)
The number of seconds since 1970 of last garbage collection
Shown as request
gitlab_runner.go_memstats_lookups_total
(count)
The total number of pointer lookups
Shown as request
gitlab_runner.go_memstats_mallocs_total
(count)
The total number of mallocs
Shown as request
gitlab_runner.go_memstats_mcache_inuse_bytes
(gauge)
The number of bytes in use by mcache structures
Shown as byte
gitlab_runner.go_memstats_mcache_sys_bytes
(gauge)
The number of bytes used for mcache structures obtained from system
Shown as byte
gitlab_runner.go_memstats_mspan_inuse_bytes
(gauge)
The number of bytes in use by mspan structures
Shown as byte
gitlab_runner.go_memstats_mspan_sys_bytes
(gauge)
The number of bytes used for mspan structures obtained from system
Shown as byte
gitlab_runner.go_memstats_next_gc_bytes
(gauge)
The number of heap bytes when next garbage collection will take place
Shown as byte
gitlab_runner.go_memstats_other_sys_bytes
(gauge)
The number of bytes used for other system allocations
Shown as byte
gitlab_runner.go_memstats_stack_inuse_bytes
(gauge)
The number of bytes in use by the stack allocator
Shown as byte
gitlab_runner.go_memstats_stack_sys_bytes
(gauge)
The number of bytes obtained from system for stack allocator
Shown as byte
gitlab_runner.go_memstats_sys_bytes
(gauge)
The number of bytes obtained from system. Sum of all system allocations
Shown as byte
gitlab_runner.process_cpu_seconds_total
(count)
The total user and system CPU time spent in seconds
Shown as request
gitlab_runner.process_max_fds
(gauge)
The maximum number of open file descriptors
Shown as request
gitlab_runner.process_open_fds
(gauge)
The number of open file descriptors
Shown as request
gitlab_runner.process_resident_memory_bytes
(gauge)
The resident memory size in bytes
Shown as byte
gitlab_runner.process_start_time_seconds
(gauge)
The start time of the process since unix epoch in seconds
Shown as request
gitlab_runner.process_virtual_memory_bytes
(gauge)
The virtual memory size in bytes
Shown as byte

Events

The Gitlab Runner check does not include any events.

Service Checks

The Gitlab Runner check provides a service check to ensure that the Runner can talk to the Gitlab master and another one to ensure that the local Prometheus endpoint is available.

Troubleshooting

Need help? Contact Datadog support.