Vespa


Integration version 1.1.0

Overview

Gather metrics from your Vespa system in real time to:

  • Visualize and monitor Vespa state and performance.
  • Receive alerts on health and availability.

Setup

The Vespa check is not included in the Datadog Agent package, so you need to install it yourself.

Installation

For Agent v7.21 or later / v6.21 or later, follow the instructions below to install the Vespa check on your host. To install with the Docker Agent or an earlier version of the Agent, see Use Community Integrations.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-vespa==<INTEGRATION_VERSION>
    
  2. Configure the integration in the same way as a core integration.

Configuration

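To configure the Vespa check:

  1. Create a vespa.d/ folder in the conf.d/ folder at the root of your Agent's configuration directory.
  2. Create a conf.yaml file in the vespa.d/ folder you just created.
  3. Refer to the sample vespa.d/conf.yaml file and copy its contents into your conf.yaml file.
  4. Edit the conf.yaml file to configure the consumer. This setting determines the set of metrics the check forwards.
    • consumer: The consumer to collect metrics for, either default or a custom consumer defined in your Vespa application's services.xml.
  5. Restart the Agent.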
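
The configuration steps above can be sketched as a short shell script. This is a minimal illustration, not the official procedure: AGENT_CONF_ROOT is a hypothetical variable introduced here, and the file contents assume the usual init_config/instances layout with consumer set to default. Compare against the sample vespa.d/conf.yaml before relying on it.

```shell
#!/bin/sh
set -eu

# Assumption: AGENT_CONF_ROOT is a hypothetical variable used only in this sketch.
# Point it at your Agent configuration directory (for example /etc/datadog-agent on Linux).
# When it is unset, the script writes to a scratch directory so it can be tried safely.
AGENT_CONF_ROOT="${AGENT_CONF_ROOT:-$(mktemp -d)}"
VESPA_CONF_DIR="$AGENT_CONF_ROOT/conf.d/vespa.d"

# Steps 1 and 2: create conf.d/vespa.d/, which will hold conf.yaml.
mkdir -p "$VESPA_CONF_DIR"

# Steps 3 and 4: write a minimal conf.yaml with the consumer set to "default".
# Assumption: the layout mirrors the sample file; replace "default" with a
# custom consumer name from services.xml if you use one.
cat > "$VESPA_CONF_DIR/conf.yaml" <<'EOF'
init_config:

instances:
  - consumer: default
EOF

echo "Wrote $VESPA_CONF_DIR/conf.yaml"

# Step 5 is platform-specific and is not run here. On Linux with systemd, for example:
#   sudo systemctl restart datadog-agent
```

Run it with AGENT_CONF_ROOT set to the real configuration directory (typically as a user allowed to write there), then restart the Agent as described in step 5.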

Validation

Run the Agent's status subcommand and look for vespa under the Checks section.
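
As a rough illustration of this check, the status output can be piped through a small filter. The helper below is hypothetical (it is not part of the Agent CLI) and only tests whether the word vespa appears anywhere in the text it receives; it does not parse the Checks section.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Agent CLI: reads status text on stdin
# and succeeds only if some line mentions "vespa" (case-insensitive).
status_mentions_vespa() {
    grep -qi 'vespa'
}

# Typical use on a host where the Agent is installed (may require sudo):
#   datadog-agent status | status_mentions_vespa && echo "vespa check found"
```

If the helper reports nothing, read the full status output by hand, since a configuration error may be listed under a different heading.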

Data Collected

Metrics

vespa.http.status.1xx.rate
(gauge)
Number of responses with a 1xx status
Shown as response
vespa.http.status.2xx.rate
(gauge)
Number of responses with a 2xx status
Shown as response
vespa.http.status.3xx.rate
(gauge)
Number of responses with a 3xx status
Shown as response
vespa.http.status.4xx.rate
(gauge)
Number of responses with a 4xx status
Shown as response
vespa.http.status.5xx.rate
(gauge)
Number of responses with a 5xx status
Shown as response
vespa.jdisc.gc.ms.average
(gauge)
Time spent in GC
Shown as millisecond
vespa.mem.heap.free.average
(gauge)
Free heap size
Shown as byte
vespa.queries.rate
(gauge)
Number of search queries
Shown as query
vespa.feed.operations.rate
(gauge)
Number of feed operations
Shown as operation
vespa.query_latency.average
(gauge)
Total query processing time
Shown as millisecond
vespa.query_latency.95percentile
(gauge)
95th percentile total query processing time
Shown as millisecond
vespa.query_latency.99percentile
(gauge)
99th percentile total query processing time
Shown as millisecond
vespa.hits_per_query.average
(gauge)
Hits in the returned result, per query
Shown as hit
vespa.totalhits_per_query.average
(gauge)
Estimated total number of hits per query
Shown as hit
vespa.degraded_queries.rate
(gauge)
Queries with degraded results due to timeout
Shown as query
vespa.failed_queries.rate
(gauge)
Failed queries
Shown as query
vespa.serverActiveThreads.average
(gauge)
Threads that are active processing requests
Shown as thread
vespa.content.proton.search_protocol.docsum.requested_documents.rate
(gauge)
Requested document summaries
Shown as document
vespa.content.proton.search_protocol.docsum.latency.average
(gauge)
Docsum request latency on content node
Shown as second
vespa.content.proton.search_protocol.query.latency.average
(gauge)
Query request latency on content node
Shown as second
vespa.content.proton.documentdb.documents.total.last
(gauge)
Total documents in this document db (ready + not-ready)
Shown as document
vespa.content.proton.documentdb.documents.ready.last
(gauge)
Ready documents in this document db
Shown as document
vespa.content.proton.documentdb.documents.active.last
(gauge)
Active/searchable documents in this document db
Shown as document
vespa.content.proton.documentdb.disk_usage.last
(gauge)
Total disk usage for this document db
Shown as byte
vespa.content.proton.documentdb.memory_usage.allocated_bytes.last
(gauge)
Total memory usage for this document db
Shown as byte
vespa.content.proton.resource_usage.disk.average
(gauge)
Relative amount of disk space used by this process
Shown as fraction
vespa.content.proton.resource_usage.memory.average
(gauge)
Relative amount of memory used by this process
Shown as fraction
vespa.content.proton.resource_usage.feeding_blocked.last
(gauge)
Whether feeding is blocked due to resource limitations (value is 0 or 1)
vespa.content.proton.documentdb.matching.docs_matched.rate
(gauge)
Number of documents matched
Shown as document
vespa.content.proton.documentdb.matching.docs_reranked.rate
(gauge)
Number of documents re-ranked (second phase)
Shown as document
vespa.content.proton.documentdb.matching.rank_profile.query_latency.average
(gauge)
Total latency when matching and ranking a query
Shown as second
vespa.content.proton.documentdb.matching.rank_profile.query_setup_time.average
(gauge)
Average time spent setting up and tearing down queries
Shown as second
vespa.content.proton.documentdb.matching.rank_profile.rerank_time.average
(gauge)
Time spent on 2nd phase ranking
Shown as second
vespa.content.proton.transactionlog.disk_usage.last
(gauge)
Disk usage of the transaction log
Shown as byte

Events

The Vespa integration does not include any events.

Service Checks

vespa.metrics_health

Returns CRITICAL if there is no response from the Vespa Node metrics API. Returns WARNING if there is a response from the Vespa Node metrics API but an error occurred while processing it. Otherwise, returns OK.

Statuses: ok, warning, critical

vespa.process_health

For each Vespa process, returns CRITICAL if the process seems to be down. Returns WARNING if the process status is unknown. Otherwise, returns OK.

Statuses: ok, warning, critical

Troubleshooting

Need help? Contact Datadog support.