cert-manager

Agent Check

Supported OS: Linux, Mac OS, Windows

Integration v2.0.0

Overview

This check collects metrics from cert-manager.

[Figure: Cert-Manager Overview Dashboard]

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

Agent versions >=7.26.0 or >=6.26.0

To use an integration from integrations-extras with the Docker Agent, Datadog recommends building the Agent with the integration installed. Use the following Dockerfile to build an updated version of the Agent that includes the cert_manager integration from integrations-extras:

FROM gcr.io/datadoghq/agent:latest
RUN agent integration install -r -t datadog-cert_manager==<INTEGRATION_VERSION>
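
As a rough sketch of how the Dockerfile above could be generated with a pinned version, the snippet below fills in `<INTEGRATION_VERSION>` from an argument. The helper name `render_dockerfile` and the image tag in the usage comment are hypothetical; 2.0.0 is only the integration version shown at the top of this page, so substitute the version you need.

```shell
#!/bin/sh
# Print a Dockerfile that pins the cert_manager integration to one version.
# render_dockerfile is a hypothetical helper, not part of any Datadog tooling.
render_dockerfile() {
    version="$1"
    cat <<EOF
FROM gcr.io/datadoghq/agent:latest
RUN agent integration install -r -t datadog-cert_manager==${version}
EOF
}

# Example usage (requires Docker; the image tag is a placeholder):
#   render_dockerfile 2.0.0 > Dockerfile
#   docker build -t datadog-agent-cert-manager:2.0.0 .
```

Pinning the version in one place keeps the image build reproducible when the base `agent:latest` tag moves.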

Agent versions <7.26.0 or <6.26.0

To install the cert_manager check on your host:

  1. Install the developer toolkit.

  2. Clone the integrations-extras repository:

    git clone https://github.com/DataDog/integrations-extras.git
    
  3. Update your ddev config with the integrations-extras/ path:

    ddev config set extras ./integrations-extras
    
  4. To build the cert_manager package, run:

    ddev -e release build cert_manager
    
  5. Download the Agent manifest to install the Datadog Agent as a DaemonSet.

  6. Create two PersistentVolumeClaims, one for the checks code, and one for the configuration.

  7. Add them as volumes to your Agent pod template and use them for your checks and configuration:

         env:
           - name: DD_CONFD_PATH
             value: "/confd"
           - name: DD_ADDITIONAL_CHECKSD
             value: "/checksd"
       [...]
         volumeMounts:
           - name: agent-code-storage
             mountPath: /checksd
           - name: agent-conf-storage
             mountPath: /confd
       [...]
       volumes:
         - name: agent-code-storage
           persistentVolumeClaim:
             claimName: agent-code-claim
         - name: agent-conf-storage
           persistentVolumeClaim:
             claimName: agent-conf-claim
    
  8. Deploy the Datadog Agent in your Kubernetes cluster:

    kubectl apply -f agent.yaml
    
  9. Copy the integration artifact .whl file to your Kubernetes nodes or upload it to a public URL.

  10. Run the following command to install the integration wheel with the Agent:

    kubectl exec $(kubectl get pods -l app=datadog-agent -o jsonpath='{.items[0].metadata.name}') -- agent integration install -w <PATH_OF_CERT_MANAGER_ARTIFACT>/<CERT_MANAGER_ARTIFACT_NAME>.whl
    
  11. Run the following commands to copy the checks and configuration to the corresponding PVCs:

    kubectl exec $(kubectl get pods -l app=datadog-agent -o jsonpath='{.items[0].metadata.name}') -- cp -R /opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/* /checksd
    kubectl exec $(kubectl get pods -l app=datadog-agent -o jsonpath='{.items[0].metadata.name}') -- cp -R /etc/datadog-agent/conf.d/* /confd
    
  12. Restart the Datadog Agent pods.
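
Step 6 asks for two PersistentVolumeClaims but does not show a manifest. The sketch below prints one possible manifest using the claim names from the pod template in step 7 (`agent-code-claim` and `agent-conf-claim`). The helper names, the `1Gi` size, and the `ReadWriteMany` access mode are assumptions, not values from the integration itself; adjust them to what your storage class supports.

```shell
#!/bin/sh
# Print one PersistentVolumeClaim manifest.
# render_pvc and render_agent_pvcs are hypothetical helpers.
# ReadWriteMany is assumed because DaemonSet pods on several nodes mount
# the same claim; your storage class must support that mode.
render_pvc() {
    name="$1"
    size="$2"
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: ${size}
EOF
}

# Print both claims referenced by the Agent pod template.
render_agent_pvcs() {
    render_pvc agent-code-claim 1Gi
    render_pvc agent-conf-claim 1Gi
}

# Example usage (requires kubectl access to the cluster):
#   render_agent_pvcs | kubectl apply -f -
```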

Configuration

  1. To start collecting your cert_manager performance data, edit the cert_manager.d/conf.yaml file in the /confd folder that you added to the Agent pod. See the sample cert_manager.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
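
One way to start from the sample file is to copy it into place before editing. The sketch below assumes the sample ships next to the configuration as `conf.yaml.example`, which is the usual Agent layout but is not stated on this page; `init_cert_manager_conf` is a hypothetical helper name.

```shell
#!/bin/sh
# Create cert_manager.d/conf.yaml from the shipped example if it is missing.
# init_cert_manager_conf is a hypothetical helper; the conf.yaml.example
# file name is an assumption based on the usual Agent layout.
init_cert_manager_conf() {
    confd="$1"
    dir="${confd}/cert_manager.d"
    if [ -f "${dir}/conf.yaml" ]; then
        # Never overwrite a configuration that was already edited.
        echo "exists"
    elif [ -f "${dir}/conf.yaml.example" ]; then
        cp "${dir}/conf.yaml.example" "${dir}/conf.yaml"
        echo "created"
    else
        echo "missing-example"
        return 1
    fi
}

# Example usage inside the Agent pod:
#   init_cert_manager_conf /confd
```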

Validation

Run the Agent’s status subcommand and look for cert_manager under the Checks section.
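
For a Kubernetes deployment, the status check can be scripted. The sketch below reads `agent status` output on standard input and only searches for the check name; it does not parse the status format. `status_lists_cert_manager` is a hypothetical helper, and the `app=datadog-agent` label selector is reused from the install commands in this guide.

```shell
#!/bin/sh
# Read `agent status` output on stdin and report whether cert_manager appears.
# status_lists_cert_manager is a hypothetical helper; it only searches for the
# check name and does not parse the status format.
status_lists_cert_manager() {
    if grep -q 'cert_manager'; then
        echo "cert_manager check found"
    else
        echo "cert_manager check not found"
        return 1
    fi
}

# Example usage (requires kubectl access to the cluster):
#   pod=$(kubectl get pods -l app=datadog-agent -o jsonpath='{.items[0].metadata.name}')
#   kubectl exec "$pod" -- agent status | status_lists_cert_manager
```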

Data Collected

Metrics

cert_manager.prometheus.health (gauge)
Whether the check is able to connect to the metrics endpoint.

cert_manager.certificate.ready_status (gauge)
The ready status of the certificate.

cert_manager.certificate.expiration_timestamp (gauge)
The date after which the certificate expires, expressed as Unix epoch time.
Shown as second

cert_manager.http_acme_client.request.count (count)
The number of requests made by the ACME client.

cert_manager.http_acme_client.request.duration.sum (gauge)
The sum of the HTTP request latencies in seconds for the ACME client.

cert_manager.http_acme_client.request.duration.count (gauge)
The count of the HTTP request latencies in seconds for the ACME client.

cert_manager.http_acme_client.request.duration.quantile (gauge)
The quantiles of the HTTP request latencies in seconds for the ACME client.

cert_manager.controller.sync_call.count (count)
The number of sync() calls made by a controller.
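
Because cert_manager.certificate.expiration_timestamp is reported in Unix epoch seconds, the time remaining is the difference between that value and the current time. The sketch below converts it to whole days; `days_until_expiry` is a hypothetical helper and the timestamps are made-up example values, not real metric data.

```shell
#!/bin/sh
# Convert an expiration timestamp (Unix epoch seconds) into whole days left.
# days_until_expiry is a hypothetical helper; "now" is passed in explicitly
# so the result is reproducible. Integer division truncates toward zero.
days_until_expiry() {
    expiry="$1"
    now="$2"
    echo $(( (expiry - now) / 86400 ))
}

# Example: a certificate expiring exactly 30 days after the reference time.
days_until_expiry 1702592000 1700000000   # prints 30
```

In practice the second argument would be the current time, for example `days_until_expiry "$ts" "$(date +%s)"`.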

Events

cert_manager does not include any events.

Service Checks

cert_manager.prometheus.health
Returns CRITICAL if the Agent fails to connect to the Prometheus endpoint, otherwise OK.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.