Systemd

Supported OS: Linux

Overview

This check monitors Systemd and the units it manages through the Datadog Agent.

  • Track the state and health of Systemd
  • Monitor the units, services, and sockets managed by Systemd

Setup

Installation

The Systemd check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the systemd.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory to start collecting your Systemd performance data. See the sample systemd.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
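
As a rough illustration of step 1, the sketch below writes a minimal systemd.d/conf.yaml. The option names unit_names and substate_status_mapping are recalled from the sample configuration file and should be verified against it. The CONF_DIR location, the unit name, and the status mapping are placeholders, not recommendations.

```shell
# Sketch only: CONF_DIR is a placeholder. Point it at the conf.d/ folder
# of your Agent's configuration directory.
CONF_DIR="${CONF_DIR:-./conf.d}"
mkdir -p "$CONF_DIR/systemd.d"

# The option names are assumptions to verify against the sample conf.yaml.
# example.service is a hypothetical unit name.
cat > "$CONF_DIR/systemd.d/conf.yaml" <<'EOF'
init_config:

instances:
  - unit_names:
      - example.service
    substate_status_mapping:
      example.service:
        running: ok
        exited: critical
EOF
```

The mapping block is what the systemd.unit.substate service check would read. Leave it out if you do not need that check.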

Containerized

For containerized environments, mount the /run/systemd/ folder, which contains the /run/systemd/private socket needed to retrieve Systemd data. For example:

docker run -d -v /var/run/docker.sock:/var/run/docker.sock:ro \
              -v /proc/:/host/proc/:ro \
              -v /sys/fs/cgroup/:/host/sys/fs/cgroup/:ro \
              -v /run/systemd/:/host/run/systemd/:ro \
              -e DD_API_KEY=<YOUR_API_KEY> \
              datadog/agent:latest
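
Before starting the container, it can help to confirm that the socket exists on the host. A minimal sketch, where the helper name is hypothetical:

```shell
# Hypothetical helper: succeeds only if the given path exists and is a socket.
# Defaults to the path named in this section.
has_systemd_socket() {
  [ -S "${1:-/run/systemd/private}" ]
}

if has_systemd_socket /run/systemd/private; then
  echo "systemd private socket found"
else
  echo "systemd private socket missing; the Agent cannot retrieve Systemd data" >&2
fi
```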

Validation

Run the Agent’s status subcommand and look for systemd under the Checks section.
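
On a Linux host, the status subcommand is typically run as sudo datadog-agent status. The sketch below filters that output for the check name. The helper name is hypothetical, and the match is a plain text search rather than a parse of the status format.

```shell
# Hypothetical helper: reads Agent status output on stdin and succeeds
# if any line mentions systemd (case-insensitive text match).
mentions_systemd() {
  grep -qi 'systemd'
}

# Usage on a host where the Agent is installed:
#   sudo datadog-agent status | mentions_systemd && echo "systemd check is listed"
```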

Data Collected

Metrics

systemd.service.cpu_time_consumed
(gauge)
The total CPU time consumed by the service, in nanoseconds (CPUUsageNSec). Requires the Systemd configuration CPUAccounting to be enabled and Systemd version >= 220.
Shown as nanosecond

systemd.service.memory_usage
(gauge)
The memory currently used by the service, in bytes (MemoryCurrent). Requires the Systemd configuration MemoryAccounting to be enabled.
Shown as byte

systemd.service.restart_count
(gauge)
The number of times the service has been restarted due to Restart= (NRestarts). Requires Systemd version >= 235.
Shown as time

systemd.service.task_count
(gauge)
The current number of tasks in the service (TasksCurrent). Requires the Systemd configuration TasksAccounting to be enabled.
Shown as task

systemd.socket.connection_accepted_count
(gauge)
The number of accepted socket connections (NAccepted).
Shown as connection

systemd.socket.connection_count
(gauge)
The current number of socket connections (NConnections).
Shown as connection

systemd.socket.connection_refused_count
(gauge)
The total number of refused socket connections (NRefused). Requires Systemd version >= 239.
Shown as connection

systemd.unit.active
(gauge)
Whether the unit is currently in the active state.

systemd.unit.loaded
(gauge)
Whether the unit is currently in the loaded state.

systemd.unit.monitored
(gauge)
Indicates that the unit is monitored (the value is always 1).

systemd.unit.uptime
(gauge)
The unit uptime, in seconds, since its activation.
Shown as second

systemd.units_by_state
(gauge)
The number of units per state. Sum by state to count units.
Shown as unit

systemd.units_loaded_count
(gauge)
The number of loaded units.
Shown as unit

systemd.units_monitored_count
(gauge)
The number of monitored units.
Shown as unit

systemd.units_total
(gauge)
The total number of units.
Shown as unit

Some metrics are reported only if the corresponding configuration is enabled:

  • systemd.service.cpu_time_consumed requires Systemd configuration CPUAccounting to be enabled
  • systemd.service.memory_usage requires Systemd configuration MemoryAccounting to be enabled
  • systemd.service.task_count requires Systemd configuration TasksAccounting to be enabled
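
One way to enable these accounting options for a single service is a drop-in file. The sketch below generates one. The helper name, the unit name, and the drop-in file name are hypothetical, and it writes to a scratch directory rather than /etc/systemd/system so that it is safe to try.

```shell
# Hypothetical helper: write a drop-in that enables CPU, memory, and task
# accounting for one unit. $1 = unit name, $2 = base directory
# (on a real host this would typically be /etc/systemd/system).
write_accounting_dropin() {
  unit="$1"
  base="$2"
  mkdir -p "$base/$unit.d"
  cat > "$base/$unit.d/10-accounting.conf" <<'EOF'
[Service]
CPUAccounting=yes
MemoryAccounting=yes
TasksAccounting=yes
EOF
}

# Write to a scratch directory for inspection.
write_accounting_dropin example.service ./systemd-dropins

# After installing a drop-in under /etc/systemd/system, reload and restart:
#   sudo systemctl daemon-reload
#   sudo systemctl restart example.service
# To inspect the current settings:
#   systemctl show example.service -p CPUAccounting -p MemoryAccounting -p TasksAccounting
```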

Some metrics are available only from a specific version of Systemd:

  • systemd.service.cpu_time_consumed requires Systemd v220 or later
  • systemd.service.restart_count requires Systemd v235 or later
  • systemd.socket.connection_refused_count requires Systemd v239 or later
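
The version requirements listed above can be checked against a host with a small lookup. This is a sketch with hypothetical helper names. It assumes the first line of systemctl --version begins with the word systemd followed by the version number.

```shell
# Minimum Systemd version per metric, taken from the list above.
# Metrics without a listed requirement map to 0.
min_systemd_version() {
  case "$1" in
    systemd.service.cpu_time_consumed) echo 220 ;;
    systemd.service.restart_count) echo 235 ;;
    systemd.socket.connection_refused_count) echo 239 ;;
    *) echo 0 ;;
  esac
}

# Succeeds if the Systemd version in $2 is new enough for the metric in $1.
metric_available() {
  [ "$2" -ge "$(min_systemd_version "$1")" ]
}

# Usage on a host running Systemd:
#   version=$(systemctl --version | awk 'NR==1 {print $2}')
#   metric_available systemd.service.restart_count "$version" && echo "available"
```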

Events

The Systemd check does not include any events.

Service Checks

systemd.can_connect
Returns OK if Systemd is reachable, CRITICAL otherwise.
Statuses: ok, critical

systemd.system.state
Returns OK if Systemd’s system state is running. Returns CRITICAL if the state is degraded, maintenance, or stopping. Returns UNKNOWN if the state is initializing, starting, or other.
Statuses: ok, critical, unknown

systemd.unit.state
Returns OK if the unit active state is active. Returns CRITICAL if the state is inactive, deactivating, or failed. Returns UNKNOWN if the state is activating or other.
Statuses: ok, critical, unknown

systemd.unit.substate
Returns OK, CRITICAL, or UNKNOWN based on the substate of the unit and the user-provided mapping in systemd.d/conf.yaml.
Statuses: ok, critical, unknown
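
The state-to-status rules described for systemd.system.state and systemd.unit.state can be written out as two small functions. This is an illustration of the documented mapping with hypothetical function names, not the Agent's implementation.

```shell
# Map Systemd's system state to a service check status,
# following the systemd.system.state description.
system_state_status() {
  case "$1" in
    running) echo ok ;;
    degraded|maintenance|stopping) echo critical ;;
    *) echo unknown ;;   # initializing, starting, or anything else
  esac
}

# Map a unit's active state to a service check status,
# following the systemd.unit.state description.
unit_state_status() {
  case "$1" in
    active) echo ok ;;
    inactive|deactivating|failed) echo critical ;;
    *) echo unknown ;;   # activating or anything else
  esac
}

system_state_status degraded   # prints "critical"
unit_state_status activating   # prints "unknown"
```

On a host running Systemd, the inputs could come from systemctl is-system-running and systemctl is-active followed by a unit name.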

Troubleshooting

Need help? Contact Datadog support.