
Restore CPU usage metrics; add memory usage metrics; fix various small issues with IP accounting and logs#184

Draft
SeanGeb wants to merge 11 commits into prometheus-community:main from SeanGeb:export-resource-metrics

@SeanGeb SeanGeb commented Nov 30, 2025

Hello!

This is a quick PR that makes some quality-of-life improvements to systemd_exporter.

The highlight is that I've restored support for CPU usage metrics. These were previously removed as they queried the cgroup filesystem directly, which trod on the toes of cgroup exporter.

systemd actually provides this information via the CPUUsageNSec= property of active units, so we can bypass any cgroup shenanigans and just get it straight from systemd; this should also smooth over any cgroup v1 vs v2 differences.
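
As a rough illustration of the conversion involved (not the code in this PR), the sketch below turns a CPUUsageNSec= reading into seconds for a `cpu_seconds_total`-style counter. The function name `cpuSecondsFromNSec` is hypothetical, and the sketch assumes that an all-ones uint64 reading means CPU accounting data is not available for the unit.

```go
package main

import (
	"fmt"
	"math"
)

// cpuSecondsFromNSec converts a CPUUsageNSec= reading (nanoseconds) into
// seconds. The second return value is false when the reading is the
// all-ones uint64 value, which this sketch assumes means "not available".
func cpuSecondsFromNSec(nsec uint64) (float64, bool) {
	if nsec == math.MaxUint64 {
		return 0, false
	}
	return float64(nsec) / 1e9, true
}

func main() {
	// Hypothetical readings: one real value and one "not available" value.
	for _, nsec := range []uint64{1_500_000_000, math.MaxUint64} {
		if secs, ok := cpuSecondsFromNSec(nsec); ok {
			fmt.Printf("cpu seconds: %g\n", secs)
		} else {
			fmt.Println("CPU usage not available; metric skipped")
		}
	}
}
```

Because the value comes from a unit property rather than from a cgroup file, the same conversion applies regardless of the cgroup version in use.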

In the same vein I've also added equivalent metrics for memory usage.

Other smaller fixes include:

  • Grab and provide resource accounting metrics for scope and slice units.
    • scope units usually include processes started as part of a terminal session, so are just as valid to grab these metrics from as service units.
    • slice units are used to aggregate resource accounting metrics from multiple services, scopes, or other slices (and to apply additional limits, reservations, or weights to those resources). Again, we should also scrape and export these metrics; this immediately provides sysadmins with insights like the share of CPU time being used by system processes vs user processes.
  • Export the systemd version field in the metrics, alongside some other systemd metadata.
  • Add a flag for automatic enablement of certain collector features that require a new-enough systemd version.
  • Fixes to the log message format, where the messages incorrectly assume sprintf formatting.
  • Apply consistent casing of the term systemd to match what the systemd project themselves use.

This PR is not quite ready yet - I still intend to add some additional CPU and memory metrics to reflect any assigned reservations, quotas, or limits for those resources, which can help diagnose throttling or be used for alerts when nearing a hard limit - but please feel free to leave early feedback.

Related issues

Resolves:

Partially resolves:

May mitigate:

- systemd consistently uses lowercase [1]; we should follow this.
- Apply small code tidies and cleanups that help following commits.

[1]: https://systemd.io/

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
Example output:

 # HELP systemd_meta Static systemd metadata
 # TYPE systemd_meta gauge
 systemd_meta{full_version="257.10-1.fc42"} 1

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
We can easily get the systemd version from its DBus API and use that to
automatically enable certain metrics that otherwise require the user to
manually check their systemd version and enable the corresponding flag.

To avoid breakages in unusual situations - e.g. where somehow support
for those metrics has been patched or compiled out - make this behaviour
opt-in to start.

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
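
The opt-in version check described in the commit above could look roughly like the following sketch. The helper names `majorVersion` and `featureEnabled` and the threshold of 250 are hypothetical and chosen only for illustration; the PR's actual flag names and version thresholds are not shown here.

```go
package main

import (
	"fmt"
	"strconv"
)

// majorVersion extracts the leading integer from a systemd version string
// such as "257.10-1.fc42".
func majorVersion(full string) (int, error) {
	end := 0
	for end < len(full) && full[end] >= '0' && full[end] <= '9' {
		end++
	}
	if end == 0 {
		return 0, fmt.Errorf("no leading version number in %q", full)
	}
	return strconv.Atoi(full[:end])
}

// featureEnabled reports whether a version-dependent collector feature
// should be on: either the user enabled it explicitly, or auto-enablement
// was opted into and the running systemd is new enough.
func featureEnabled(explicit, autoEnable bool, major, minMajor int) bool {
	return explicit || (autoEnable && major >= minMajor)
}

func main() {
	const minMajor = 250 // illustrative threshold only, not taken from the PR

	major, err := majorVersion("257.10-1.fc42")
	if err != nil {
		fmt.Println("cannot auto-enable:", err)
		return
	}
	fmt.Println("feature enabled:", featureEnabled(false, true, major, minMajor))
}
```

Keeping the explicit flag as an override means a user on a patched or unusual build can still turn a feature on by hand, and leaving auto-enablement off by default preserves today's behaviour.
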
Example output:

 # HELP systemd_meta Static systemd metadata
 # TYPE systemd_meta gauge
 systemd_meta{architecture="x86-64",full_version="257.10-1.fc42",virtualization="wsl"} 1

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
Some log messages contain sprintf formatting directives but are only
passed as the message argument to a slog instance; remove those
directives and avoid complaining about the same error multiple times.

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
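
To illustrate the class of bug fixed by the commit above: `log/slog` does not interpret printf-style directives in the message, so they are emitted verbatim and the trailing arguments are treated as key/value attributes. The sketch below contrasts the two styles; the message text and the helper names are made up for the example and are not taken from the exporter.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log/slog"
)

// newLogger returns a text logger that writes to buf and drops the
// timestamp attribute so the output is stable.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{}
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// printfStyleLine logs the way the old messages did: the %s and %v
// directives are never expanded and stay in the message as-is.
func printfStyleLine(unit, reason string) string {
	var buf bytes.Buffer
	newLogger(&buf).Error("failed to get property for %s: %v", unit, errors.New(reason))
	return buf.String()
}

// structuredLine passes the same information as key/value attributes,
// which is what slog expects.
func structuredLine(unit, reason string) string {
	var buf bytes.Buffer
	newLogger(&buf).Error("failed to get unit property", "unit", unit, "err", errors.New(reason))
	return buf.String()
}

func main() {
	fmt.Print(printfStyleLine("foo.service", "connection closed"))
	fmt.Print(structuredLine("foo.service", "connection closed"))
}
```
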
For each of these units there's nothing too interesting to collect:

- automount units have some metadata, mainly Where=.
- path units have some metadata, mainly Unit=, Paths=, MakeDirectory=.
- target units don't provide any functionality of their own and have no
  unit-type-specific metadata.

Therefore there's no need to log the fact there's no handler for these
units - they're well-known and uninteresting from a metrics standpoint
so it's reasonable to simply ignore them.

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
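
A minimal sketch of the filtering idea in the commit above, assuming the unit type can be read from the suffix of the unit name. The names `quietUnitTypes`, `unitType`, and `shouldLogMissingHandler` are hypothetical, and the sketch says nothing about which other unit types the exporter actually handles.

```go
package main

import (
	"fmt"
	"strings"
)

// quietUnitTypes lists the unit types named in the commit message: they
// have no collector and are not worth a log line.
var quietUnitTypes = map[string]bool{
	"automount": true,
	"path":      true,
	"target":    true,
}

// unitType returns the suffix after the last dot in a unit name,
// e.g. "target" for "multi-user.target".
func unitType(name string) string {
	i := strings.LastIndex(name, ".")
	if i < 0 {
		return ""
	}
	return name[i+1:]
}

// shouldLogMissingHandler reports whether a unit without a handler
// deserves a log message.
func shouldLogMissingHandler(name string) bool {
	return !quietUnitTypes[unitType(name)]
}

func main() {
	for _, name := range []string{"multi-user.target", "cups.path", "dev-sda.device"} {
		fmt.Printf("%s: log=%v\n", name, shouldLogMissingHandler(name))
	}
}
```
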
Scope units closely correspond to services (e.g. for shell sessions),
so they can also be units of resource accounting and control;
similarly, slices (e.g. system.slice) are units of resource accounting
and control that aggregate accounting and apply limits against the
aggregate of their child units (which can be slices, scopes, and
services).

By default, instances of a templated unit are placed in a slice named
after the unit's (non-templated) prefix, so immediately we'll start
collecting aggregate resource consumption metrics for all instances of
templated units (e.g. capsule@.service, systemd-journald@.service,
modprobe@.service).

This also has an immediate benefit on systems used interactively: user-
spawned processes (e.g. from an SSH session) are put under user.slice
by default, while system processes (e.g. the SSH server itself) are put
under system.slice, and VMs or containers under machine.slice; this
means an admin could now use systemd_exporter metrics to tell how CPU
time and memory are split between users, system services, and system
containers/VMs.

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
If IP accounting isn't enabled for a unit, systemd returns the max
uint64 value over DBus (i.e. -1 cast to uint64). When this happens we
shouldn't export the metric; this is consistent with other metrics like
those provided by tasks accounting.

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
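
The filtering described in the commit above can be sketched as follows. The function name `exportableCounters` is hypothetical, and the input map stands in for per-unit IPEgressBytes readings already fetched over DBus; the second unit's reading is invented to show the disabled case.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// exportableCounters drops readings equal to the max uint64 value, which
// signals that IP accounting is not enabled for that unit.
func exportableCounters(raw map[string]uint64) map[string]uint64 {
	out := make(map[string]uint64, len(raw))
	for unit, v := range raw {
		if v == math.MaxUint64 {
			continue // accounting disabled: export nothing for this unit
		}
		out[unit] = v
	}
	return out
}

func main() {
	raw := map[string]uint64{
		"clickhouse-server.service": 281676,
		"console-getty.service":     math.MaxUint64, // accounting disabled
	}

	counters := exportableCounters(raw)

	// Sort unit names so the output order is deterministic.
	units := make([]string, 0, len(counters))
	for unit := range counters {
		units = append(units, unit)
	}
	sort.Strings(units)
	for _, unit := range units {
		fmt.Printf("ip_egress_bytes{name=%q} %d\n", unit, counters[unit])
	}
}
```

Skipping the series entirely, rather than exporting 0 or the raw value, avoids a huge bogus counter and lets queries distinguish "no traffic" from "not measured".
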
Also makes the labels on the cpu_seconds_total and IP accounting metrics
consistent with other metrics defined against multiple unit types.

Example output:

 # HELP systemd_unit_cpu_seconds_total Unit CPU time in seconds
 # TYPE systemd_unit_cpu_seconds_total counter
 systemd_unit_cpu_seconds_total{name="-.slice",type="Slice"} 2719.072
 systemd_unit_cpu_seconds_total{name="NetworkManager-wait-online.service",type="Service"} 0.011585
 systemd_unit_cpu_seconds_total{name="NetworkManager.service",type="Service"} 0.183858
 systemd_unit_cpu_seconds_total{name="clickhouse-server.service",type="Service"} 875.48686
 systemd_unit_cpu_seconds_total{name="console-getty.service",type="Service"} 0.005837
 systemd_unit_cpu_seconds_total{name="dbus-broker.service",type="Service"} 1.652653
 systemd_unit_cpu_seconds_total{name="dnf-makecache.service",type="Service"} 0.621209
 systemd_unit_cpu_seconds_total{name="getty@tty1.service",type="Service"} 0.005612
 systemd_unit_cpu_seconds_total{name="init.scope",type="Scope"} 1393.11924
 systemd_unit_cpu_seconds_total{name="kmod-static-nodes.service",type="Service"} 0.003782
 # HELP systemd_service_ip_egress_bytes Service unit egress IP accounting in bytes.
 # TYPE systemd_service_ip_egress_bytes counter
 systemd_service_ip_egress_bytes{name="clickhouse-server.service",type="Service"} 281676

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
This is immensely handy to debug e.g. services with a memory leak, or
unexpected OOM kills on an optimised, resource-allocated, multi-workload
system (for an example of why this can happen, see Facebook's extensive
documentation on optimising resource usage with cgroups [1]).

Example output:

 # HELP systemd_unit_memory_current_bytes Current memory usage in bytes.
 # TYPE systemd_unit_memory_current_bytes gauge
 systemd_unit_memory_current_bytes{name="NetworkManager.service",type="Service"} 7.737344e+06
 systemd_unit_memory_current_bytes{name="clickhouse-server.service",type="Service"} 2.003456e+09
 systemd_unit_memory_current_bytes{name="console-getty.service",type="Service"} 421888
 # HELP systemd_unit_memory_peak_bytes Peak memory usage in bytes.
 # TYPE systemd_unit_memory_peak_bytes gauge
 systemd_unit_memory_peak_bytes{name="NetworkManager-wait-online.service",type="Service"} 2.625536e+06
 systemd_unit_memory_peak_bytes{name="NetworkManager.service",type="Service"} 1.8141184e+07
 systemd_unit_memory_peak_bytes{name="clickhouse-server.service",type="Service"} 2.224668672e+09
 # HELP systemd_unit_swap_current_bytes Current swap usage in bytes.
 # TYPE systemd_unit_swap_current_bytes gauge
 systemd_unit_swap_current_bytes{name="NetworkManager.service",type="Service"} 0
 systemd_unit_swap_current_bytes{name="clickhouse-server.service",type="Service"} 1.94056192e+08
 systemd_unit_swap_current_bytes{name="console-getty.service",type="Service"} 0
 # HELP systemd_unit_swap_peak_bytes Peak swap usage in bytes.
 # TYPE systemd_unit_swap_peak_bytes gauge
 systemd_unit_swap_peak_bytes{name="NetworkManager-wait-online.service",type="Service"} 0
 systemd_unit_swap_peak_bytes{name="NetworkManager.service",type="Service"} 0
 systemd_unit_swap_peak_bytes{name="clickhouse-server.service",type="Service"} 2.51211776e+08

[1]: https://facebookmicrosites.github.io/cgroup2/docs/overview.html

Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>
Signed-off-by: Sean Gebbett <10674942+SeanGeb@users.noreply.github.com>