Add FEATURES="observability" for emerge status reporting#1595

Open
mattst88 wants to merge 6 commits into gentoo:master from mattst88:observability
@mattst88

I've been experimenting with an external portage monitoring utility. This PR adds FEATURES="observability", which lets external tools inspect a running emerge process without instrumenting emerge itself.

When enabled, the Scheduler publishes a machine-readable JSON snapshot of its current state to /run/portage/emerge-<pid>.json on every job state change (rate-limited to at most once per second). The snapshot lists the packages being built or merged, their current ebuild phase and elapsed time, and the packages waiting to merge. The file is removed when emerge exits.

A Unix-domain socket at /run/portage/emerge-<pid>.sock streams newline-delimited JSON snapshots to connected clients: the current snapshot on connect, then one line per update.

Two new query interfaces are added:

  • emerge --status: human-readable table of active builds across all running emerge processes
  • portageq jobs [--json]: same data in machine-readable form

mattst88 added 5 commits June 21, 2026 00:06
When FEATURES="observability" is set, publish a machine-readable JSON
snapshot of the running emerge process to /run/portage/emerge-<pid>.json.
The snapshot lists each package currently building or merging, its current
ebuild phase, elapsed time, and PID, along with aggregate job counts.

The file is written on every job state change (rate-limited to at most once
per second) and removed when emerge exits. If the runtime directory is not
writable (unprivileged emerge, no /run mount), the feature disables itself
silently so builds are never interrupted.
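
The write path described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `SnapshotWriter`, its `publish`/`remove` methods, and the injectable `clock` are hypothetical names, and the sketch simply drops rate-limited updates, whereas the real monitor may defer them.

```python
import contextlib
import json
import os
import tempfile
import time


class SnapshotWriter:
    """Hypothetical helper, not the PR's ObservabilityMonitor."""

    def __init__(self, path, min_interval=1.0, clock=time.monotonic):
        self.path = path
        self.min_interval = min_interval
        self.clock = clock
        self.enabled = True
        self._last_write = None

    def publish(self, snapshot):
        """Atomically write the snapshot; return True if a write happened."""
        if not self.enabled:
            return False
        now = self.clock()
        if self._last_write is not None and now - self._last_write < self.min_interval:
            # Rate limit hit. This sketch drops the update; a real
            # implementation would probably defer it instead.
            return False
        tmp = None
        try:
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path), suffix=".tmp")
            with os.fdopen(fd, "w") as f:
                json.dump(snapshot, f)
            os.chmod(tmp, 0o644)  # mkstemp creates 0600; let other users read
            os.replace(tmp, self.path)  # readers never see a partial file
        except OSError:
            # Directory missing or not writable: disable silently so the
            # build is never interrupted.
            self.enabled = False
            if tmp is not None:
                with contextlib.suppress(OSError):
                    os.unlink(tmp)
            return False
        self._last_write = now
        return True

    def remove(self):
        """Delete the status file on exit; ignore errors."""
        with contextlib.suppress(OSError):
            os.unlink(self.path)
```

Writing to a temporary file in the same directory and renaming it with `os.replace` means a reader polling the status file sees either the old snapshot or the new one, never a half-written one.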

Adds PORTAGE_RUN_PATH (/run/portage) to portage.const as the fixed
rendezvous point. Using a fixed path (rather than PORTAGE_TMPDIR) allows
any user to discover running emerge instances and matches FHS semantics for
runtime state.

Wire the ObservabilityMonitor into the Scheduler. Construct it alongside
the JobStatusDisplay and update it from the same sites that mutate the
status line: tasks entering _running_tasks in _schedule_tasks_imp, and
_build_exit/_merge_exit. Record a per-task start time on entry and remove
the status file on teardown in merge()'s finally block.

Track the live ebuild phase via a new notifyPhase callback on the
scheduler interface: EbuildPhase invokes it on each phase start (guarded,
since the standalone ebuild(1) command runs phases without it). This
keeps the snapshot's phase field current.
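
The guard mentioned above can be illustrated with a small sketch. The names here are assumptions for illustration only: the `(cpv, phase)` signature of `notifyPhase`, the two stand-in scheduler classes, and `start_phase` are not taken from the PR.

```python
class StandaloneScheduler:
    """Stand-in for the interface ebuild(1) uses: no notifyPhase attribute."""


class MonitoringScheduler:
    """Stand-in for a scheduler interface that tracks the live phase."""

    def __init__(self):
        self.phases = {}

    def notifyPhase(self, cpv, phase):  # assumed signature
        self.phases[cpv] = phase


def start_phase(scheduler, cpv, phase):
    """Notify the scheduler of a phase start only if it supports it."""
    notify = getattr(scheduler, "notifyPhase", None)
    if notify is not None:
        notify(cpv, phase)
    # ... the phase itself would run here ...


monitored = MonitoringScheduler()
start_phase(monitored, "app-misc/example-1.0", "compile")
start_phase(StandaloneScheduler(), "app-misc/example-1.0", "compile")  # no error
```

Looking the callback up with `getattr` keeps the standalone path free of any dependency on the monitor.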

All monitor methods are no-ops unless FEATURES="observability" is set.

Add read-side commands that consume the observability status files.
"portageq jobs" globs /run/portage/emerge-*.json, skips stale (dead-PID)
files, and prints a table (or raw snapshots with --json). "emerge
--status" is a thin text wrapper for discoverability. Both honor
EPREFIX via portage.const.EPREFIX.
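
For readers writing their own consumer, the discovery step might be sketched as below. This is not the `portageq jobs` implementation: `pid_alive` and `live_snapshots` are hypothetical helpers, and the run directory is passed in explicitly rather than derived from EPREFIX.

```python
import glob
import json
import os
import re


def pid_alive(pid):
    """Return True if a process with this PID appears to exist."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def live_snapshots(run_dir):
    """Map PID -> parsed snapshot for every status file with a live owner."""
    result = {}
    for path in sorted(glob.glob(os.path.join(run_dir, "emerge-*.json"))):
        match = re.fullmatch(r"emerge-(\d+)\.json", os.path.basename(path))
        if match is None:
            continue
        pid = int(match.group(1))
        if not pid_alive(pid):
            continue  # stale file left behind by a dead emerge
        try:
            with open(path) as f:
                result[pid] = json.load(f)
        except (OSError, ValueError):
            continue  # file vanished or is unreadable; skip it
    return result
```

A caller would pass the runtime directory, for example `live_snapshots("/run/portage")` on a system without a prefix.
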

A package that has finished building but is waiting to be installed
(FEATURES="merge-wait", or a system package that forces merge-wait) sits
in the Scheduler's merge-wait queue. Mark such tasks in the snapshot with
merge_wait=true and surface "merge-wait" as their phase. Track each
build's start/finish time across the build->merge hand-off so a waiting
package freezes its elapsed time at build completion instead of letting
the wait inflate it.
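
The elapsed-time rule can be sketched as follows. Only `merge_wait` and the "merge-wait" phase string come from the commit message; `task_entry`, the other field names, and the example package are hypothetical.

```python
def task_entry(cpv, phase, build_start, now, build_finish=None, merge_wait=False):
    """Build one snapshot entry; field names other than merge_wait are assumed."""
    # Once the build has finished, measure up to that point rather than
    # up to "now", so time spent in the merge-wait queue is not counted.
    end = build_finish if build_finish is not None else now
    return {
        "cpv": cpv,
        "phase": "merge-wait" if merge_wait else phase,
        "merge_wait": merge_wait,
        "elapsed": max(0.0, end - build_start),
    }


building = task_entry("app-misc/example-1.0", "compile", 100.0, 160.0)
waiting = task_entry("app-misc/example-1.0", "install", 100.0, 900.0,
                     build_finish=160.0, merge_wait=True)
# Both entries report 60.0 seconds, although the second was sampled much later.
```
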

Describe the feature in make.conf.5, the new "emerge --status" action in
emerge.1, and add a NEWS entry. ("portageq jobs" is self-documenting via
its docstring.)

Alongside the status file, expose a Unix-domain socket at
/run/portage/emerge-<pid>.sock that streams newline-delimited JSON
snapshots to connected clients. This allows external monitoring programs
to receive live emerge progress without polling the status file.

Each client receives the current snapshot immediately on connect, then
one snapshot per state change. Multiple clients may connect
simultaneously. The socket and file share the same snapshot, so no
additional scheduler overhead is incurred beyond what the file already
requires.
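
A client for the stream might look like this. It is a minimal sketch that assumes only what is stated above (a Unix stream socket carrying one JSON object per line); `stream_snapshots` and its `limit` parameter are hypothetical.

```python
import json
import socket


def stream_snapshots(sock_path, limit=None):
    """Yield parsed snapshots from the socket until it closes or limit is hit."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        # makefile() gives line-oriented reads over the stream socket.
        with sock.makefile("r", encoding="utf-8") as lines:
            seen = 0
            for line in lines:
                line = line.strip()
                if not line:
                    continue
                yield json.loads(line)
                seen += 1
                if limit is not None and seen >= limit:
                    return
```

The first value yielded is the snapshot sent on connect; each later one corresponds to a state change. A monitor would call it with the socket path of a running emerge, that is, `/run/portage/emerge-<pid>.sock` with the real PID filled in.
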