Description
A chat with @poettering (feel free to correct me if I remembered something wrong here) at FOSDEM 2025 brought up an interesting consideration: as he argues in favor of immutable image based operating environments, whose life cycle is effectively managed by another immutable OS image living as initrd (and probably systemd at the heart of it all), the currently typical NUT shutdown integration would not be feasible/welcome there in the way it is done now...
Roughly speaking, what we do currently on numerous platforms is:
- A NUT server (with physical connection to the UPS) normally runs drivers and the data server, probably also an `upsmon` instance in `primary` role (whoever is actually fed by that UPS should have a copy; most of the time the system managing an UPS is also fed by it).
- If a power outage occurs, this server raises FSD (Forced Shut Down) state to tell everyone else to shut down, and after a while, a locally running `upsmon` shuts down its own operating system too.
- As part of such FSD handling, the `upsmon` creates a file in location specified by `POWERDOWNFLAG` configuration from its `upsmon.conf`.
- Running daemons, including NUT drivers for the UPS and the data server, are stopped as are any other services.
- (Some systems eventually kill all userland processes and remount read-only)
- In case of systemd-driven systems, the late shutdown hook script in `/usr/lib/systemd/system-shutdown/nutshutdown` kicks in, finds that `POWERDOWNFLAG` file, and runs the NUT driver program again to tell the UPS to power-off/power-cycle (so the UPS usually turns on automatically when the wall power returns - or if it already has, and all fed systems are guaranteed to fully restart) at/after the moment we know the power loss would not corrupt any data.
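
For illustration, the last step above boils down to roughly the following. This is a minimal sketch and not necessarily the script as shipped by any package: the `kill_power_if_fsd` function name and the overridable `UPSMON`/`UPSDRVCTL` variables are my own for this example, and the default paths are assumptions that vary by distribution.

```shell
#!/bin/sh
# Sketch of a late-shutdown hook such as
# /usr/lib/systemd/system-shutdown/nutshutdown (not the packaged script).
# UPSMON and UPSDRVCTL are overridable only to make the sketch easy to try;
# the default paths below are assumptions.
UPSMON="${UPSMON:-/usr/sbin/upsmon}"
UPSDRVCTL="${UPSDRVCTL:-/usr/sbin/upsdrvctl}"

# Ask the UPS to cut power only if upsmon left the POWERDOWNFLAG behind.
# `upsmon -K` exits 0 when the flag file exists with the expected content.
kill_power_if_fsd() {
    if "$UPSMON" -K >/dev/null 2>&1; then
        # FSD is in progress: have the driver(s) send the power-off command.
        "$UPSDRVCTL" shutdown
    else
        # Ordinary shutdown or reboot: leave the UPS alone.
        return 0
    fi
}
```

A real hook would end with a plain call to `kill_power_if_fsd`. Note that both programs, the flag file and the configuration all have to be reachable at that very late point, which is where the constraints come from.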
This last bit actually places a number of constraints on the environment:
- The `POWERDOWNFLAG` file location should still be mounted and at least readable;
- The NUT configuration files (e.g. `/etc/nut` or `/etc/ups`) should be on filesystems still mounted and at least readable, so that the correct driver is chosen and connects to the expected device(s);
- The NUT driver programs and any libraries they might dynamically link to (and possibly their resource files - maybe SNMP MIBs, etc.) should be on filesystems still mounted and at least readable (programs also executable).
  - NOTE: In recent NUT releases, there is a way to tell the running driver program to turn the UPS off, instead of re-initializing the connection (can take long, a PITA in case of SNMP walks specifically); but there is no practical use for that to my knowledge. The `drivername -k` handling to kill power automatically tries to talk to an existing daemonized copy first (if found), before taking the matter into its own hands. But the systems/frameworks that indiscriminately kill off userland processes are unlikely to benefit from this anyway, unless they support some method of exempting certain programs from a killing spree.
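
To make these constraints concrete, here is a small sketch of a pre-flight check that a late-shutdown environment could run. The function name `check_killpower_prereqs` and its three-argument interface are hypothetical (nothing like it ships with NUT as far as I know), and it deliberately does not try to verify shared libraries or resource files such as MIBs.

```shell
#!/bin/sh
# Report which late-shutdown prerequisites are missing.
# Hypothetical helper; all paths are supplied by the caller:
#   $1 = POWERDOWNFLAG file, $2 = NUT config directory, $3 = driver program
check_killpower_prereqs() {
    flag="$1"
    confdir="$2"
    driver="$3"
    rc=0
    # The flag file must still be readable for the FSD check to work.
    [ -r "$flag" ] || { echo "flag file not readable: $flag" >&2; rc=1; }
    # ups.conf tells us which driver talks to which device.
    [ -r "$confdir/ups.conf" ] || { echo "ups.conf not readable in: $confdir" >&2; rc=1; }
    # The driver program must be present and executable.
    [ -x "$driver" ] || { echo "driver not executable: $driver" >&2; rc=1; }
    return "$rc"
}
```

It returns 0 only when all three checks pass, and prints one line per missing piece to stderr otherwise.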
These constraints go a bit against the goal that the image-based OSes want the operating environment fully unmounted, not even leaving read-only tentacles in place.
One feasible idea from the chat was to have NUT driver package(s) installed (also) into the initrd image, automatically pulling whatever dependencies are needed for the libraries it uses. Maybe a user-curated selection of drivers, maybe a vendor/corporation dictated "everything" (for signed images to be ubiquitously useful). A few tools like `upsmon` (to check with `upsmon -K` that FSD is in progress) and `upsdrvctl` would also be needed.
And it would be that initrd image's shutdown hooks that tell the UPS to go off, after the production environment is safely unmounted and flushed.
- A location like `/run` (maybe it exactly?) might be used to convey not only the `POWERDOWNFLAG` file existence and magic content for that FSD handling to kick in, but also a copy of latest-known NUT configuration files.
- In the `nutshutdown` script, the `NUT_CONFPATH` could point to that copy; `NUT_STATEPATH`, `NUT_ALTPIDPATH` (any other?) envvars could be used to point to respective location usable in the initrd environment (e.g. `/dev/shm` for R/W locations, if at all used in shutdown routine - I think it would be a bug if PID files or socket files are created at that point; location existence may be checked though, not sure).
- The driver programs called from `nutshutdown` could be told to just run as `root` and not drop privileges, as other accounts (and `udev` rules in case of USB/Serial links) would likely not be configured at that point. Or maybe they would be there, if packaging did work all the way in initrd too.
- Not sure about access to networked power devices (SNMP, NetXML, remote IPMI...) - if the system would still have an IP address at that point, or if it goes away when `network{,ing}.service` gets stopped. In legacy systems, the age where the late shutdown originated, an address stayed assigned until the OS power-cycled itself, so an `snmp-ups` could be commanded to power-cycle just as well...