-
Notifications
You must be signed in to change notification settings - Fork 150
fix: create secrets before sysusers, chown after #353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
modules/age.nix
Outdated
| systemd.services.agenix-chown = mkIf sysusersEnabled { | ||
| wantedBy = [ "sysinit.target" ]; | ||
| # Change ownership and group after users and groups are made. | ||
| after = [ "systemd-sysusers.service" ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i understand correctly; shouldn't this be after = [ "agenix-install-secrets.service" ];?
I did not understand correctly
what you have is correct 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...actually I think you have a point :)
I think what I have is correct as long as systemd-sysusers.service is actually enabled. Since we're currently in a mkIf sysusersEnabled block, that's probably fine.
But if these units are ever used without systemd-sysusers.service, the before/after settings will have no effect. In that case I'd expect systemd to run agenix-chown in parallel with agenix-install-secrets, which... would not go over well.
So I'll add a commit to explicitly make agenix-chown go "after" agenix-install-secrets.
|
Thanks @marienz, I thought i was going insane 😅 |
bbcc1b4 to
9ea8e6f
Compare
|
Bah, I missed another problem (fixed with the commit I just added): we need to restart the new chown service to chown new agenix generations... Fix seems to work but may not be the best (a systemd expert I am not). |
This fixes referencing an agenix secret from a user hashedPasswordFile.
nixpkgs makes systemd-sysusers.service an alias for userborn.service.
With all of agenix-install-secrets.service running after that, userborn
fails to read hashedPasswordFile if that is an agenix secret.
Fix this by taking the same approach the activation script took: chown
secrets after systemd-sysusers, install them before.
The chown script depends on _agenix_generation being set, which it still
is from newGeneration when all scripts are part of the same activation
script but is not when running as a separate service, so break that out.
End the systemd scripts in "true". This is necessary for agenix-install,
because it otherwise ends in
```
(( _agenix_generation > 1 )) && {
echo "[agenix] removing old secrets (generation $(( _agenix_generation - 1 )))..."
rm -rf "/run/agenix.d/$(( _agenix_generation - 1 ))"
}
```
...which leaves the exit status of the last command non-zero if this is
the first generation. This has no effect for snippets included in the
Nix activation script, but if it's the end of one of our systemd units
it makes it fail.
It is currently not necessary for the chown service, but since it is
easy to introduce a similar problem unintentionally, end that in "true"
too.
Fixes ryantm#345
...after agenix-install-secrets, even if systemd-sysusers.service is disabled. Currently we should only use these units if systemd-sysusers.service is used (directly or by having userborn aliased to it). But better safe than sorry: with no ordering constraint systemd may run both in parallel...
If agenix-install-secrets.service is restarted, it will otherwise install a new generation whose secrets are not chown'd. (It would be better to chown them before the /run/agenix symlink is updated, but I'm not sure how to do that cleanly, and this should be equivalent to what the non-systemd scripts do.)
|
I don't know why CI failed there: the force push was a noop rebase (to pick up the checkin and revert of decrypt-parallel, just to make the status of this PR more green), it passed before, it still passes locally, and CI is not logging enough to diagnose (just "last 10 log lines" of a VM test that may have gotten stuck on something). I've been running this locally and not noticed any issues. |
This fixes the issue described in #345: using an Agenix secret as
hashedPasswordFilewith userborn enabled does not work.Tested locally.
Please let me know if you prefer a different approach for
installSecretscausing a non-zero exit status (the added "true" is admittedly a hack but it's pretty easy to reintroduce the bug otherwise...)