Fix foreman.target start and stop ordering #412
Merged
evgeni merged 2 commits into theforeman:master on Mar 18, 2026
foreman.service was the only application service without After=/Wants= for redis and postgresql, unlike dynflow-sidekiq, candlepin, and all pulp services which already declared them. Without these, on a foreman.target stop+start, foreman could race against postgresql coming back up, exceed the sdnotify timeout, and be marked failed by systemd — leaving port 3000 unbound and HTTPD returning 503.
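As a rough illustration of the ordering described above, a drop-in for foreman.service might look like the sketch below. The drop-in path and the unit names postgresql.service and redis.service are assumptions for illustration. The source names only redis and postgresql, and the actual change may declare these directives in the unit or container definition itself.

```ini
# Hypothetical drop-in path (assumption):
#   /etc/systemd/system/foreman.service.d/ordering.conf
[Unit]
# Weak dependency: starting foreman also queues start jobs for these units.
# Unit names are assumed, not taken from the source.
Wants=postgresql.service redis.service
# Ordering: foreman's start job waits until these start jobs have finished.
After=postgresql.service redis.service
```

Wants= and After= are independent in systemd: the first only pulls the units into the transaction, the second only orders them. Declaring both is what keeps foreman from starting while postgresql is still coming up.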
Add After=foreman.target to every service and timer that is PartOf=foreman.target. The reversed stop ordering (each service stops before foreman.target) means systemctl stop foreman.target now blocks until all constituent services have fully stopped, rather than returning the moment the target meta-unit transitions to inactive. Without this, a rapid stop+start of foreman.target can call systemctl start while services are still in the process of stopping, leading to container conflicts and races on the subsequent start.
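A minimal sketch of what this looks like on a single member unit follows. The fragment is illustrative only and does not reproduce any specific file from the change; the same two directives would be repeated on every service and timer that belongs to the target.

```ini
# Illustrative [Unit] fragment for any one member of foreman.target
[Unit]
# Stopping or restarting foreman.target propagates to this unit.
PartOf=foreman.target
# On start: this unit is ordered after the target.
# On stop: systemd reverses the ordering, so this unit stops first and
# the target's stop job finishes only after this unit's stop job has.
After=foreman.target
```

Assuming systemctl is run without --no-block, systemctl stop foreman.target then waits for the target's stop job, which is itself ordered after the stop jobs of the member units, so a following systemctl start should not overlap with services that are still shutting down.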
This issue was discovered while testing #409.
ehelms approved these changes Mar 18, 2026
evgeni approved these changes Mar 18, 2026
evgeni (Member) left a comment:
Took me a moment to grasp the After=foreman.target part, but I think it makes sense.
Two related fixes for race conditions in foreman.target lifecycle management.
Add startup ordering dependencies to foreman container
foreman.service was the only application service without After=/Wants= for redis and postgresql, unlike dynflow-sidekiq, candlepin, and all Pulp services which already declared them. Without these, on a foreman.target stop+start, foreman can start racing against postgresql coming back up, exceed the sdnotify timeout, and be marked failed by systemd, leaving port 3000 unbound and HTTPD returning 503.
Ensure all foreman.target services stop before the target
Add After=foreman.target to every service and timer that is PartOf=foreman.target. Due to systemd's reversed stop ordering, each service now stops before the target, meaning systemctl stop foreman.target blocks until all constituent services have fully stopped rather than returning the moment the target meta-unit transitions to inactive. Without this, a rapid stop+start can call systemctl start while services are still shutting down, causing container conflicts and races on the subsequent start.
Before (from journal of a failing run):
After (from journal, two independent runs):