fix: reap zombie child processes in container environments #894
Open
toller892 wants to merge 1 commit into
When soft-serve runs as PID 1 in a container without an init supervisor, orphaned descendant processes (e.g. git pack-objects left behind when a git parent exits) are reparented to PID 1 and become zombies because the Go runtime only tracks children spawned via os/exec.

Add a periodic reaper goroutine that calls waitpid(-1, WNOHANG) every 10 seconds to clean up any zombie children. The reaper runs only on Linux (where the PID 1 container issue manifests) and is a no-op on other platforms.

Fixes charmbracelet#891
I would prefer to modify the Dockerfile to add |
Problem
When soft-serve runs as PID 1 in a container (e.g. Kubernetes) without an init supervisor like tini, orphaned descendant processes are reparented to PID 1 and become zombies. This happens because:
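- Git helper processes (e.g. git pack-objects, git index-pack) are left behind when their git parent exits
- The Go runtime only tracks children spawned via os/exec, not reparented orphans
- Without explicit waitpid() calls, these processes accumulate as zombies

In the reporter's Kubernetes environment, this caused 30k+ zombies in under 24 hours, leading to PID exhaustion and node failure.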
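For anyone wanting to confirm the accumulation on a running node, the following is a minimal diagnostic sketch, assuming a Linux /proc filesystem. It is not part of this PR; `parseState` and `countZombies` are hypothetical helper names. It counts processes whose state field in /proc/<pid>/stat is Z.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// parseState extracts the one-letter process state from the contents of
// /proc/<pid>/stat. The command name is wrapped in parentheses and may itself
// contain spaces or ')', so the state is located after the LAST ')'.
func parseState(stat string) (byte, bool) {
	i := strings.LastIndexByte(stat, ')')
	if i < 0 || i+2 >= len(stat) {
		return 0, false
	}
	return stat[i+2], true
}

// countZombies scans /proc and counts processes in state 'Z' (zombie).
// On systems without /proc the glob matches nothing and the count is 0.
func countZombies() (int, error) {
	paths, err := filepath.Glob("/proc/[0-9]*/stat")
	if err != nil {
		return 0, err
	}
	n := 0
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			continue // the process exited between the glob and the read
		}
		if state, ok := parseState(string(data)); ok && state == 'Z' {
			n++
		}
	}
	return n, nil
}

func main() {
	n, err := countZombies()
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		os.Exit(1)
	}
	fmt.Println("zombie processes:", n)
}
```

Run inside the container before and after the fix, a count that keeps growing between runs would match the behaviour reported in #891.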
Fix
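Add a periodic zombie reaper goroutine that calls waitpid(-1, WNOHANG) every 10 seconds to clean up any zombie children. The reaper:

- Runs only on Linux (where the PID 1 container issue manifests) and is a no-op on other platforms
- Uses golang.org/x/sys/unix.Wait4 (already a dependency)

Changes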
- cmd/soft/serve/reap_linux.go: Linux implementation using unix.Wait4
- cmd/soft/serve/reap_other.go: no-op stub for non-Linux platforms
- cmd/soft/serve/serve.go: call reapZombies() during server startup

Testing
Fixes #891