Add a background goroutine that continuously reaps any children that get re-parented to the crossplane-opentofu-provider process. Signed-off-by: Sebastian Trebitz <sebastian@nephosolutions.com>
The zombie processes are created because `crossplane-opentofu-provider` (PID 1) spawns child processes (`git`, called internally by `tofu init -from-module=<git url>`) but never calls `Wait()` on them after they exit.

When a process exits on Linux, it stays in a defunct (zombie) state until its parent calls `wait()` / `waitpid()` to reap it. Normally `init` (PID 1) or `systemd` does this automatically for any orphaned children re-parented to PID 1. Since `crossplane-opentofu-provider` is PID 1 in the container but has no zombie-reaping logic, every finished child process that gets re-parented to it (e.g. `git` sub-processes spawned by `tofu`) stays a zombie forever.

Description of your changes
The fix is to make `crossplane-opentofu-provider` a subreaper using `prctl(PR_SET_CHILD_SUBREAPER, 1)` on Linux, and then periodically (or in a background goroutine) call `syscall.Wait4(-1, ...)` to reap any zombie children that have been re-parented to it.

Fixes #74
I have:
- Run `make reviewable` to ensure this PR is ready for review.

How has this code been tested
| Test | File | What it verifies |
| --- | --- | --- |
| `TestReapAllNoChildren` | `reaper_test.go` | `reapAll` returns immediately (does not block) when there are no children: the `WNOHANG` flag plus the `pid <= 0` exit condition |
| `TestReapAllDoesNotPanic` | `reaper_test.go` | Calling `reapAll` with no children never panics |
| `TestSetSubreaper` | `reaper_linux_test.go` | `prctl(PR_SET_CHILD_SUBREAPER, 1)` succeeds without error |
| `TestSetSubreaperIdempotent` | `reaper_linux_test.go` | Calling `setSubreaper` multiple times is safe (the kernel allows re-setting the flag) |
| `TestReapAllReapsDirectChild` | `reaper_linux_test.go` | A zombie child (`/proc/<pid>/status` state `Z`) is removed from the process table after `reapAll` |
| `TestReapAllReapsMultipleChildren` | `reaper_linux_test.go` | The `reapAll` loop drains all pending zombies in one invocation |
| `TestReapAllAfterSIGKILL` | `reaper_linux_test.go` | |
| `TestStartReapsChildAfterExit` | `reaper_linux_test.go` | The `Start()` path (SIGCHLD handler + background goroutine) automatically reaps a child without any explicit `Wait` call |

The cross-platform tests carry no build tag, so they run everywhere. The Linux integration tests are gated with `//go:build linux` because they depend on `/proc`, `SIGCHLD` behaviour, and `Wait4` semantics that are Linux-specific; Linux is also the only platform where the container runs.