[draft] erts: kill spawned child processes on VM exit #9453
base: master
CT Test Results: 3 files, 141 suites, 49m 58s. Results for commit 80f301a. This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally. (Erlang/OTP Github Action Bot)
 }

 static Eterm get_port_id(pid_t os_pid)
 {
     ErtsSysExitStatus est, *es;
     Eterm port_id;
     est.os_pid = os_pid;
-    es = hash_remove(forker_hash, &est);
+    es = hash_get(forker_hash, &est);
     if (!es) return THE_NON_VALUE;
     port_id = es->port_id;
This seems to preserve the original behavior of only sending exit_status back to callers which have requested it, since port_id is set conditionally by the caller and is still used in the guard around sending—not simply whether the os_pid exists in the hash table.
9f87bc1 to dba896f
Hello! I think that we can move forward with this. There is no need to have an option to disable it for now (unless our existing tests show that it is needed...), but there need to be test cases that verify it works as expected on both Unix and Windows. I do wonder however if we should send some other signal than
Good point, TIL that SIGKILL is untrappable. Looking at I experimented a bit locally to see if
8efd4f5 to 81abb88
Now includes a test for normal erl shutdown ( There are a few other flapping tests. I don't think they are related to my patch, but I can't say for sure... Running the tests on Windows is not going well for me, and it seems there's no CI for that yet? In theory my patch and the test will also run on win32, but I'd like to see that happen.
TODO: Needs to be tested on win32
Keep the mapping of all living child processes so that we'll be able to iterate over them to clean up, rather than only storing the children which have an associated port.
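The bookkeeping change described above can be sketched roughly as follows. This is not the code from the patch: the fixed-size array stands in for `forker_hash`, and `child_add`, `child_port_id`, `child_remove`, `child_count`, and `NO_PORT` are hypothetical names used only for illustration. The point it shows is that a lookup no longer removes the entry, so children without a port stay visible for later cleanup.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the forker's hash table: a small fixed array. */
#define MAX_CHILDREN 64
#define NO_PORT (-1L)          /* illustrative stand-in for THE_NON_VALUE */

typedef struct {
    pid_t os_pid;              /* 0 marks a free slot */
    long  port_id;             /* NO_PORT if no exit_status was requested */
} child_entry;

static child_entry children[MAX_CHILDREN];

/* Record every spawned child, whether or not it has an associated port. */
int child_add(pid_t os_pid, long port_id)
{
    assert(os_pid > 0);
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].os_pid == 0) {
            children[i].os_pid = os_pid;
            children[i].port_id = port_id;
            return 0;
        }
    }
    return -1;                 /* table full */
}

/* Look up the port id WITHOUT removing the entry. */
long child_port_id(pid_t os_pid)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].os_pid == os_pid)
            return children[i].port_id;
    }
    return NO_PORT;
}

/* Drop an entry; intended to be called only once the child is reaped. */
void child_remove(pid_t os_pid)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].os_pid == os_pid)
            children[i].os_pid = 0;
    }
}

/* Number of children still tracked, i.e. candidates for cleanup. */
size_t child_count(void)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].os_pid != 0)
            n++;
    }
    return n;
}
```

In this sketch the decision to send `exit_status` would still hinge on `port_id`, not on whether the pid is present in the table, which matches the review comment above.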
If the uds_fd connection to the parent BEAM is broken or closed, react by killing all children and any descendants in the same process group.

A concise demonstration of the problem being solved is to run this command with and without the patch, then kill the BEAM. Without the patch, the "sleep" process will continue:

    erl -noshell -eval 'os:cmd("sleep 60")'

To intentionally start a child process which can outlive BEAM termination, give it a new process group, for example by using `setsid`:

    erl -noshell -eval 'os:cmd("setsid sleep 60")'

TODO: Needs to be tested on win32
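A minimal sketch of the reaction described in this commit message, using only POSIX calls. It is not the patch itself: `parent_gone` and `kill_child_groups` are hypothetical helper names, and the sketch assumes each tracked child was made the leader of its own process group (for example with `setpgid(0, 0)` after `fork`), which the source does not confirm. Which signal to send is left as a parameter, since that is still under discussion in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if the peer closed the connection (read() reports EOF),
 * 0 if data was read, and -1 on any other error.
 */
int parent_gone(int fd)
{
    char buf[64];
    ssize_t n;

    do {
        n = read(fd, buf, sizeof buf);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted */

    if (n == 0)
        return 1;
    return n > 0 ? 0 : -1;
}

/*
 * Send `sig` to the process group led by each pid in `pids`.
 * Assumption: every child is its own process-group leader, so a
 * negative pid passed to kill() reaches the child and descendants
 * that stayed in its group. A descendant that called setsid() has
 * left the group and is not signalled.
 * Returns the number of groups signalled successfully.
 */
int kill_child_groups(const pid_t *pids, size_t n, int sig)
{
    int signalled = 0;

    assert(pids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pids[i] <= 1)
            continue;                    /* never signal pid 0, 1, or negatives */
        if (kill(-pids[i], sig) == 0)
            signalled++;
    }
    return signalled;
}
```

A caller would poll the descriptor connected to the parent, and once `parent_gone` returns 1, pass the list of tracked child pids to `kill_child_groups`.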
81abb88 to 80f301a
If the grandchild is an Erlang node, it could communicate via Erlang distribution? Otherwise a file seems reasonable.
No, there is no GitHub CI for that yet. I have a branch that I work on from time to time to try to bring it in, but the tests are not stable enough yet. Maybe you can temporarily use it as a base for your changes, and you should at least be able to see if your tests pass or fail?
A couple of other things that popped into my mind: What do we do when someone does
I know that there are users that rely on being able to spawn daemon processes through |
This is a very rough proof-of-concept for discussion, which ensures all children spawned with open_port are terminated along with the BEAM.
Will be discussed in https://erlangforums.com/t/open-port-and-zombie-processes