erts: kill spawned child processes on VM exit (unix only) #9453


Open · wants to merge 7 commits into master

Conversation

@adamwight (Contributor)

This is a very rough proof-of-concept for discussion, which ensures all children spawned with open_port are terminated along with the BEAM.

Will be discussed in https://erlangforums.com/t/open-port-and-zombie-processes

@CLAassistant

CLAassistant commented Feb 17, 2025

CLA assistant check
All committers have signed the CLA.


github-actions bot commented Feb 17, 2025

CT Test Results

3 files · 141 suites · 50m 22s ⏱️
1 613 tests: 1 563 ✅ passed, 50 💤 skipped, 0 ❌ failed
2 331 runs: 2 261 ✅ passed, 70 💤 skipped, 0 ❌ failed

Results for commit 6f9f611.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.

Artifacts

// Erlang/OTP Github Action Bot

@adamwight adamwight force-pushed the aw-orphans branch 5 times, most recently from 9f87bc1 to dba896f Compare February 20, 2025 07:45
@garazdawi garazdawi self-assigned this Feb 20, 2025
@garazdawi garazdawi added the team:VM Assigned to OTP team VM label Feb 20, 2025
@garazdawi (Contributor)

Hello!

I think that we can move forward with this. There is no need for an option to disable it for now (unless our existing tests show that it is needed...), but there need to be test cases verifying that it works as expected on both Unix and Windows.

I do wonder however if we should send some other signal than KILL? Should we allow the child to be able to catch it and deal with it if they want to?

@adamwight (Contributor Author)

I do wonder however if we should send some other signal than KILL? Should we allow the child to be able to catch it and deal with it if they want to?

Good point; TIL that SIGKILL is untrappable. Looking at erlexec for precedent, its default behavior is to send SIGTERM to the direct child process, wait a configurable 5 seconds, and then send SIGKILL.

I experimented a bit locally to see whether sh would react to SIGTERM by stopping its own children, and it does not. So in order for spawn commands to benefit as well as spawn_executable, I would stick with the choice of signalling the entire child process group, but send TERM to be more polite. I like that this also offers the descendant processes a second, more straightforward workaround to prevent the termination if needed.

@adamwight adamwight force-pushed the aw-orphans branch 4 times, most recently from 8efd4f5 to 81abb88 Compare February 22, 2025 16:47
@adamwight (Contributor Author)

Now includes a test for normal erl shutdown (halt() self), and I'll try to also write one for abnormal shutdown (receiving SIGKILL). There's a small race condition remaining in the test which I'd like to clean up, perhaps by direct communication from the grandchild process (new test utility file_when_alive) back to the test executor; I'm open to suggestions about how to do that.

There are a few other flapping tests; I don't think this is related to my patch, but I can't say for sure...

Running the tests on Windows is not going well for me, and it seems there's no CI for that yet? In theory my patch and the test will also run on win32 but I'd like to see that happen.

@garazdawi (Contributor)

perhaps by direct communication from the grandchild process (new test utility file_when_alive) back to the test executor, I'm open to suggestions about how to do that.

If the grandchild is an Erlang node, it could communicate via Erlang distribution? Otherwise a file seems reasonable.

Running the tests on Windows is not going well for me, and it seems there's no CI for that yet? In theory my patch and the test will also run on win32 but I'd like to see that happen.

No, there is no GitHub CI for that yet. I have a branch that I work on from time to time to try to bring it in, but the tests are not stable enough yet. Maybe you can temporarily use it as a base for your changes; you should at least be able to see whether your tests pass or fail?

@garazdawi (Contributor)

A couple of other things that popped into my mind:

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

like that this also offers the descendant processes a second and more straightforward workaround to prevent the termination if needed.

I know that there are users that rely on being able to spawn daemon processes through os:cmd("command &"). This is tested in os_SUITE:background_command/1, but the test will not catch what happens when the emulator dies. I'm unsure what the user wants to happen there; I can see them wanting it both to die and to survive... though since the current behaviour is that it survives, I think we need to keep that.

@adamwight (Contributor Author)

Maybe you can temporarily use it as a base for your changes and you should atleast be able to see if your tests pass or fail?

Great! I'm working on that now, and learned that my proposed feature needs to be reimplemented separately for the win32 spawn driver.

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

That makes sense; the same principle applies IMHO, and it feels consistent to attempt a direct termination any time Erlang will lose its connection to the child. I'll add this.

os:cmd("command &").

I see... Interestingly, the "&" in that test is only relevant for allowing os:cmd to return immediately; the shell job control seems to be unimportant. In other words, the test is equivalent to calling open_port and not waiting for the process to finish, so this syntax is more a convenience than a special use case. But it would definitely indicate an intention to start a daemon with no direct link to Erlang, +1 that we should respect this usage!

As a tourist to the BEAM, all I can do is describe the options; I don't have instincts for which is the best way to go. We could preserve this "&" usage by only killing the immediate child process, which would be the shell. This still offers some benefits, since the application developer may be able to call open_port with spawn_executable, making their process the immediate child and causing it to be cleaned up without needing a wrapper script. It also simplifies reasoning about the "process group": killing exactly one child process is much more predictable.

@michalmuskala (Contributor)

To some extent, I think there's a bigger problem here where we could have a whole new API for managing external processes - the current port API is neither powerful nor ergonomic.
This change to kill things proactively is definitely a good one, but if we're looking more into that, I think things could be improved dramatically.

As @garazdawi mentioned, today it's not even possible to easily kill the process once you spawn it, and port_close will just close the stdin.

@adamwight (Contributor Author)

I found that shell "&" assigns the background job to a new process group, which IMHO means that killing children by process group is back on the table. For now however, my patch is rewritten to kill only the direct child process.

The latest branch also kills a port's child during port_close.

Splitting this responsibility between the main VM process and the forker is causing a memory leak (and a leaky abstraction), and I'm imagining this can be resolved by sending another protocol message to the forker to allow it to perform cleanup such as killing the process, then freeing memory used to track the child. Introducing this new message has some small overhead but I don't see any obvious, existing means for the forker to detect that the port was closed by beam.

we could have a whole new API for managing external processes

+1 that direct OS process management could be a nice addition to the core libraries, but the current iteration can be done without larger changes to the opaque port concept.

@garazdawi (Contributor)

Splitting this responsibility between the main VM process and the forker is causing a memory leak (and a leaky abstraction), and I'm imagining this can be resolved by sending another protocol message to the forker to allow it to perform cleanup such as killing the process, then freeing memory used to track the child. Introducing this new message has some small overhead but I don't see any obvious, existing means for the forker to detect that the port was closed by beam.

Yes, this seems like a good approach.

@adamwight adamwight force-pushed the aw-orphans branch 2 times, most recently from f7eee72 to f7c336f Compare February 27, 2025 16:02
@adamwight (Contributor Author)

adamwight commented Mar 6, 2025

What do we do when someone does port_close/1? To me it seems reasonable that the behaviour should be the same as if the Erlang VM terminated?

Although I think this is a good idea, when I made the change it turned out to be too aggressive and caused a lot of existing tests to fail (example job output). Looking at a simplified outline of one test, -suite port_SUITE -case output_only:

Port = open_port({spawn, "port_test -h0 -o outfile"}, [out]),
Port ! {self(), {command, "echodata123"}},
% race here?
Port ! {self(), close},
receive {Port, closed} -> ok end,

test_server:sleep(500),
{ok, Written} = file:read_file("outfile"),

The test finds that outfile was never written and the last match fails with {badmatch,{error,enoent}}. My theory is that the port close command immediately kills the spawned port_test OS process before it can begin its work. Adding another 500ms delay before closing the port at the "race here" comment indeed fixes the test, adding to my suspicion that my patch has created a race condition.

I'm open to suggestions about how to proceed! One could argue that the tests have always been risky: they relied on undocumented behavior, in which the spawned program detects stdin closure and politely completes its work before the port closes and returns closed. To fix the tests, port_test could be modified to e.g. send a "done" message back to the test runner, which would be used like so:

Port ! {self(), {command, "echodata123"}},
receive {Port, {data, "done."}} -> ok end,
Port ! {self(), close},
receive {Port, closed} -> ok end.

But it feels like application developers have probably been making the same assumption, and they have felt safe sending the close command without expecting dramatic and immediate side-effects? Are we now talking about a breaking change?

@garazdawi (Contributor)

garazdawi commented Mar 6, 2025

Good catch, I did not think of that. If we want that behaviour we can later add a port_kill (or something similar, I too have been thinking of doing a new API for external processes) that would not only close, but also kill the child process. So let's leave it at only killing processes when Erlang itself exits for now.

@garazdawi garazdawi added this to the OTP-29.0 milestone Mar 6, 2025
These tests demonstrate that the VM allows a spawned child OS process
to outlive it.  Current behavior relies on the child to "politely"
watch stdin and exit when the file descriptor is closed.  Some but not
all external processes follow this rule.

The next few patches will change VM behavior to kill all child
processes on exit, which should fix the tests introduced here.
This new message only has a minor effect in this patch: the forker
stops tracking the child process on port_close, so if the child
process is still running but stops later while the VM is alive, the
forker will no longer send an exit status message to the closed port.

The motivation for this patch is mostly just to set up the
communication mechanism, to attach more interesting behavior later
such as optionally killing the child process.
@adamwight adamwight changed the title [draft] erts: kill spawned child processes on VM exit erts: kill spawned child processes on VM exit (unix only) Mar 6, 2025
Previously, the forker start command would use the presence of port_id
to indicate whether the caller had specified :exit_status to
open_port.  This no-op patch splits this out to an explicit second
field `want_exit_status`, so that we can always track port_id.

Motivation is to allow the os_pid->port_id mapping to be used
regardless of the exit_status setting.

This patch shouldn't cause any behavior changes.
If the VM goes down, the forker will now respond by killing all
spawned children.
@adamwight (Contributor Author)

Ready to review. This branch is supposed to:

  • Kill all child processes when the VM exits.
  • Independent of whether the exit was clean or a crash, and of whether flushing was disabled.
  • Only works on unix for now.

In follow-up work I might try to implement for win32, and will play with kill and kill_group options to open_port.

@adamwight (Contributor Author) left a comment:

Dumping my final thoughts, inline… This was fun!

*/

/*
* Test utility which waits for SIGTERM and responds by deleting a temp file
Contributor Author:

The file path is used because it's easy to verify externally, even when pipes to pass messages out of the process may have been rudely crashed.

#include <errno.h>
#include <fcntl.h>

#ifndef __WIN32__
Contributor Author:

Doesn't run on win32 yet (tests are not executed), but the headers at least allow for successful compilation.

memset(proto, 0, sizeof(ErtsSysForkerProto));
proto->action = ErtsSysForkerProtoAction_Stop;
proto->u.stop.os_pid = dd->pid;
erl_drv_port_control(forker_port, ERTS_FORKER_DRV_CONTROL_MAGIC_NUMBER,
Contributor Author:

We don't need to be picky about whether this message is successfully delivered. I think the main cases not covered by the errno check below have to do with the forker going down during VM halt, and this patch specifically adds logic to the forker to kill processes and clean up even when Erlang cannot send the final messages.

It wouldn't be helpful to also crash Erlang faster in these cases, better to let the general flushing happen if it was requested.

#endif
} else if (proto->action == ErtsSysForkerProtoAction_Stop) {
if ((res = write(forker_fd, (char*)proto, sizeof(*proto))) < 0) {
if (errno != EAGAIN && errno != EINTR) {
Contributor Author:

These are a bit of a random guess. I saw EAGAIN come up in some test cases and found other guard conditions allowing these errors to be non-fatal. That vaguely makes sense, because they have to do with the write syscall failing on an otherwise operational pipe.

It might be worthwhile to retry the write later? But the precedent I see elsewhere in the code seems to be more practical, this message is a nice-to-have at the moment.

@@ -591,10 +590,17 @@ main(int argc, char *argv[])
errno = 0;

os_pid = fork();
if (os_pid == 0)
if (os_pid == 0) {
Contributor Author:

Braces are added only to avoid confusion with the scoping block below.

from the uds_fd. */
DEBUG_PRINT("Failed to write to uds: %d (%d)", uds_fd, errno);
est.os_pid = (pid_t)ibuff[0];
es = hash_remove(forker_hash, &est);
Contributor Author:

No change here—get_port_id already had the side effect of popping from the hash.

res = read_all(sigchld_pipe[0], (char *)ibuff, sizeof(ibuff));
if (res <= 0) {
ABORT("Failed to read from sigchld pipe: %d (%d)", res, errno);
}

proto.u.sigchld.port_id = get_port_id((pid_t)(ibuff[0]));

if (proto.u.sigchld.port_id == THE_NON_VALUE)
Contributor Author:

port_id is always present, now. es can still be missing from the hash, but only in edge cases like double-close or other bad situations.

Might even be worthwhile to emit a debug line if the child isn't found in forker_hash.

@@ -662,6 +665,21 @@ main(int argc, char *argv[])
return 1;
Contributor Author:

Should be unreachable.

Actually, I'm uncertain about some of the ABORT exits: my thinking is that these indicate pathological corner cases where we can no longer trust the internal tracking nor child pipes, so it would be useless or even risky to try "atexit"-like cleanup behaviors. This is supported by documentation for glibc abort:

Up until glibc 2.26, if the abort() function caused process termination, all open streams were closed and flushed (as with fclose(3)). However, in some cases this could result in deadlocks and data corruption. Therefore, starting with glibc 2.27, abort() terminates the process without flushing streams. POSIX.1 permits either possible behavior, saying that abort() "may include an attempt to effect fclose() on all open streams".

That's how I feel, too.

@@ -662,6 +665,21 @@ main(int argc, char *argv[])
return 1;
}

static void kill_child(pid_t os_pid) {
if (os_pid > 0 && kill(os_pid, SIGTERM) != 0) {
DEBUG_PRINT("error killing process %d: %d", os_pid, errno);
Contributor Author:

Should always work, but if it doesn't, e.g. because of a race between the child dying on its own and our kill, continue with trying to kill the other children.

@@ -7475,6 +7475,11 @@ reported to the owning process using signals of the form

The maximum number of ports that can be open at the same time can be configured
by passing command-line flag [`+Q`](erl_cmd.md#max_ports) to [erl](erl_cmd.md).

When the VM shuts down, spawned executables are sent `SIGTERM` on unix. The
Contributor Author:

"unix" -> "POSIX"? Hopefully this can be reimplemented for win32 in later work, anyway. Is it okay to have divergent behavior on the platforms?

@mikpe (Contributor)

mikpe commented Mar 7, 2025

I haven't read this PR in detail, but wanted to suggest sending SIGHUP instead of SIGTERM. Both terminate the process unless caught, but SIGHUP is a better match IMO for this scenario.

@adamwight (Contributor Author)

suggest sending SIGHUP instead of SIGTERM. Both terminate the process unless caught, but SIGHUP is a better match IMO for this scenario.

I could imagine this to be true, and HUP has the historical advantage of nearly the same semantics as the previous assumption: that polite apps quit when their stdin is closed. Hanging up is already nearly identical to closing pipes; see nohup.

On the other hand, many existing daemons trap HUP to gracefully reload. And apps robust against stdin closure might also ignore HUP.

If I had to personify the two signals, my feeling is that HUP would be a vague "goodbye!" while TERM is an explicit but amiable "please terminate yourself now." How to make such a decision! If you were on a desert island with only one unix signal…

} else if (proto.action == ErtsSysForkerProtoAction_Stop) {
ErtsSysExitStatus est, *es;
est.os_pid = proto.u.stop.os_pid;
es = hash_remove(forker_hash, &est);
Contributor:

Why do we remove it from the hash here? Do we not want closed but alive processes to be killed when the VM shuts down?

Contributor Author:

We no longer want to receive an exit message from the spawned process, and I was getting conservative about memory, but I now see that we can keep both behaviors by changing the hash_remove to:
es.want_exit_status = false

(Also, my concern about memory is nonsense since the handful of bytes in es is tracking an entire, in-memory child process.)

@garazdawi (Contributor)

Thanks for making this. I think the code looks fine except for one comment that I added.

As this PR is a potentially breaking change, it will be part of the next major release, that is Erlang/OTP 29.

Only works on unix for now.

Before merging this we need to at least investigate whether it is possible to do something similar on Windows or not. We want the behaviour of the different OSs to be as similar as possible; however, this particular area is already full of platform-specific stuff, so it is not super important.

* Child erl must be run with a shell if it should die on EOF.
* Remove unexpected exit status matches.  These will be caught by the
timeout anyway, and extra cases were muddying the intention.
* Use ct:sleep when possible.

Still relies on manual sleeping and temporary files, which should be
replaced by distribution in later clean-up.
port_close has its normal effects and no exit message will be received
even if exit_status was requested on the initial open_port, but now
these children will still be killed when the VM halts.

Includes a test for this case.
@adamwight (Contributor Author)

Before merging this we need to atleast investigate whether it is possible to do something similar on Windows or not.

For sure. I've installed the win32 development environment and can compile a release, but still a bit stuck on running tests. It would be possible to implement the feature using manual smoke testing on the command line in a pinch. I've also been waiting for the unix behavior to solidify so that I'm not working in windows any more than necessary—I feel comfortable going ahead with that work now.

@adamwight (Contributor Author)

My last MS programming was a DOS graphics library, so please take my findings with a grain of salt. Here's what I learned from RTFM:

TerminateProcess should be avoided: it is like _exit on unix in that it cannot be blocked, no polite cleanup can be done, and the global state of shared DLLs may be corrupted.

The nice way to shut down a process is to use ExitProcess, but this can only be called by the process itself. The typical, recommended pattern is that a custom process and its controlling process will both call RegisterWindowMessageA to reserve an opaque message ID based on a shared string constant or by passing the ID directly; the controlling process will send the message using BroadcastSystemMessage and then the controlled process will respond by calling ExitProcess on itself. I don't see how to apply this to our case however, because the child is a black box and there's no way to teach an arbitrary application to listen for this custom message.

Then we get into murkier territory that I don't quite understand: I think that a process can open one or more windows. I don't know whether the main process itself counts as a window even if it's "hidden", but I think it does. A process is made up of one or more threads. There are a variety of library methods to send a message to a window or a thread, and in start_erl we have a precedent for calling PostThreadMessage on the thread created by CreateProcessW, but this relies on our own custom process listening for WM_USER. For a generic process, I think we want to send a WM_CLOSE message to the main thread.

We can start by tracking spawned processes and sending WM_CLOSE from the VM during halt, and then as follow-up work the whole arrangement could be improved by perhaps creating a forker process like erl_child_setup on unix, and letting it do the cleanup if the VM crashes. Maybe the forker can be pulled up a level so that most of its logic is shared between win32 and unix.
