[core] Avoid fork preexec for Ray Client specific servers#63408
[core] Avoid fork preexec for Ray Client specific servers#63408Jerry2003826 wants to merge 3 commits into
Conversation
08d6a5a to
26caf95
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a use_posix_spawn parameter to start_ray_process to enable a fork-safe spawning path for Ray Client specific servers, which helps avoid crashes in multi-threaded gRPC environments. Feedback was provided regarding the use of close_fds=False, which may cause file descriptor leaks; the reviewer suggested setting it to True as modern CPython versions can still utilize the posix_spawn fast path while closing descriptors.
| @@ -1048,7 +1055,8 @@ def preexec_fn(): | |||
| stdout=stdout_file, | |||
| stderr=stderr_file, | |||
| stdin=subprocess.PIPE if pipe_stdin else None, | |||
| preexec_fn=preexec_fn if sys.platform != "win32" else None, | |||
| preexec_fn=(None if sys.platform == "win32" or use_posix_spawn else preexec_fn), | |||
| close_fds=not use_posix_spawn, | |||
There was a problem hiding this comment.
Setting close_fds=False when use_posix_spawn is enabled causes the child process to inherit all open file descriptors from the parent (except those explicitly marked with O_CLOEXEC). In a multi-threaded environment like the Ray Client proxier, this can lead to significant file descriptor leaks. For example, the child process might inherit listening sockets or active client connections, which can prevent the parent from restarting or lead to resource exhaustion.
CPython 3.9+ is capable of using the posix_spawn fast path even with close_fds=True on most modern POSIX systems (using posix_spawn_file_actions_addclose). Unless you have confirmed that close_fds=True forces a fallback to fork() on your target platforms and that this fallback is unacceptable, it is safer to maintain close_fds=True.
| close_fds=not use_posix_spawn, | |
| close_fds=True, |
There was a problem hiding this comment.
Addressed in the current version. start_ray_process() now keeps close_fds=True when the runtime exposes os.POSIX_SPAWN_CLOSEFROM, and only falls back to False for older runtimes where closing fds would force the unsafe fork path. Added coverage for both branches in test_start_ray_process_posix_spawn_close_fds_when_supported and test_start_ray_process_posix_spawn_leaves_fds_open_for_older_runtime.
There was a problem hiding this comment.
Handled in the current branch. start_ray_process() now keeps close_fds=True when the runtime exposes os.POSIX_SPAWN_CLOSEFROM, and only falls back to close_fds=False on older runtimes where that is needed to avoid the fork path. Added tests cover both branches.
| @@ -1048,7 +1055,8 @@ def preexec_fn(): | |||
| stdout=stdout_file, | |||
| stderr=stderr_file, | |||
| stdin=subprocess.PIPE if pipe_stdin else None, | |||
| preexec_fn=preexec_fn if sys.platform != "win32" else None, | |||
| preexec_fn=(None if sys.platform == "win32" or use_posix_spawn else preexec_fn), | |||
There was a problem hiding this comment.
Specific-server child loses SIGINT blocking from preexec_fn
Low Severity
When use_posix_spawn is True, preexec_fn is set to None, which silently drops the signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT}) call that every other Ray child process receives. A SIGINT delivered to the process group (e.g., Ctrl+C) will now reach the specific-server as an unhandled KeyboardInterrupt, potentially causing an unclean mid-operation crash rather than an orderly proxier-driven teardown.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 26caf95cbba474957220a5564bd2ab13b642159e. Configure here.
There was a problem hiding this comment.
Fixed in 6d809e568b. The specific-server process now blocks SIGINT in its own entrypoint via _block_sigint_for_specific_server(), so the fork-safe posix_spawn path still avoids preexec_fn while preserving the previous child-side SIGINT mask behavior. Added test_ray_client_specific_server_blocks_sigint to cover this.
There was a problem hiding this comment.
Handled in the current branch. Specific-server startup now calls _block_sigint_for_specific_server() inside server.py, preserving the SIGINT mask even when the process is launched through the posix_spawn path without preexec_fn.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 26caf95cbb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| assert captured["process_type"] == ray_constants.PROCESS_TYPE_RAY_CLIENT_SERVER | ||
| assert "--mode=specific-server" in captured["command"] | ||
| assert captured["kwargs"]["fate_share"] is False | ||
| assert captured["kwargs"]["use_posix_spawn"] is True |
There was a problem hiding this comment.
Make spawn assertion platform-aware in specific-server test
start_ray_client_server() only enables use_posix_spawn on non-Windows (server_type == "specific-server" and sys.platform != "win32" in python/ray/_private/services.py), but this test unconditionally asserts it is True. On Windows runners, the code correctly passes False and this assertion fails, creating a deterministic cross-platform test regression rather than validating the intended behavior.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed by making the test explicitly platform-scoped: it now monkeypatches services.sys.platform to linux before asserting the POSIX-specific use_posix_spawn=True behavior. The duplicate Cursor thread is already marked resolved.
There was a problem hiding this comment.
Handled in the current branch. The test now patches services.sys.platform to linux before invoking start_ray_client_server(), so it only asserts the POSIX fork-safe path.
Signed-off-by: JerryLee <223425819+Jerry2003826@users.noreply.github.com>
26caf95 to
07fe244
Compare
|
The Cursor bot flagged that dropping preexec_fn removes SIGINT blocking for specific-server subprocesses. Looking at start_ray_process(), there is already an os.setpgrp() call that moves every child Ray process into its own process group. That means a user Ctrl+C never reaches specific-server subprocesses regardless of pthread_sigmask, so the signal blocking was redundant on this path. The fate_share concern is worth a closer look though. detect_fate_sharing_support() returns True on most Linux kernels, so set_kill_on_parent_death_linux() would have been active before this change. Specific servers are also cleaned up by the proxier directly, but it may be worth confirming they are still reliably terminated if the proxier dies unexpectedly. |
|
Addressed the fate-sharing cleanup concern in What changed:
On the SIGINT point: Validation:
I also retried the focused pytest, but this local checkout still cannot import Ray's native extension: Attempted command: PYTHONPATH=python python -m pytest python/ray/tests/test_debug_tools.py::test_start_ray_client_specific_server_uses_fork_safe_spawn python/ray/tests/test_debug_tools.py::test_ray_client_specific_server_parent_liveness python/ray/tests/test_debug_tools.py::test_ray_client_specific_server_parent_liveness_missing -q |
80d7465 to
4396f99
Compare
|
Follow-up pushed in The parent monitor now first verifies that the specific server is still directly parented by the recorded proxier PID via I also added a focused reparented-parent unit test and kept the previous missing-PID test. Validation:
DCO is now passing. The focused pytest remains blocked in this local checkout by the missing native extension: |
|
I checked the failing The failure appears to be unrelated to this PR: the doctest timed out 3/3 attempts while downloading CIFAR10 (
Could someone retry the failed microcheck job? |
4396f99 to
6d809e5
Compare
|
Follow-up pushed in What changed:
Validation:
Focused pytest is still blocked in this local checkout by the missing native extension: |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 6d809e5. Configure here.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6d809e568b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| calls.append((how, signals)) | ||
|
|
||
| with monkeypatch.context() as m: | ||
| m.setattr(ray_client_server.signal, "pthread_sigmask", fake_pthread_sigmask) |
There was a problem hiding this comment.
Make SIGINT-mask test platform-aware
When this test runs on Windows or any Python build without signal.pthread_sigmask, monkeypatch.setattr(..., raising=True) fails before the helper can exercise its hasattr guard. Since _block_sigint_for_specific_server() is intentionally a no-op on those platforms, this creates a deterministic cross-platform test failure rather than validating the intended behavior; skip/branch the test or patch with raising=False and assert the no-op case.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 52992f3e06. _block_sigint_for_specific_server() now requires both pthread_sigmask and SIG_BLOCK; the tests patch POSIX signal support with raising=False and add no-op coverage when SIG_BLOCK is unavailable.
There was a problem hiding this comment.
Handled in the current branch. The SIGINT-mask test now patches signal.pthread_sigmask and signal.SIG_BLOCK with raising=False, and there is a separate no-op test for runtimes without the POSIX signal-mask support.
6d809e5 to
52992f3
Compare
|
Updated in Addressed the latest automated feedback:
Validation:
Targeted pytest remains blocked in this local checkout because Ray native extension
|
Yicheng-Lu-llll
left a comment
There was a problem hiding this comment.
Thanks for the contribution! using spawn makes sense to me. I left one comments for the first round reviewing.
| @@ -851,6 +852,36 @@ def ray_connect_handler(job_config: JobConfig = None, **ray_init_kwargs): | |||
| return ray_connect_handler | |||
|
|
|||
|
|
|||
| def _parent_process_is_alive(parent_pid: Optional[int]) -> bool: | |||
There was a problem hiding this comment.
Repeating the lookup might not be ideal. You could use a pipe, see here:
ray/python/ray/_private/ray_process_reaper.py
Lines 7 to 17 in 5d2c4e7
There was a problem hiding this comment.
Implemented the pipe-based parent monitor in the current branch.
Specific-server startup now passes --monitor-parent-pipe and pipe_stdin=True when using the fork-safe posix_spawn path. The server starts a daemon thread that reads stdin and marks the parent as exited on EOF, matching the ray_process_reaper.py pattern instead of polling getppid() / kill(pid, 0).
Coverage added/updated:
- specific-server startup asserts
--monitor-parent-pipe,pipe_stdin=True,use_posix_spawn=True, andfate_share=False - parent-pipe liveness checks use a
threading.Event - the monitor helper sets the event on stdin EOF
- startup exits before serving when the monitored parent pipe is already closed
Validated:
python -m compileall -q python/ray/_private/services.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff check python/ray/_private/services.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff format --check python/ray/_private/services.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pygit diff --check
I also tried PYTHONPATH=python python -m pytest python/ray/tests/test_debug_tools.py -k "specific_server" -q; this local checkout cannot import ray._raylet, so the pytest run stops in python/ray/tests/conftest.py before collecting the target tests.
52992f3 to
caa1088
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: caa1088224
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| stderr=stderr_file, | ||
| stdin=subprocess.PIPE if pipe_stdin else None, | ||
| preexec_fn=preexec_fn if sys.platform != "win32" else None, | ||
| preexec_fn=(None if sys.platform == "win32" or use_posix_spawn else preexec_fn), |
There was a problem hiding this comment.
Block SIGINT before spawning the specific server
For POSIX specific-server launches, this drops the preexec_fn that used to block SIGINT before exec, while the replacement mask is only applied after setup_worker execs and server.main() reaches _block_sigint_for_specific_server(). If a user sends Ctrl-C while the runtime-env setup/import path is still running, the child can receive SIGINT and exit before installing its own mask, regressing Ray's intended isolation of subprocesses from driver interrupts. Consider having the spawning thread temporarily block SIGINT around the posix_spawn call so the child inherits the mask without using fork/preexec.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Handled in the current branch. start_ray_process() now temporarily blocks SIGINT in the spawning thread around the posix_spawn/ConsolePopen call and restores the previous mask afterward. The child inherits the blocked mask before setup_worker starts, without reintroducing preexec_fn.
Added coverage in test_start_ray_process_posix_spawn_blocks_sigint_for_child.
Validated:
python -m compileall -q python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff check python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff format --check python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pygit diff --check
Targeted pytest still cannot collect locally because this checkout lacks the compiled ray._raylet extension.
|
|
||
| with monkeypatch.context() as m: | ||
| m.setattr(ray_client_server.sys, "stdin", FakeStdin()) | ||
| parent_pipe_closed = ray_client_server._start_parent_pipe_monitor(True) |
There was a problem hiding this comment.
Keep fake stdin patched until the monitor thread reads it
When the daemon monitor thread is scheduled only after this monkeypatch.context() exits, _start_parent_pipe_monitor() resolves sys.stdin in the thread and can read the restored real pytest/direct-run stdin instead of FakeStdin; with -s or direct invocation that read can block, making parent_pipe_closed.wait(timeout=1) fail. Keep the wait inside the monkeypatch context or capture the stdin object before starting the thread.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Handled in the current branch. Both parent-pipe monitor helpers now capture the stdin object before starting the daemon thread, so the test cannot race with monkeypatch.context() restoration.
The test now joins the monitor thread while the fake stdin is still patched.
Validated with compileall, Ruff check, Ruff format check, and git diff --check. Targeted pytest still cannot collect locally because this checkout lacks ray._raylet.
| process_fate_share = False if use_posix_spawn else fate_share | ||
| if use_posix_spawn: | ||
| command.append("--monitor-parent-pipe") |
There was a problem hiding this comment.
Preserve parent-death cleanup during server bootstrap
When a POSIX specific server is launched, this disables kernel fate sharing immediately, but the replacement stdin monitor is only started later after setup_worker has finished and ray.util.client.server.main() begins. If the proxier process dies during that bootstrap/import/runtime-env command path, the child no longer gets killed by fate sharing and has not yet started monitoring stdin EOF, so it can continue running orphaned until it eventually reaches server startup. Start the parent-pipe monitor in the setup-worker phase or keep a startup-time fate-sharing mechanism until the monitor is active.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Handled in the current branch. The parent-pipe monitor now starts in setup_worker.py before RuntimeEnvContext.deserialize() / exec_worker(), so parent death during runtime-env/bootstrap exits the child early. Because exec_worker() replaces the process with os.execvp, setup_worker also forwards --monitor-parent-pipe to the final Ray Client server, which starts its own post-exec stdin EOF monitor.
This preserves parent-death cleanup both before and after the exec boundary while keeping the fork-safe spawn path.
Validated with compileall, Ruff check, Ruff format check, and git diff --check. Targeted pytest still cannot collect locally because this checkout lacks ray._raylet.
caa1088 to
e90156c
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e90156cff0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if __name__ == "__main__": | ||
| setup_logger(LOGGER_LEVEL, LOGGER_FORMAT) | ||
| args, remaining_args = parser.parse_known_args() | ||
| _start_parent_pipe_monitor(args.monitor_parent_pipe) |
There was a problem hiding this comment.
Keep parent monitor alive through runtime-env exec
When --monitor-parent-pipe is used with a non-empty runtime-env command_prefix, this monitor thread only exists until runtime_env_context.exec_worker(...) replaces setup_worker via os.execvp("bash", ...) in python/ray/_private/runtime_env/context.py; the daemon thread is then gone while the shell runs the prefix and before ray.util.client.server starts its own monitor. In that interval, if the proxier dies, stdin EOF is not consumed and fate_share has already been disabled for the specific server, so the child can still be orphaned during bootstrap. The monitor needs to survive the exec path or the parent-death mechanism needs to remain active until the final server monitor is running.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Handled in the current branch. The parent-pipe monitor is no longer a daemon thread inside setup_worker; _start_parent_pipe_monitor() now starts a small subprocess that inherits the same stdin pipe and targets the setup_worker PID. That process survives RuntimeEnvContext.exec_worker() / os.execvp(...), so it continues covering the runtime-env command-prefix window and the final Ray Client server process. It also exits when the target PID is gone to avoid leaking after normal server shutdown.
The final server no longer starts a second stdin reader, so there is only one monitor consuming the inherited pipe.
Validated:
python -m compileall -q python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff check python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pyruff format --check python/ray/_private/services.py python/ray/_private/workers/setup_worker.py python/ray/util/client/server/server.py python/ray/tests/test_debug_tools.pygit diff --check
I also retried the targeted pytest command, but this local checkout still cannot collect Ray tests because it lacks the compiled ray._raylet extension.
| with monkeypatch.context() as m: | ||
| m.setattr(services.sys, "platform", "linux") | ||
| m.setattr(services.os, "POSIX_SPAWN_CLOSEFROM", object(), raising=False) | ||
| m.setattr(services.signal, "pthread_sigmask", fake_pthread_sigmask) |
There was a problem hiding this comment.
Make pthread_sigmask patch portable
The current test still patches services.signal.pthread_sigmask with the default raising=True; on Windows/Python builds where signal.pthread_sigmask is absent, this fails before the test reaches the code under test even though the test monkeypatches services.sys.platform to linux. I checked the current test after the earlier platform-awareness fixes, and this remaining patch should also use raising=False or be skipped/branched for runtimes without POSIX signal-mask support.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Handled in the current branch. The pthread_sigmask monkeypatch now uses raising=False, matching the existing platform guard in the code under test and keeping the test portable on runtimes where signal.pthread_sigmask is not present.
Validated with compileall, Ruff check, Ruff format check, and git diff --check. Targeted pytest still cannot collect locally because this checkout lacks ray._raylet.
Signed-off-by: JerryLee <223425819+Jerry2003826@users.noreply.github.com>
e90156c to
6915826
Compare


Fixes #63202
Summary
use_posix_spawnpath tostart_ray_process()for POSIX subprocesses that do not require kernel fate sharing.specific-serversubprocesses, which are spawned by the gRPC proxier while other gRPC poller threads may be active.specific-serverlaunches; the proxy server path and other Ray processes keep their existing startup behavior.Root Cause
Ray Client's proxier is a multi-threaded gRPC server. Starting a per-client specific server currently goes through
subprocess.Popen(..., preexec_fn=...), which forces a fork+preexec path on POSIX. When other proxier threads are inside gRPC at the same time, gRPC can skip fork handlers and the child can inherit an unsafe poller state before it opens its own channel.Specific servers already self-terminate after being idle and are also cleaned up by the proxier, so this path can trade Linux kernel fate sharing for CPython's
posix_spawn-eligible subprocess path.Validation
python -m compileall -q python/ray/_private/services.py python/ray/tests/test_debug_tools.pyruff check python/ray/_private/services.py python/ray/tests/test_debug_tools.pyI could not run the focused pytest locally in this checkout because Ray's native
_rayletextension is not built:Attempted command: