
Commit 3b99350

iiiokojiadbi and claude committed
fix: disable PR_SET_PDEATHSIG (kernel binds it to parent thread, not process)
PR_SET_PDEATHSIG is bound to the *thread* that forked the child, not to the parent process (man 2 prctl: "the 'parent' in this case is considered to be the thread that created this process").

rkllama_server runs Flask with threaded=True, so Process.start() for a worker is executed from a short-lived request-handler thread. As soon as the request finishes and its thread exits, the kernel delivers SIGTERM to the worker, the inherited shutdown handler cascades into stop_all() / sys.exit(0), and the worker dies after serving a single request. The next /api/embed hits the dying worker, waits the 30s stop_worker timeout, and returns 500.

Turn _set_parent_death_signal() into a documented no-op. Orphan-worker protection continues to work via _kill_orphaned_workers() at startup.

Fixes #117.

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent d8da3b2 commit 3b99350

1 file changed

Lines changed: 5 additions & 10 deletions


src/rkllama/api/worker.py

@@ -31,17 +31,12 @@
 
 
 def _set_parent_death_signal():
-    """On Linux, ask the kernel to SIGTERM us if our parent dies.
-
-    Safe no-op on other platforms.
+    """No-op: PR_SET_PDEATHSIG is bound to the forking *thread*, not the
+    process (see ``man 2 prctl``). Flask runs with ``threaded=True``, so
+    enabling it kills the worker as soon as its request-thread exits
+    (issue #117). Orphan cleanup is handled by ``_kill_orphaned_workers``.
     """
-    if sys.platform != "linux":
-        return
-    try:
-        libc = ctypes.CDLL("libc.so.6", use_errno=True)
-        libc.prctl(PR_SET_PDEATHSIG, signal.SIGTERM, 0, 0, 0)
-    except Exception as exc:
-        logger.warning("Could not set parent-death signal: %s", exc)
+    return
 
 
 def _kill_orphaned_workers():
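
The thread-binding behaviour described in the commit message can be reproduced outside rkllama with a small Linux-only sketch. It is not code from this repository: the function names `spawn_child_from_thread` and `child_killed_by_thread_exit` are hypothetical, and it uses a plain `os.fork()` in place of `Process.start()`, assuming glibc is available as `libc.so.6`. A short-lived thread forks a child, the child arms `PR_SET_PDEATHSIG`, and the thread then exits while the parent process keeps running.

```python
import ctypes
import os
import signal
import sys
import threading
import time

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>


def spawn_child_from_thread(result: dict) -> None:
    """Fork a child that arms PR_SET_PDEATHSIG, then let this thread exit."""
    # Load libc before forking so the child does not need to dlopen anything.
    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: SIGTERM keeps its default disposition (terminate).
        try:
            os.close(ready_r)
            libc.prctl(PR_SET_PDEATHSIG, int(signal.SIGTERM), 0, 0, 0)
            os.write(ready_w, b"1")  # tell the forking thread we are armed
            time.sleep(10)           # outlives the thread unless signalled
            os._exit(0)
        finally:
            os._exit(1)
    # Parent side, still inside the short-lived thread.
    os.close(ready_w)
    os.read(ready_r, 1)  # wait until the child has called prctl()
    os.close(ready_r)
    result["pid"] = pid
    # Returning ends this thread; the parent *process* stays alive.


def child_killed_by_thread_exit() -> bool:
    """Return True if the child died from SIGTERM after the thread exited."""
    if sys.platform != "linux":
        raise RuntimeError("PR_SET_PDEATHSIG is Linux-specific")
    result: dict = {}
    worker = threading.Thread(target=spawn_child_from_thread, args=(result,))
    worker.start()
    worker.join()
    _, status = os.waitpid(result["pid"], 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGTERM


if __name__ == "__main__":
    print("child killed by forking thread's exit:", child_killed_by_thread_exit())
```

The pipe only removes a race: without it the thread could exit before the child calls `prctl()`, in which case no signal would be sent. On a kernel that behaves as the man page describes, the child should be terminated by SIGTERM even though the parent process never exits, which is the same situation a Flask request-handler thread creates for a worker.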
