Bug Description
v0.24.1 introduced prctl(PR_SET_PDEATHSIG, SIGKILL) in Chrome's pre_exec hook (PR #1137) to prevent orphaned Chrome processes when the daemon is SIGKILL'd. However, this causes Chrome to be killed after ~7 seconds of idle time on every page, every site.
Root Cause
PR_SET_PDEATHSIG tracks the thread that called fork(), not the process. This is documented in the prctl(2) man page:
"the 'parent' in this case is considered to be the thread that created this process"
The agent-browser daemon uses tokio's multi-threaded runtime. The worker thread that spawns Chrome via Command::spawn() (which calls fork()) gets recycled by tokio after a few seconds of idle time. When the thread exits, the kernel sends SIGKILL to Chrome — even though the daemon process is still alive.
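The thread-scoped behaviour can be reproduced outside the daemon. The following is a minimal Linux-only sketch, not the daemon's actual code: `sleep 30` stands in for Chrome, a plain `std::thread` stands in for a recycled tokio worker, and the names `set_pdeathsig`, `spawn_with_pdeathsig` and `signal_after_spawner_thread_exits` are hypothetical. The `prctl` binding is declared by hand to avoid a crate dependency (the `libc` crate exposes the same function).

```rust
use std::io;
use std::os::raw::c_ulong;
use std::os::unix::process::{CommandExt, ExitStatusExt};
use std::process::{Child, Command};
use std::thread;

// Hand-written binding so the sketch has no crate dependencies.
unsafe extern "C" {
    fn prctl(option: i32, ...) -> i32;
}

const PR_SET_PDEATHSIG: i32 = 1; // from <linux/prctl.h>
const SIGKILL: c_ulong = 9;

/// Runs in the forked child just before exec.
fn set_pdeathsig() -> io::Result<()> {
    if unsafe { prctl(PR_SET_PDEATHSIG, SIGKILL) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Spawns `sleep 30` (a stand-in for Chrome) with PR_SET_PDEATHSIG armed.
fn spawn_with_pdeathsig() -> io::Result<Child> {
    let mut cmd = Command::new("sleep");
    cmd.arg("30");
    // SAFETY: the hook performs a single prctl syscall and nothing else.
    unsafe { cmd.pre_exec(set_pdeathsig) };
    cmd.spawn()
}

/// Spawns the child from a short-lived thread, lets that thread exit while
/// the process itself stays alive, and reports the signal that ended the child.
fn signal_after_spawner_thread_exits() -> io::Result<Option<i32>> {
    let mut child = thread::spawn(spawn_with_pdeathsig)
        .join()
        .expect("spawner thread panicked")?;
    Ok(child.wait()?.signal())
}
```

On Linux, `signal_after_spawner_thread_exits()` is expected to report signal 9 almost immediately rather than after 30 seconds, even though the calling process never exits. That is the same failure mode as described above, just with the thread exit made explicit.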
Reproduction
# v0.24.0 — works fine
agent-browser open https://example.com
sleep 15
agent-browser get url # → https://example.com/ ✅
# v0.24.1 — Chrome dies
agent-browser open https://example.com
sleep 10
agent-browser get url # → about:blank ❌ (Chrome was killed and auto-relaunched)
Tested with:
- Zero env vars, zero custom args, pure v0.24.1 binary
- Happens on every site (example.com, wikipedia.org, etc.)
- Happens in both headless and headed mode
- Does NOT happen on v0.24.0 or v0.23.0
Impact
- All pages navigate to about:blank after ~7-9 seconds
- Download workflows are completely broken (first download works, second fails because Chrome died between actions)
- Live preview streaming shows resolution changes (Chrome auto-relaunches with default viewport)
- Any automation that takes >7 seconds between commands fails
Proposed Fix
Replace PR_SET_PDEATHSIG with a sentinel process + keepalive pipe:
- Before spawning Chrome, create a pipe
- After Chrome starts, fork a tiny sentinel process that:
  - Joins Chrome's process group (setpgid)
  - Blocks on reading the pipe's read end
- Daemon keeps the pipe's write end open (process-scoped fd, shared by all threads)
- When daemon dies (ANY reason including SIGKILL):
  - Kernel closes all daemon fds → pipe breaks
  - Sentinel reads EOF → kills Chrome process group via kill(-pgid, SIGKILL)
This correctly handles:
- ✅ Tokio worker thread recycling (pipe fd is process-scoped, not thread-scoped)
- ✅ Daemon SIGKILL'd (kernel closes pipe → sentinel kills Chrome)
- ✅ Daemon graceful exit (process group kill in Drop + pipe close)
- ✅ Works on Linux (no macOS equivalent needed — process group kill handles macOS)
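The sentinel steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch that was validated: `sleep 30` stands in for Chrome, the names `Keepalive`, `spawn_browser_stand_in`, `spawn_sentinel` and `terminating_signal` are hypothetical, the libc functions are bound by hand instead of using the `libc` crate, and the `O_CLOEXEC` constant is the Linux value for x86-64 and aarch64.

```rust
use std::io;
use std::os::unix::process::{CommandExt, ExitStatusExt};
use std::process::{Child, Command};

// Hand-written bindings keep the sketch dependency-free.
unsafe extern "C" {
    fn pipe2(fds: *mut i32, flags: i32) -> i32;
    fn fork() -> i32;
    fn setpgid(pid: i32, pgid: i32) -> i32;
    fn read(fd: i32, buf: *mut u8, count: usize) -> isize;
    fn close(fd: i32) -> i32;
    fn kill(pid: i32, sig: i32) -> i32;
    fn _exit(status: i32) -> !;
}

const O_CLOEXEC: i32 = 0o2000000; // assumption: Linux on x86-64 / aarch64
const SIGKILL: i32 = 9;

/// Owns the write end of the keepalive pipe. The fd belongs to the whole
/// process, so no single thread exiting can close it.
struct Keepalive {
    write_fd: i32,
}

impl Drop for Keepalive {
    fn drop(&mut self) {
        unsafe { close(self.write_fd) };
    }
}

/// Stand-in for launching Chrome: `sleep 30` as leader of a new process group.
fn spawn_browser_stand_in() -> io::Result<Child> {
    Command::new("sleep").arg("30").process_group(0).spawn()
}

/// Forks a sentinel that kills process group `pgid` once every copy of the
/// pipe's write end has been closed.
fn spawn_sentinel(pgid: i32) -> io::Result<Keepalive> {
    let mut fds = [0i32; 2];
    // O_CLOEXEC keeps the write end from leaking into exec'd children.
    if unsafe { pipe2(fds.as_mut_ptr(), O_CLOEXEC) } != 0 {
        return Err(io::Error::last_os_error());
    }
    let (read_fd, write_fd) = (fds[0], fds[1]);
    match unsafe { fork() } {
        -1 => {
            let err = io::Error::last_os_error();
            unsafe {
                close(read_fd);
                close(write_fd);
            }
            Err(err)
        }
        0 => unsafe {
            // Sentinel process: only async-signal-safe calls from here on.
            close(write_fd); // drop our own copy, or EOF would never arrive
            setpgid(0, pgid); // join the browser's process group
            let mut byte = 0u8;
            // read() returns 0 (EOF) only once the daemon's write end is gone;
            // any other result (data, EINTR) simply retries.
            while read(read_fd, &mut byte, 1) != 0 {}
            kill(-pgid, SIGKILL);
            _exit(0)
        },
        _sentinel_pid => {
            unsafe { close(read_fd) };
            Ok(Keepalive { write_fd })
        }
    }
}

/// Waits for the child and reports the signal that ended it, if any.
fn terminating_signal(child: &mut Child) -> io::Result<Option<i32>> {
    Ok(child.wait()?.signal())
}
```

Dropping the `Keepalive` closes the write end just as the kernel would when the daemon dies, so calling `spawn_sentinel(browser.id() as i32)`, dropping the result, and then calling `terminating_signal` on the browser should report signal 9. Two details the sketch leaves out and a real implementation would likely need: the sentinel is an unreaped child of the daemon, and with several browsers per daemon each newly forked sentinel inherits the write ends of earlier pipes and would have to close them.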
Validation
Built v0.24.1 with the sentinel fix:
- Page stays for 15+ seconds idle ✅
- 5 downloads with 10s delays work ✅
- SIGKILL daemon → Chrome + all helpers + sentinel = DEAD, zero orphans ✅
Alternative Approaches Considered
| Approach | Why not |
|---|---|
| pidfd_open (Linux 5.3+) | Not available on older kernels |
| Dedicated std::thread for spawn | Workaround, doesn't fix the fundamental issue |
| PID polling (getppid() == 1) | Polling delay, not instant |
| Remove PR_SET_PDEATHSIG entirely | Loses SIGKILL orphan prevention |
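For comparison, the "Dedicated std::thread for spawn" row could look roughly like the sketch below: every spawn is funnelled through one long-lived thread, so the thread that `PR_SET_PDEATHSIG` tracks only exits when the spawner handle is dropped or the process dies. This is an assumption about how that workaround might be shaped, not code from the project; `Spawner`, `SpawnRequest` and `set_pdeathsig` are hypothetical names and the `prctl` binding is written by hand.

```rust
use std::io;
use std::os::raw::c_ulong;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command};
use std::sync::mpsc;
use std::thread;

// Hand-written binding so the sketch has no crate dependencies.
unsafe extern "C" {
    fn prctl(option: i32, ...) -> i32;
}

const PR_SET_PDEATHSIG: i32 = 1; // from <linux/prctl.h>
const SIGKILL: c_ulong = 9;

/// Runs in the forked child just before exec.
fn set_pdeathsig() -> io::Result<()> {
    if unsafe { prctl(PR_SET_PDEATHSIG, SIGKILL) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

type SpawnRequest = (Command, mpsc::Sender<io::Result<Child>>);

/// Handle to a thread that performs every spawn on behalf of its callers.
struct Spawner {
    requests: mpsc::Sender<SpawnRequest>,
}

impl Spawner {
    fn start() -> Spawner {
        let (requests, inbox) = mpsc::channel::<SpawnRequest>();
        thread::spawn(move || {
            // The loop, and with it the thread, ends only when every
            // `Sender` (that is, the `Spawner`) has been dropped.
            for (mut cmd, reply) in inbox {
                // SAFETY: the hook performs a single prctl syscall.
                unsafe { cmd.pre_exec(set_pdeathsig) };
                let _ = reply.send(cmd.spawn());
            }
        });
        Spawner { requests }
    }

    /// Hands the command to the spawner thread and waits for the result.
    fn spawn(&self, cmd: Command) -> io::Result<Child> {
        let gone = || io::Error::new(io::ErrorKind::Other, "spawner thread is gone");
        let (reply, result) = mpsc::channel();
        self.requests.send((cmd, reply)).map_err(|_| gone())?;
        result.recv().map_err(|_| gone())?
    }
}
```

A child spawned through `Spawner::spawn` should stay alive for as long as the `Spawner` does, regardless of which runtime thread requested it. The row's objection still applies: the death signal remains tied to one thread, so dropping the `Spawner` early would kill every child it launched.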
Related
- PR #1137 (introduced PR_SET_PDEATHSIG)