Summary
handleSessionEnd cleans up the broker by calling loadBrokerSession(input.cwd), but broker.json is stored under a cwd-derived hash (resolveStateDir(cwd)). When the broker was spawned for a different cwd than the session's cwd, the lookup misses, endpoint is null, no broker/shutdown is sent, and the broker is orphaned — even when the session exits gracefully (/quit, SIGTERM).
This is distinct from #108 (which attributes orphans to the hook not firing on crash/kill). Here the hook fires normally (exit 0) but cannot locate the broker.
Why this happens in normal use
The broker cwd is decided by resolveCommandCwd (codex-companion.mjs:149):
options.cwd ? path.resolve(process.cwd(), options.cwd) : process.cwd()
→ flows to connect(cwd) → ensureBrokerSession(cwd) → broker --cwd and broker.json location.
A very common workflow breaks this:
- Session starts in the main tree (session cwd =
/repo).
- A review runs against a git worktree — either
cd worktree && node companion … or node companion … --cwd worktree → broker cwd = worktree.
/quit → SessionEnd input.cwd = /repo → loadBrokerSession(/repo) looks at the /repo hash, never finds the worktree broker.json → broker survives.
Reviewing a worktree's uncommitted working tree requires broker cwd = worktree, so for a main-tree session this mismatch is structural, not accidental.
Reproduction (verified)
# broker registered for cwd=A
ensureBrokerSession("/tmp/probe") # broker --cwd=/tmp/probe, alive
# SessionEnd with matching cwd -> broker is killed (OK)
echo '{"cwd":"/tmp/probe","session_id":"s1","hook_event_name":"SessionEnd"}' \
| node session-lifecycle-hook.mjs SessionEnd # exit 0, broker DEAD
# SessionEnd with mismatched cwd -> hook exits 0 but broker SURVIVES
echo '{"cwd":"/tmp","session_id":"s2","hook_event_name":"SessionEnd"}' \
| node session-lifecycle-hook.mjs SessionEnd # exit 0, broker ALIVE <-- bug
Observed in the wild: every live session had cwd in the main tree, while every orphan broker's --cwd pointed at a worktree — a perfect mismatch pattern.
Root-cause fix (preferred)
Decouple broker cleanup from cwd:
- A. Track brokers by session, not cwd-hash. Record the owning
session_id in broker.json (and/or a global broker index). handleSessionEnd resolves brokers by input.session_id and tears down all of them regardless of cwd. This fixes the graceful-exit mismatch directly.
- B. Parent-session liveness watchdog in the broker. The broker records the owning session PID and periodically checks
process.kill(pid, 0); if the owner is gone it self-terminates. This additionally covers SIGKILL / crash / OOM, where SessionEnd never fires at all.
A handles graceful exits; B handles abnormal exits. Together they remove the cwd dependency entirely.
Fallback (defense-in-depth, not a root fix)
An idle timeout (#108) as a last-resort net if both A and B fail. Useful but secondary — it lets brokers linger for the timeout window and masks the real ownership/lookup defect.
Related
Environment
- Plugin
openai-codex/codex v1.0.4
- Linux; reproduced directly against
session-lifecycle-hook.mjs / broker-lifecycle.mjs
Summary
handleSessionEndcleans up the broker by callingloadBrokerSession(input.cwd), butbroker.jsonis stored under a cwd-derived hash (resolveStateDir(cwd)). When the broker was spawned for a different cwd than the session's cwd, the lookup misses,endpointis null, nobroker/shutdownis sent, and the broker is orphaned — even when the session exits gracefully (/quit, SIGTERM).This is distinct from #108 (which attributes orphans to the hook not firing on crash/kill). Here the hook fires normally (exit 0) but cannot locate the broker.
Why this happens in normal use
The broker cwd is decided by
resolveCommandCwd(codex-companion.mjs:149):→ flows to
connect(cwd)→ensureBrokerSession(cwd)→ broker--cwdandbroker.jsonlocation.A very common workflow breaks this:
/repo).cd worktree && node companion …ornode companion … --cwd worktree→ broker cwd =worktree./quit→ SessionEndinput.cwd=/repo→loadBrokerSession(/repo)looks at the/repohash, never finds theworktreebroker.json → broker survives.Reviewing a worktree's uncommitted working tree requires broker cwd = worktree, so for a main-tree session this mismatch is structural, not accidental.
Reproduction (verified)
Observed in the wild: every live session had cwd in the main tree, while every orphan broker's
--cwdpointed at a worktree — a perfect mismatch pattern.Root-cause fix (preferred)
Decouple broker cleanup from cwd:
session_idinbroker.json(and/or a global broker index).handleSessionEndresolves brokers byinput.session_idand tears down all of them regardless of cwd. This fixes the graceful-exit mismatch directly.process.kill(pid, 0); if the owner is gone it self-terminates. This additionally covers SIGKILL / crash / OOM, where SessionEnd never fires at all.A handles graceful exits; B handles abnormal exits. Together they remove the cwd dependency entirely.
Fallback (defense-in-depth, not a root fix)
An idle timeout (#108) as a last-resort net if both A and B fail. Useful but secondary — it lets brokers linger for the timeout window and masks the real ownership/lookup defect.
Related
Environment
openai-codex/codexv1.0.4session-lifecycle-hook.mjs/broker-lifecycle.mjs