feat(worker): add CLAUDE_MEM_WORKER_AUTOSTART to opt out of hook worker lazy-spawn #2828

Open
surfingdoggo wants to merge 1 commit into thedotmack:main from surfingdoggo:feat/worker-autostart-flag

@surfingdoggo

Problem

Hooks lazy-spawn the worker daemon via ensureWorkerAliveOnce() whenever the worker port is dead. That's correct for worker-runtime users — but server-beta-only or externally-managed deployments have no way to stop hook activity from continually resurrecting the worker daemon.

Fix

Add CLAUDE_MEM_WORKER_AUTOSTART (default true). When false, ensureWorkerAliveOnce() returns false without spawning.

  • Default true preserves existing behavior exactly — no-op for current users.
  • Opting out (false) is for operators who drive the worker out-of-band and knowingly give up worker-only features (data viewer, corpus/skills, semantic injection — already unsupported on server-beta per docs/server-beta-parity-map.md).
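A minimal sketch of the short-circuit described above, not the code from this PR. It assumes settings are string-valued (the default is `'true'`), passes the settings in as an argument, and runs synchronously for brevity. `ensureWorkerRunning()` here is a hypothetical stand-in that only counts spawn attempts, and `spawnAttempts` is an illustration-only counter.

```typescript
// Illustrative model only: the real settings loader and spawn logic live in the repo.
type Settings = { CLAUDE_MEM_WORKER_AUTOSTART?: string };

let aliveCache: boolean | null = null;
let spawnAttempts = 0; // hypothetical counter, used only to observe spawns

// Stand-in for the real ensureWorkerRunning(): records the attempt and reports success.
function ensureWorkerRunning(): boolean {
  spawnAttempts += 1;
  return true;
}

// Test helper mirroring the one the PR adds: clears the per-process cache.
function resetAliveCache(): void {
  aliveCache = null;
}

function ensureWorkerAliveOnce(settings: Settings): boolean {
  // A cached answer wins, so the check runs at most once per process.
  if (aliveCache !== null) return aliveCache;

  // A missing key falls back to the default 'true' (existing behavior).
  const autostart = settings.CLAUDE_MEM_WORKER_AUTOSTART ?? 'true';
  if (autostart === 'false') {
    // Opt-out: report "not alive" without ever spawning.
    aliveCache = false;
    return false;
  }

  aliveCache = ensureWorkerRunning();
  return aliveCache;
}
```

With `{ CLAUDE_MEM_WORKER_AUTOSTART: 'false' }` the function returns `false` and `spawnAttempts` stays at 0; with the key absent or set to `'true'` the spawn path runs as before.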

Changes

  • Setting + opt-out short-circuit at the single ensureWorkerAliveOnce() chokepoint.
  • Unit tests for the opt-out / default / explicit-true cases, plus a resetAliveCache() test helper mirroring the existing clearPortCache().
  • Documented in the configuration reference.

Note on CI

main currently has unrelated pre-existing failures (PID-file / process-manager and logger-standards tests, plus a duplicate ModeManager import typecheck error). This change is isolated to the worker lazy-spawn path; the new test suite passes.

🤖 Generated with Claude Code

…er lazy-spawn

Hooks lazy-spawn the worker daemon via ensureWorkerAliveOnce() whenever the
worker port is dead. That's correct for worker-runtime users, but server-beta
or externally-managed deployments have no way to stop hook activity from
resurrecting the worker daemon.

Add CLAUDE_MEM_WORKER_AUTOSTART (default 'true'); when 'false',
ensureWorkerAliveOnce() returns false without spawning. Default preserves
existing behavior exactly — this is an opt-out for operators who don't use the
worker-only features (data viewer, corpus/skills, semantic injection — all
listed unsupported on server-beta in docs/server-beta-parity-map.md).

- Unit tests for the opt-out, default, and explicit-true cases (adds a
  resetAliveCache() test helper mirroring clearPortCache()).
- Documented in the configuration reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 8, 2026

Contributor

Greptile Summary

This PR adds CLAUDE_MEM_WORKER_AUTOSTART (default true) to let server-beta-only or externally-managed deployments opt out of the lazy-spawn behavior in ensureWorkerAliveOnce(), without changing anything for the default case.

  • worker-utils.ts: Short-circuits ensureWorkerAliveOnce() before calling ensureWorkerRunning() when the setting is 'false', and exports resetAliveCache() as a test helper.
  • SettingsDefaultsManager.ts: Adds the new key with a default of 'true' to the settings interface and defaults map.
  • worker-autostart-flag.test.ts: Tests the three cases (opt-out, default, explicit-true) against ensureWorkerAliveOnce() directly, but does not cover the executeWithWorkerFallback path where a false return triggers recordWorkerUnreachable().

Confidence Score: 3/5

The opt-out flag works as advertised for ensureWorkerAliveOnce in isolation, but the false return flows directly into recordWorkerUnreachable() in executeWithWorkerFallback, generating blocking error alerts after only a handful of hook calls on any deployment that uses the new flag.

The new short-circuit in ensureWorkerAliveOnce() returns false when AUTOSTART=false, but executeWithWorkerFallback treats any false return as a live worker failure: it calls recordWorkerUnreachable() unconditionally, and after FAIL_LOUD_DEFAULT_THRESHOLD (3) consecutive hook invocations the emitBlockingError path fires and exits the hook process with code 2. For a server-beta deployment that makes several hook calls per session, this happens almost immediately. The feature is usable today only on the default (true) path; the opt-out path reliably produces the very noise it was designed to suppress.

src/shared/worker-utils.ts — specifically the interaction between ensureWorkerAliveOnce and executeWithWorkerFallback when AUTOSTART=false.
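To make the escalation concrete, here is a rough in-memory model of the failure counter. It is a sketch under assumptions: the threshold value 3 comes from the review text, the outcome labels are invented for illustration, resetting the counter on success is assumed from the word "consecutive", and how the real code persists the counter across hook processes is not shown in this PR.

```typescript
// Value quoted in the review; everything else in this block is a hypothetical model.
const FAIL_LOUD_DEFAULT_THRESHOLD = 3;

type HookOutcome = 'request-sent' | 'fallback-continue' | 'blocking-error';

let consecutiveFailures = 0;

// Models recordWorkerUnreachable() plus the threshold check that follows it.
function recordWorkerUnreachable(): HookOutcome {
  consecutiveFailures += 1;
  return consecutiveFailures >= FAIL_LOUD_DEFAULT_THRESHOLD
    ? 'blocking-error' // stands in for emitBlockingError() and exit code 2
    : 'fallback-continue';
}

// `alive` is whatever ensureWorkerAliveOnce() returned for this hook.
function executeWithWorkerFallback(alive: boolean): HookOutcome {
  if (alive) {
    consecutiveFailures = 0; // assumption: a success resets the streak
    return 'request-sent';
  }
  // A `false` caused by AUTOSTART=false is indistinguishable from a dead worker here.
  return recordWorkerUnreachable();
}

// Three hooks in a row on an opted-out deployment.
const outcomes = [false, false, false].map((alive) => executeWithWorkerFallback(alive));
```

In this model the first two hooks fall back quietly and the third one hits the blocking-error path, which matches the sequence the review describes.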

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/shared/worker-utils.ts | Adds AUTOSTART short-circuit in ensureWorkerAliveOnce(), but the false return propagates into executeWithWorkerFallback, which calls recordWorkerUnreachable() unconditionally, producing false "worker unreachable" blocking errors for opt-out deployments. |
| src/shared/SettingsDefaultsManager.ts | Adds CLAUDE_MEM_WORKER_AUTOSTART with default 'true' to the settings interface and defaults map. Straightforward and correct. |
| docs/public/configuration.mdx | Documents the new CLAUDE_MEM_WORKER_AUTOSTART setting with its default and opt-out semantics. Clear and accurate. |
| tests/shared/worker-autostart-flag.test.ts | New test suite covers opt-out, default, and explicit-true cases for ensureWorkerAliveOnce(); does not test the interaction with executeWithWorkerFallback and the failure counter, which is where the real bug lives. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Hook invokes executeWithWorkerFallback] --> B[ensureWorkerAliveOnce]
    B --> C{aliveCache !== null?}
    C -- yes --> D[return cached value]
    C -- no --> E{AUTOSTART == 'false'?}
    E -- yes --> F[aliveCache = false\nreturn false]
    E -- no --> G[ensureWorkerRunning]
    G --> H[aliveCache = result\nreturn result]
    D --> I{alive?}
    F --> I
    H --> I
    I -- true --> J[Make HTTP request to worker]
    I -- false --> K[recordWorkerUnreachable ⚠️\nfires even when AUTOSTART=false]
    K --> L{consecutiveFailures\n>= threshold?}
    L -- yes --> M[emitBlockingError\nexit 2 💥]
    L -- no --> N[return WorkerFallback continue=true]

Comments Outside Diff (1)

  1. src/shared/worker-utils.ts, lines 444-448

    P1 False "worker unreachable" alarms when AUTOSTART=false

    When AUTOSTART=false, ensureWorkerAliveOnce() returns false, and executeWithWorkerFallback unconditionally calls recordWorkerUnreachable(). After FAIL_LOUD_DEFAULT_THRESHOLD (default 3) hook invocations — which for any active Claude session happens within seconds — emitBlockingError() fires with "claude-mem worker unreachable for N consecutive hooks." This exits the hook with code 2, producing an alarming error for users who explicitly opted out of the worker. The opt-out is supposed to be a silent no-op for server-beta deployments, but instead it triggers the exact loud failure path it was designed to avoid. The fix is to distinguish "disabled by config" from "unreachable": either make ensureWorkerAliveOnce return a tri-state (or throw a typed sentinel), or check the AUTOSTART flag in executeWithWorkerFallback before calling recordWorkerUnreachable().
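One way the suggested tri-state could look, as a sketch rather than a proposed patch. The `WorkerAvailability` type, the `checkWorker()` helper, and the outcome labels are all hypothetical names; only the threshold value and the idea of separating "disabled by config" from "unreachable" come from the comment above.

```typescript
// Hypothetical tri-state replacing the boolean returned by ensureWorkerAliveOnce().
type WorkerAvailability = 'alive' | 'unreachable' | 'disabled';
type HookOutcome = 'request-sent' | 'skipped' | 'fallback-continue' | 'blocking-error';

const FAIL_LOUD_DEFAULT_THRESHOLD = 3; // value quoted in the review
let consecutiveFailures = 0;

// `autostart` is the string setting; `spawnSucceeded` stands in for the real spawn result.
function checkWorker(autostart: string, spawnSucceeded: boolean): WorkerAvailability {
  if (autostart === 'false') return 'disabled';
  return spawnSucceeded ? 'alive' : 'unreachable';
}

function executeWithWorkerFallback(state: WorkerAvailability): HookOutcome {
  if (state === 'alive') {
    consecutiveFailures = 0; // assumption: a success resets the streak
    return 'request-sent';
  }
  if (state === 'disabled') {
    // Opted out by config: skip silently and leave the failure counter alone.
    return 'skipped';
  }
  // Only a genuinely unreachable worker counts toward the fail-loud threshold.
  consecutiveFailures += 1;
  return consecutiveFailures >= FAIL_LOUD_DEFAULT_THRESHOLD
    ? 'blocking-error'
    : 'fallback-continue';
}
```

The alternative mentioned in the comment, checking the flag inside `executeWithWorkerFallback` before calling `recordWorkerUnreachable()`, keeps the boolean return type but reads the setting in two places; the tri-state keeps the decision at the single chokepoint.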

Reviews (1): Last reviewed commit: "feat(worker): add CLAUDE_MEM_WORKER_AUTO..."
