[CLAUDE ROUTINE]: Reliability enhancement — propagate the 10s custom-hook timeout to the hook itself via `AbortSignal` so a slow hook stops doing work the moment we stop waiting for it

## Summary

When a custom hook exceeds the 10-second timeout in `handler.ts`, the `Promise.race` rejects, we log a warning, and we return `allow()`, which is the right user-facing behaviour. What we don't do is *tell the hook* that we've moved on, so the hook function keeps running in the background until the Node process exits or the work it kicked off (HTTP requests, child processes, intervals) completes on its own. For most hooks this is invisible; for hooks that hold network sockets, file descriptors, or LLM API calls, it means we're paying for work whose result we'll never read.

Threading an `AbortSignal` through the `PolicyContext` lets well-written hooks bail cleanly the instant we time out, and costs nothing for hooks that ignore the signal. This is a friendly, opt-in upgrade — existing hooks keep working unchanged.

## Where

`src/hooks/handler.ts:122-145`

```ts
const fn: PolicyFunction = async (ctx): Promise<PolicyResult> => {
  try {
    const result = await Promise.race([
      hook.fn(ctx),
      new Promise<PolicyResult>((_, reject) =>
        setTimeout(() => reject(new Error("timeout")), 10_000),
      ),
    ]);
    return result;
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    const isTimeout = msg === "timeout";
    hookLogWarn(`${prefix} hook "${hookName}" failed: ${msg}`);
    // ...telemetry...
    return { decision: "allow" };
  }
};
```

The `setTimeout` here only races against the hook's promise — it never signals to the hook that the result is now being thrown away.
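A minimal sketch of that behaviour, outside the real handler: `slowHook`, `raceWithTimeout`, and the millisecond values are hypothetical stand-ins chosen to keep the demo fast. It only illustrates that the losing branch of a `Promise.race` keeps executing after the race has settled.

```typescript
// Hypothetical stand-ins for the handler and a slow custom hook.
const events: string[] = [];

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function slowHook(): Promise<string> {
  await sleep(50); // pretend this is a slow HTTP call
  events.push("hook finished"); // still runs after the race was lost
  return "deny";
}

async function raceWithTimeout(timeoutMs: number): Promise<string> {
  try {
    return await Promise.race([
      slowHook(),
      new Promise<string>((_, reject) =>
        setTimeout(() => reject(new Error("timeout")), timeoutMs),
      ),
    ]);
  } catch {
    events.push("handler returned allow");
    return "allow";
  }
}

async function demo(): Promise<string[]> {
  await raceWithTimeout(10); // the timeout wins the race
  await sleep(100); // give the abandoned hook time to finish
  return events;
}
```

Calling `demo()` records the handler's `allow` first and the hook's completion afterwards, which is the wasted work this issue is about.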

## Why this matters

```mermaid
sequenceDiagram
    participant User as Claude Code
    participant Handler as handler.ts
    participant Hook as Custom hook fn
    participant API as Slow LLM / HTTP API

    User->>Handler: PreToolUse event
    Handler->>Hook: Invoke fn(ctx)
    Hook->>API: fetch(...) (no signal)
    Note over Handler: 10s timeout fires
    Handler->>Handler: Promise.race rejects with 'timeout'
    Handler-->>User: allow() (correct UX)
    Note over Hook,API: Hook's fetch keeps running ⏳
    API-->>Hook: Response 12s later
    Note over Hook: Result discarded but socket was open<br/>and tokens billed
```

Concrete cases where this hurts today:

- **Network-calling hooks.** A custom policy that calls an internal LLM safety API or a CI status endpoint keeps that HTTP request open after we've moved on. With keep-alive and no abort, the socket stays open and the upstream service still does the work. That wastes LLM tokens, eats into rate limits, and skews latency dashboards, because the upstream side records "12s" while we record "10s".
- **Hooks that spawned children.** A hook that ran `execFile()` to lint something can leave a Bun / Node child process alive after timeout, since nothing kills it on the parent side.
- **Hooks holding intervals.** A buggy hook that did `setInterval(...)` (rare, but real) will keep firing until process exit. For one-shot CLI invocations this is short-lived, but for long-running flows (Claude Agent SDK sessions, the relay daemon spawn lineage), it adds up.
- **Tests are harder to write deterministically.** Without a signal, test authors who want to verify "my hook unwinds on timeout" have no contract to assert against.

This is also a natural pairing with #153 (named `CUSTOM_HOOK_TIMEOUT_MS` constant + env override): once the timeout is configurable, the hook *really* needs a way to react to it.

## Proposed enhancement

1. Extend `PolicyContext` with an optional `signal?: AbortSignal`.
2. In `handler.ts`, build the timeout from an `AbortController` that a `setTimeout` aborts after `CUSTOM_HOOK_TIMEOUT_MS` (or from `AbortSignal.timeout(CUSTOM_HOOK_TIMEOUT_MS)` if its default `TimeoutError` reason is acceptable), and pass the signal into the hook's `ctx`.
3. Race the hook against the signal's abort. When the signal fires, a cooperating hook can short-circuit; we still return `allow()` regardless.
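One detail worth checking when choosing how to build the signal: the existing `catch` block classifies a timeout by comparing the error message to `"timeout"`, and `AbortSignal.timeout()` aborts with a `TimeoutError` `DOMException` rather than a custom error. The sketch below compares the two abort reasons; `abortReasonOf`, `isTimeoutMessage`, and `compareAbortReasons` are hypothetical helper names, and the delays are arbitrary.

```typescript
// Resolve with the reason a signal is (or already was) aborted with.
function abortReasonOf(signal: AbortSignal): Promise<unknown> {
  return new Promise((resolve) => {
    if (signal.aborted) {
      resolve(signal.reason);
      return;
    }
    signal.addEventListener("abort", () => resolve(signal.reason), { once: true });
  });
}

// Mirrors the message comparison in the existing catch block.
function isTimeoutMessage(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return msg === "timeout";
}

async function compareAbortReasons(): Promise<{
  manual: boolean;
  builtin: boolean;
  builtinName: string;
}> {
  const controller = new AbortController();
  // Both timers start together; the ordinary setTimeout also keeps the event
  // loop busy while the built-in timer runs (the runtime may not do so for it).
  setTimeout(() => controller.abort(new Error("timeout")), 20);
  const [manualReason, builtinReason] = await Promise.all([
    abortReasonOf(controller.signal),
    abortReasonOf(AbortSignal.timeout(5)),
  ]);
  return {
    manual: isTimeoutMessage(manualReason), // custom Error("timeout")
    builtin: isTimeoutMessage(builtinReason), // DOMException with its own message
    builtinName: (builtinReason as { name?: string }).name ?? "",
  };
}
```

If `AbortSignal.timeout()` is preferred, the timeout check would presumably need to look at `err.name === "TimeoutError"` instead of the message.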

Sketch:

```ts
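// Abort the signal ourselves so the reason stays an Error("timeout"),
// which the existing `msg === "timeout"` check still recognises.
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(new Error("timeout")), CUSTOM_HOOK_TIMEOUT_MS);
try {
  // Cooperative hooks can watch ctx.signal; hooks that ignore it are unaffected.
  const ctxWithSignal = { ...ctx, signal: controller.signal };
  const result = await Promise.race([
    hook.fn(ctxWithSignal),
    new Promise<PolicyResult>((_, reject) =>
      controller.signal.addEventListener(
        "abort",
        () => reject(controller.signal.reason),
        { once: true },
      ),
    ),
  ]);
  return result;
} finally {
  // Runs on success, timeout, and hook errors, so the timer never outlives the call.
  clearTimeout(timer);
}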
```

For hook authors, this means they can write:

```ts
customPolicies.add({
  name: "slow-policy",
  match: { events: ["PreToolUse"] },
  fn: async (ctx) => {
    const resp = await fetch(URL, { signal: ctx.signal }); // <-- coop cancellation
    // ...
  },
});
```

Hooks that ignore `ctx.signal` keep working exactly as today — no behaviour change.
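The same pattern extends beyond `fetch`. Node's `child_process.execFile` accepts a `signal` option, and timer-based hooks can clear their own intervals when the signal aborts. Below is a rough sketch of the latter; `pollUntil` is a hypothetical helper, not an existing API of this project.

```typescript
// Hypothetical helper for hook authors: call `check` every `intervalMs`
// until it returns true, and stop the interval as soon as `signal` aborts.
function pollUntil(
  check: () => boolean,
  intervalMs: number,
  signal?: AbortSignal,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (signal?.aborted) {
      reject(signal.reason);
      return;
    }
    function cleanup(): void {
      clearInterval(timer);
      signal?.removeEventListener("abort", onAbort);
    }
    function onAbort(): void {
      cleanup(); // no more ticks after the handler has given up
      reject(signal?.reason);
    }
    const timer = setInterval(() => {
      if (check()) {
        cleanup();
        resolve();
      }
    }, intervalMs);
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}
```

Inside a hook this would be called as `await pollUntil(someCheck, 500, ctx.signal)`, where `someCheck` is whatever condition the hook is waiting on. Because `signal` is optional, the helper also works when the handler does not provide one.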

## Acceptance criteria

- [ ] `PolicyContext` exposes an optional `signal?: AbortSignal` (typed in `src/hooks/policy-types.ts`).
- [ ] On timeout, the signal is aborted with an `Error` whose message is `"timeout"` before the handler returns.
- [ ] `clearTimeout` runs on the success path so we don't leak timers when the hook resolves quickly.
- [ ] User-facing decision is unchanged: timeout still produces `decision: "allow"` and a single `hookLogWarn`.
- [ ] New unit test in `__tests__/hooks/handler.test.ts` verifies (a) the signal fires on timeout, (b) `clearTimeout` is called on success and leaves no dangling timer (checked with fake timers or `process.getActiveResourcesInfo()`, since recent Node versions may not list timers in `process._getActiveHandles()`), (c) hooks that ignore the signal still get the same `allow()` outcome.
- [ ] Docs: short example in README under "Custom policies" showing `ctx.signal` use with `fetch`.
- [ ] CHANGELOG entry under `## Unreleased > Features`.

(Plays well with #153 — once the timeout is configurable, `ctx.signal` becomes the corresponding hook-side handle.)
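As a starting point for the unit test, the checks could look roughly like this. It is a sketch under assumptions: the types are simplified stand-ins for the real `PolicyContext` and `PolicyResult`, `runWithTimeout` is a hypothetical extraction of the wrapper with the timeout passed in so the test can run in milliseconds, and the timer-leak check (b) is left to the test framework's fake timers.

```typescript
// Hypothetical, simplified shapes; the real types live in the project.
type PolicyResult = { decision: "allow" | "deny" };
type PolicyContext = { signal?: AbortSignal };
type PolicyFunction = (ctx: PolicyContext) => Promise<PolicyResult>;

// Stand-in for the handler wrapper, parameterised by timeout for fast tests.
async function runWithTimeout(fn: PolicyFunction, timeoutMs: number): Promise<PolicyResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(new Error("timeout")), timeoutMs);
  try {
    return await Promise.race([
      fn({ signal: controller.signal }),
      new Promise<PolicyResult>((_, reject) =>
        controller.signal.addEventListener(
          "abort",
          () => reject(controller.signal.reason),
          { once: true },
        ),
      ),
    ]);
  } catch {
    return { decision: "allow" }; // fail open, as the handler does today
  } finally {
    clearTimeout(timer);
  }
}

// A hook result that never settles, to force the timeout path.
const never = new Promise<PolicyResult>(() => {});

async function checkTimeoutContract() {
  const seen: { reason?: unknown; fastSignal?: AbortSignal } = {};

  // (a) a cooperative hook observes the abort and its reason
  const cooperative = await runWithTimeout((ctx) => {
    ctx.signal?.addEventListener("abort", () => {
      seen.reason = ctx.signal?.reason;
    });
    return never;
  }, 10);

  // (c) a hook that ignores the signal still ends in allow
  const ignoring = await runWithTimeout(() => never, 10);

  // A fast hook keeps its own decision and its signal is never aborted.
  const fast = await runWithTimeout(async (ctx) => {
    seen.fastSignal = ctx.signal;
    return { decision: "deny" };
  }, 1_000);

  return {
    cooperative,
    ignoring,
    fast,
    reasonMessage: seen.reason instanceof Error ? seen.reason.message : "",
    fastAborted: seen.fastSignal?.aborted ?? true,
  };
}
```

Each field of the returned object maps onto one assertion in the eventual test file.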
