Skip to content

QVAC-18181 feat[api]: introduce request lifecycle primitives with signal-based cancel#1949

Merged
simon-iribarren merged 9 commits into
tetherto:mainfrom
simon-iribarren:feat/qvac-18181-request-lifecycle-primitives
May 12, 2026
Merged

QVAC-18181 feat[api]: introduce request lifecycle primitives with signal-based cancel#1949
simon-iribarren merged 9 commits into
tetherto:mainfrom
simon-iribarren:feat/qvac-18181-request-lifecycle-primitives

Conversation

@simon-iribarren
Copy link
Copy Markdown
Contributor

🎯 What problem does this PR solve?

The SDK had no first-class request lifecycle primitive, so cancellation and cleanup logic was fragmented across features.

  • Made targeted cancellation hard (only broad model-wide cancel existed)
  • Let stale cancellation bookkeeping drift from real addon state
  • Increased risk of resource leaks and inconsistent KV-cache accounting in long-running completion streams

📝 How does it solve it?

  • Introduces runtime primitives for request lifecycle management:
    • DisposableScope (LIFO deferred cleanup, async dispose, error aggregation)
    • RequestContext and RequestRegistry (begin/get/list/end/cancel/cancelAll)
    • Worker singleton registry + server request-id generator fallback
  • Wires llama.cpp completion stream through registry-managed request contexts using signal-based cancellation (AbortSignal)
  • Adds client-generated requestId on completion runs and propagates it through schemas/events
  • Extends cancel API with cancel({ requestId }); preserves cancel({ modelId }) broad-cancel semantics via registry routing
  • Removes legacy cancel-counter tracking from KV-cache state and updates tests/examples accordingly

🧪 How was it tested?

  • bun run lint
  • bun run typecheck
  • bun run build
  • bun run test:unit (668/668 passing)
  • Added unit tests for runtime primitives:
    • packages/sdk/test/unit/runtime/disposable-scope.test.ts
    • packages/sdk/test/unit/runtime/request-registry.test.ts

🔌 API Changes

const run = sdk.completion({ model: "...", prompt: "..." });

// requestId is available immediately on the run handle
const requestId = run.requestId;

// new targeted cancellation path
await sdk.cancel({ requestId });

// existing broad-cancel escape hatch remains
await sdk.cancel({ modelId: "my-model-id" });

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 11, 2026

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

…nal-based cancel

Introduce DisposableScope, RequestContext, and RequestRegistry to formalize long-running request lifecycle management and route cancellation through AbortSignal semantics. Add client-generated requestId support on completion runs and a cancel({ requestId }) API, while preserving model-wide cancellation via registry-backed cancel({ modelId }).

This removes legacy cancel-counter bookkeeping in the llama.cpp completion path so KV-cache accounting reflects actual signal-driven cancellation behavior.
… and handle pre-aborted signals

Three small follow-ups based on PR review:

1. server/bare/ops/cancel.ts now falls back to `model.addon.cancel()` when the registry has no match. Only the llama.cpp completion handler is wired through `registry.begin(...)` in 0.11.0; embeddings / transcription / translation / decoder / OCR / TTS handlers will follow in later milestones. Without this fallback, `cancel({ operation: "embeddings", modelId })` and the other non-migrated surfaces would become silent no-ops post-PR.
2. completion-stream.ts re-invokes `onAbort` synchronously when `signal.aborted` is already true at register time. `addEventListener("abort", ..., { once: true })` does not fire for an already-aborted signal, and the registry can hand the completion handler a born-aborted signal when `parentSignal` was already aborted at `begin(...)`.
3. Soften docstring claims about cancel-before-roundtrip in client/api/cancel.ts, examples/cancel-by-request-id.ts, schemas/completion-stream.ts, schemas/completion-event.ts, and server/bare/runtime/request-id.ts. The `requestId` is surfaced synchronously, but a `cancel({ requestId })` issued in the same tick may race the begin and is logged as a no-match.
Bare runtime does not expose AbortController as a global, so the
registry's `new AbortController()` call in `begin(...)` failed at runtime
on every completion (`ReferenceError: AbortController is not defined`).
Use the existing `bare-abort-controller` dep (already used by
download-manager and rag-operation-manager) and add matching `AbortSignal`
type imports to the two other new/touched files for consistency with the
rest of the Bare-server codebase.

This was missed because:
- Unit tests run in Bun where AbortController is global.
- TypeScript typechecks against lib.dom where AbortController is global.
- tests-qvac (the only CI surface that loads into actual Bare) is
  skipping on this PR, so the integration path that exercises
  `registry.begin(...)` was never run.
The request-lifecycle stack depends on Symbol.asyncDispose for every `await using ctx = registry.begin(...)` site. If the host runtime doesn't expose it (older Bare/Expo build, missing polyfill), the `[Symbol.asyncDispose]:` property key in disposable-scope.ts silently coerces to the string "undefined" and produces objects that look async-disposable but are not — handlers would leak registry entries forever.

Add an AsyncDisposeUnavailableError (code 53503) and a module-load guard in disposable-scope.ts that throws clearly at import time if the symbol is missing. Adds a tripwire unit test asserting `typeof Symbol.asyncDispose === "symbol"` so any regression in the Bun test runtime or future polyfill story fails the suite first.

Follow-up to Yury's pitch review (Bare/Expo runtime support).
@simon-iribarren simon-iribarren force-pushed the feat/qvac-18181-request-lifecycle-primitives branch from 1529d34 to a120c96 Compare May 11, 2026 19:42
Copy link
Copy Markdown
Contributor

@yuranich yuranich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid foundational work — primitives are well-shaped (LIFO disposal, error aggregation, idempotent dispose, parentSignal composition with explicit detachParent), and retiring the per-model cancel-counter for a single signal-driven source of truth is a clean win. Follow-up commits already cover the obvious Bare/Expo gotchas (AbortController import, Symbol.asyncDispose module-load guard, pre-aborted signal handling).

One blocking item: the M1 compat fallback in server/bare/ops/cancel.ts does void addon.cancel.call(addon) and cancelHandler doesn't await cancel(), so the cancel RPC for embeddings / transcription / translation / decoder / OCR / TTS now resolves before the addon has acknowledged — which is exactly the wire contract 513bb9e's commit message promised the fallback would preserve until M3b/c migrate those handlers. Two-line fix across the two files (see line comments).

Plus one design observation on the synchronous-requestId race (non-blocking, worth tracking for M2's typed cancel outcomes) and one listener-cleanup nit.

Test gaps worth filling on the way:

  • parentSignal detachParent removal (the listener-accumulation failure mode detachParent exists to prevent).
  • The documented same-tick cancel-before-begin race.

Comment thread packages/sdk/server/bare/ops/cancel.ts Outdated
Comment thread packages/sdk/server/rpc/handlers/cancelHandler.ts Outdated
Comment thread packages/sdk/server/bare/runtime/request-registry.ts
… listener + tests

Addresses Yury's PR tetherto#1949 review comments:

- `server/bare/ops/cancel.ts`: make `cancel()` async again and `await
  addon.cancel.call(addon)` in the M1 compat fallback. Dropping the
  await turned the cancel RPC for embeddings / transcription /
  translation / decoder / OCR / TTS into a fire-and-forget — the RPC
  resolved before the addon had acknowledged the cancel, breaking the
  wire contract 513bb9e promised the fallback would preserve until
  M3b/M3c migrate those handlers.

- `server/rpc/handlers/cancelHandler.ts`: pair-fix — `await cancel(...)`
  in both `case "inference"` and `case "embeddings"` so the await
  actually propagates to the RPC response. Function is async now and
  returns the response object directly.

- `server/bare/plugins/llamacpp-completion/ops/completion-stream.ts`:
  wrap the body in `try { ... } finally { signal.removeEventListener(...) }`
  so the abort listener is detached on every exit path (happy
  completion, thrown error, generator `return()`). `{ once: true }`
  already removes the listener if the signal fires; the finally is the
  cleanup hook for the signal-never-fired path — mirrors the registry's
  own `detachParent` discipline.

- `test/unit/runtime/request-registry.test.ts`: two new tests filling
  the gaps Yury called out:
  - `end() detaches parent listener` — verifies via add/remove counters
    that a long-lived `parentSignal` doesn't accumulate listeners across
    many begin/end cycles. Pins the contract `detachParent` exists to
    enforce.
  - `same-tick cancel-before-begin` — pins the M1 contract that a
    `cancel({ requestId })` issued before its `begin(...)` lands on the
    server returns 0 and does not retroactively abort the later
    `begin(...)`. Documents the Stop-button race that M2's typed cancel
    outcomes will close.

Lint, typecheck, and all 694 unit tests pass.
Comment thread packages/sdk/server/rpc/handlers/cancelHandler.ts
Comment thread packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts Outdated
@opaninakuffo
Copy link
Copy Markdown
Contributor

nice piece of work @simon-iribarren. RequestRegistry + DisposableScope is the right primitive to land before embeddings/transcribe/etc. get migrated, and retiring the per-model cancel counter for a real AbortSignal reads much cleaner. doc comments throughout do a great job explaining the why, especially the module-load guard on Symbol.asyncDispose and the registry-miss fallback in server/bare/ops/cancel.ts. wire compatibility (requestId optional + server-generated fallback) is thoughtful too.

my comments were mostly defensive, nothing blocking.

…itives

Resolves tetherto#1949 (comment)
and tetherto#1949 (comment).

Code:
- cancelHandler: add exhaustive `default` to the operation switch so any
  future `CancelRequest` op that isn't added here surfaces as
  `success: false` with a descriptive error instead of silently
  returning `success: true`. The `never` assignment makes the drift a
  compile-time failure too. No-plain-Error rule respected — branch
  returns a typed `CancelResponse`, no `throw`.
- completion-stream onAbort: replace `void addon.cancel.call(addon)`
  with `.catch(...)` so a rejection from the addon's cancel doesn't
  leak as an unhandledRejection. Listener stays best-effort
  fire-and-forget (event listeners can't await) but errors are now
  logged instead of swallowed.

Docs:
- `.cursor/rules/sdk/request-lifecycle-primitives.mdc` — auto-firing
  rule for `packages/sdk/server/bare/**`. Canonical handler shape,
  signal-leaf discipline, anti-patterns (no `if (signal.aborted)
  return` polling, no manual cancel cleanup, no AbortController
  passing, no plain Error throw), cancel API surface, primitives
  reference, error codes.
- `.cursor/rules/sdk/docs/request-lifecycle-system.mdc` — full topic
  doc. Design rationale (what this replaces, why three primitives and
  not a framework, the build-vs-reuse calculus), migration roadmap
  (M1/M2/M3a-d), FAQ (singleton vs factory, caller-provided ids,
  `await using` semantics, runtime guard rationale, addon cancel
  relationship, known cancel-before-begin race).

Both rules cross-link to `error-handling.mdc` and the existing
`kv-cache-system.mdc` so agents have the full picture from any entry
point.
Copy link
Copy Markdown
Contributor

@yuranich yuranich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments addressed in 709008e + e70c46e — the addon-cancel await is restored on both layers, the listener-cleanup try/finally is in, the same-tick race is tracked for M2 with a tripwire test on the M1 contract, and the parentSignal detach test is added. Thanks for the careful follow-ups.

@simon-iribarren simon-iribarren added the e2e-tested Test suite has run on this PR. Does not indicate tests pass/fail - see results in comments. label May 12, 2026
@simon-iribarren simon-iribarren added test-e2e-smoke Triggers smoke e2e test suite [Currently SDK-only] and removed e2e-tested Test suite has run on this PR. Does not indicate tests pass/fail - see results in comments. labels May 12, 2026
@github-actions
Copy link
Copy Markdown

QVAC E2E — ios — running

View run
Config: suite=smoke · filter=(none) · exclude=(none)

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 12, 2026

QVAC E2E — windows — ✅ all tests passed (91/91, 461s)

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts: reports

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 12, 2026

QVAC E2E — linux — ✅ all tests passed (91/91, 361s)

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts: reports

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 12, 2026

QVAC E2E — macos — ✅ all tests passed (91/91, 327s)

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts: reports

@github-actions
Copy link
Copy Markdown

QVAC E2E — android — running

View run
Config: suite=smoke · filter=(none) · exclude=(none)

@simon-iribarren
Copy link
Copy Markdown
Contributor Author

/review

@simon-iribarren simon-iribarren merged commit c5802fb into tetherto:main May 12, 2026
12 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test-e2e-smoke Triggers smoke e2e test suite [Currently SDK-only] tier1 verify

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants