QVAC-18181 feat[api]: introduce request lifecycle primitives with signal-based cancel#1949
Conversation
Tier-based Approval Status |
…nal-based cancel
Introduce DisposableScope, RequestContext, and RequestRegistry to formalize long-running request lifecycle management and route cancellation through AbortSignal semantics. Add client-generated requestId support on completion runs and a cancel({ requestId }) API, while preserving model-wide cancellation via registry-backed cancel({ modelId }).
This removes legacy cancel-counter bookkeeping in the llama.cpp completion path so KV-cache accounting reflects actual signal-driven cancellation behavior.
… and handle pre-aborted signals
Three small follow-ups based on PR review:
1. server/bare/ops/cancel.ts now falls back to `model.addon.cancel()` when the registry has no match. Only the llama.cpp completion handler is wired through `registry.begin(...)` in 0.11.0; embeddings / transcription / translation / decoder / OCR / TTS handlers will follow in later milestones. Without this fallback, `cancel({ operation: "embeddings", modelId })` and the other non-migrated surfaces would become silent no-ops post-PR.
2. completion-stream.ts re-invokes `onAbort` synchronously when `signal.aborted` is already true at register time. `addEventListener("abort", ..., { once: true })` does not fire for an already-aborted signal, and the registry can hand the completion handler a born-aborted signal when `parentSignal` was already aborted at `begin(...)`.
3. Soften docstring claims about cancel-before-roundtrip in client/api/cancel.ts, examples/cancel-by-request-id.ts, schemas/completion-stream.ts, schemas/completion-event.ts, and server/bare/runtime/request-id.ts. The `requestId` is surfaced synchronously, but a `cancel({ requestId })` issued in the same tick may race the begin and is logged as a no-match.
Bare runtime does not expose AbortController as a global, so the registry's `new AbortController()` call in `begin(...)` failed at runtime on every completion (`ReferenceError: AbortController is not defined`). Use the existing `bare-abort-controller` dep (already used by download-manager and rag-operation-manager) and add matching `AbortSignal` type imports to the two other new/touched files for consistency with the rest of the Bare-server codebase. This was missed because: - Unit tests run in Bun where AbortController is global. - TypeScript typechecks against lib.dom where AbortController is global. - tests-qvac (the only CI surface that loads into actual Bare) is skipping on this PR, so the integration path that exercises `registry.begin(...)` was never run.
The request-lifecycle stack depends on Symbol.asyncDispose for every `await using ctx = registry.begin(...)` site. If the host runtime doesn't expose it (older Bare/Expo build, missing polyfill), the `[Symbol.asyncDispose]:` property key in disposable-scope.ts silently coerces to the string "undefined" and produces objects that look async-disposable but are not — handlers would leak registry entries forever. Add an AsyncDisposeUnavailableError (code 53503) and a module-load guard in disposable-scope.ts that throws clearly at import time if the symbol is missing. Adds a tripwire unit test asserting `typeof Symbol.asyncDispose === "symbol"` so any regression in the Bun test runtime or future polyfill story fails the suite first. Follow-up to Yury's pitch review (Bare/Expo runtime support).
1529d34 to
a120c96
Compare
yuranich
left a comment
There was a problem hiding this comment.
Solid foundational work — primitives are well-shaped (LIFO disposal, error aggregation, idempotent dispose, parentSignal composition with explicit detachParent), and retiring the per-model cancel-counter for a single signal-driven source of truth is a clean win. Follow-up commits already cover the obvious Bare/Expo gotchas (AbortController import, Symbol.asyncDispose module-load guard, pre-aborted signal handling).
One blocking item: the M1 compat fallback in server/bare/ops/cancel.ts does void addon.cancel.call(addon) and cancelHandler doesn't await cancel(), so the cancel RPC for embeddings / transcription / translation / decoder / OCR / TTS now resolves before the addon has acknowledged — which is exactly the wire contract 513bb9e's commit message promised the fallback would preserve until M3b/c migrate those handlers. Two-line fix across the two files (see line comments).
Plus one design observation on the synchronous-requestId race (non-blocking, worth tracking for M2's typed cancel outcomes) and one listener-cleanup nit.
Test gaps worth filling on the way:
parentSignaldetachParentremoval (the listener-accumulation failure modedetachParentexists to prevent).- The documented same-tick cancel-before-begin race.
… listener + tests Addresses Yury's PR tetherto#1949 review comments: - `server/bare/ops/cancel.ts`: make `cancel()` async again and `await addon.cancel.call(addon)` in the M1 compat fallback. Dropping the await turned the cancel RPC for embeddings / transcription / translation / decoder / OCR / TTS into a fire-and-forget — the RPC resolved before the addon had acknowledged the cancel, breaking the wire contract 513bb9e promised the fallback would preserve until M3b/M3c migrate those handlers. - `server/rpc/handlers/cancelHandler.ts`: pair-fix — `await cancel(...)` in both `case "inference"` and `case "embeddings"` so the await actually propagates to the RPC response. Function is async now and returns the response object directly. - `server/bare/plugins/llamacpp-completion/ops/completion-stream.ts`: wrap the body in `try { ... } finally { signal.removeEventListener(...) }` so the abort listener is detached on every exit path (happy completion, thrown error, generator `return()`). `{ once: true }` already removes the listener if the signal fires; the finally is the cleanup hook for the signal-never-fired path — mirrors the registry's own `detachParent` discipline. - `test/unit/runtime/request-registry.test.ts`: two new tests filling the gaps Yury called out: - `end() detaches parent listener` — verifies via add/remove counters that a long-lived `parentSignal` doesn't accumulate listeners across many begin/end cycles. Pins the contract `detachParent` exists to enforce. - `same-tick cancel-before-begin` — pins the M1 contract that a `cancel({ requestId })` issued before its `begin(...)` lands on the server returns 0 and does not retroactively abort the later `begin(...)`. Documents the Stop-button race that M2's typed cancel outcomes will close. Lint, typecheck, and all 694 unit tests pass.
|
nice piece of work @simon-iribarren. my comments were mostly defensive, nothing blocking. |
…itives Resolves tetherto#1949 (comment) and tetherto#1949 (comment). Code: - cancelHandler: add exhaustive `default` to the operation switch so any future `CancelRequest` op that isn't added here surfaces as `success: false` with a descriptive error instead of silently returning `success: true`. The `never` assignment makes the drift a compile-time failure too. No-plain-Error rule respected — branch returns a typed `CancelResponse`, no `throw`. - completion-stream onAbort: replace `void addon.cancel.call(addon)` with `.catch(...)` so a rejection from the addon's cancel doesn't leak as an unhandledRejection. Listener stays best-effort fire-and-forget (event listeners can't await) but errors are now logged instead of swallowed. Docs: - `.cursor/rules/sdk/request-lifecycle-primitives.mdc` — auto-firing rule for `packages/sdk/server/bare/**`. Canonical handler shape, signal-leaf discipline, anti-patterns (no `if (signal.aborted) return` polling, no manual cancel cleanup, no AbortController passing, no plain Error throw), cancel API surface, primitives reference, error codes. - `.cursor/rules/sdk/docs/request-lifecycle-system.mdc` — full topic doc. Design rationale (what this replaces, why three primitives and not a framework, the build-vs-reuse calculus), migration roadmap (M1/M2/M3a-d), FAQ (singleton vs factory, caller-provided ids, `await using` semantics, runtime guard rationale, addon cancel relationship, known cancel-before-begin race). Both rules cross-link to `error-handling.mdc` and the existing `kv-cache-system.mdc` so agents have the full picture from any entry point.
QVAC E2E —
|
QVAC E2E —
|
|
/review |
🎯 What problem does this PR solve?
The SDK had no first-class request lifecycle primitive, so cancellation and cleanup logic was fragmented across features.
📝 How does it solve it?
DisposableScope(LIFO deferred cleanup, async dispose, error aggregation)RequestContextandRequestRegistry(begin/get/list/end/cancel/cancelAll)AbortSignal)requestIdon completion runs and propagates it through schemas/eventscancel({ requestId }); preservescancel({ modelId })broad-cancel semantics via registry routing🧪 How was it tested?
bun run lintbun run typecheckbun run buildbun run test:unit(668/668 passing)packages/sdk/test/unit/runtime/disposable-scope.test.tspackages/sdk/test/unit/runtime/request-registry.test.ts🔌 API Changes