Commit 0741eb5
QVAC-18733 feat[api]: add openai responses routes with in-memory store (#2030)
* QVAC-18733 feat[api]: add OpenAI Responses routes with in-memory store
Implement POST /v1/responses (blocking + SSE), GET/DELETE /v1/responses/{id},
GET /v1/responses/{id}/input_items, previous_response_id chaining, LRU+TTL
store, X-QVAC-Stub: responses-volatile header, and startup banner.
* fix: align Responses streaming with finalized response and add usage stats
- Approach (b): always include the assistant `message` item in `response.output[0]`,
even when tool calls are present, so the streamed item tree matches `response.completed`.
- Pre-allocate `msgItemId` and `fcItemIds` once and reuse them across SSE events and
the finalized `output[]`, fixing client-side accumulation by `item_id`.
- Use distinct `output_index` per tool call (1..n) and set `item_id` on
`response.function_call_arguments.delta`/`.done` to the function-call item id
(was the OpenAI `call_id`, causing collisions and wrong wiring).
- Populate `required_action.submit_tool_outputs.tool_calls` so OpenAI clients can
satisfy tool calls instead of hanging in `requires_action` with no payload.
- Drop the duplicate `previous_response_id` lookup in `handlePostResponses`.
- Drop `parallel_tool_calls` from the unsupported-params log: it is honored.
- Recognise `function_call_output` (-> `tool` role) and `function_call`
(-> synthesized assistant `<tool_call>` content) in
`openaiResponsesInputToHistory` and `historyPrefixFromStoredResponse` so chained
tool round-trips actually carry through `previous_response_id`.
- Use `crypto.randomUUID()` for `resp_`/`msg_`/`fc_`/input-item ids.
- Surface real `usage.output_tokens` from `result.stats.generatedTokens`
(Responses + chat.completions, blocking + streaming); fall back to word count
when stats are missing. `input_tokens` stays 0 with an inline note that the SDK
does not expose a prompt-token count today.
- Tighten `CompletionResult.stats` to a structured `CompletionRunStats` shape.
Tests: extend `responses.test.ts` and `translate.test.ts`; add
`responses-streaming.test.ts` driving the new exported `writeStreamingResponse` /
`writeBlockingResponse` helpers with a fake `CompletionResult` and `ServerResponse`.
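The id pre-allocation and `output_index` numbering described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `planOutputItems`, `argumentsDeltaEvent`, and the `ToolCall` / `PlannedItem` shapes are hypothetical names. Only the `msg_` / `fc_` prefixes, `randomUUID()`, the event type, and the 0 / 1..n index layout are taken from the message.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shapes for illustration; the real types may differ.
interface ToolCall {
  callId: string; // the OpenAI-style `call_id`
  name: string;
  arguments: string;
}

interface PlannedItem {
  outputIndex: number;
  itemId: string;
  type: "message" | "function_call";
  callId?: string;
}

// Allocate every item id once, before any SSE event is written, so the
// same ids can be reused in the stream and in the finalized `output[]`.
export function planOutputItems(toolCalls: ToolCall[]): PlannedItem[] {
  // The assistant message always occupies output[0], even with tool calls.
  const plan: PlannedItem[] = [
    { outputIndex: 0, itemId: `msg_${randomUUID()}`, type: "message" },
  ];
  // Each tool call gets its own output_index (1..n) and its own `fc_` id.
  toolCalls.forEach((call, i) => {
    plan.push({
      outputIndex: i + 1,
      itemId: `fc_${randomUUID()}`,
      type: "function_call",
      callId: call.callId,
    });
  });
  return plan;
}

// Build an arguments-delta event keyed by the function-call item id,
// not by the `call_id`.
export function argumentsDeltaEvent(item: PlannedItem, delta: string) {
  return {
    type: "response.function_call_arguments.delta",
    item_id: item.itemId,
    output_index: item.outputIndex,
    delta,
  };
}
```

Because the plan is computed once, a client that accumulates deltas by `item_id` sees the same ids it will later find in the completed response.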
* test[skiplog]: stabilize Responses chain e2e for tiny reasoning model
Pin temperature=0 + seed and bump max_output_tokens to 512 so Qwen3-600M
has room for both its <think> block and the actual answer. The test
exercises previous_response_id chain wiring; it should not depend on
sampling luck or the model's reasoning length.
* fix: walk previous_response_id chain so multi-turn keeps grandparent history
Each StoredResponse.inputItems only carries that turn's NEW input
(`normalizeResponsesInputItemsForStorage(body['input'])`), so a chain of
depth >= 3 silently lost the grandparent turn:
    resp_1               input "A" -> output "X"   (stored: ["A"])
    resp_2 prev=resp_1   input "B"   history sent: [A, X, B]   (stored: ["B"])
    resp_3 prev=resp_2   input "C"   history sent: [B, Y, C]   -- A and X gone
historyPrefixFromStoredResponse now walks the chain via
responseObject.previous_response_id when given a resolver, prepending
earlier turns oldest-first. Cap depth at 32 to bound work and protect
against pathological cycles. Routes pass `(id) => store.get(id)` as the
resolver. Legacy single-step callers still work unchanged when the
resolver is omitted.
Tests:
- unit: depth-3 chain produces all six prefix entries in order; maxDepth
cap honored.
- e2e: resp_1 sets "code word is XYZZY", resp_2 acks, resp_3 asks for the
word and recovers it -- would silently fail before this fix.
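The chain walk described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual `historyPrefixFromStoredResponse`: the `StoredTurn` and `HistoryEntry` shapes and the `historyPrefix` name are hypothetical, and `maxDepth` is assumed here to count the total number of turns visited, including the starting one.

```typescript
// Hypothetical, simplified shapes; the real StoredResponse is richer.
interface HistoryEntry {
  role: "user" | "assistant";
  content: string;
}

interface StoredTurn {
  id: string;
  previousResponseId: string | null;
  input: HistoryEntry[]; // only this turn's NEW input
  output: HistoryEntry[]; // this turn's output
}

type Resolver = (id: string) => StoredTurn | undefined;

// Walk previous-response links from `start` back toward the root, then
// emit each turn's input and output oldest-first.
export function historyPrefix(
  start: StoredTurn,
  resolve?: Resolver,
  maxDepth = 32,
): HistoryEntry[] {
  const turns: StoredTurn[] = [start];
  let current = start;
  // Without a resolver the loop never runs (single-step behaviour).
  // The depth cap also bounds the work if the links form a cycle.
  while (resolve && current.previousResponseId !== null && turns.length < maxDepth) {
    const parent = resolve(current.previousResponseId);
    if (!parent) break; // parent evicted or never stored
    turns.push(parent);
    current = parent;
  }
  const prefix: HistoryEntry[] = [];
  // `turns` is newest-first, so iterate backwards for oldest-first output.
  for (let i = turns.length - 1; i >= 0; i--) {
    const turn = turns[i];
    if (turn) prefix.push(...turn.input, ...turn.output);
  }
  return prefix;
}
```

A route would pass something like `(id) => store.get(id)` as the resolver; omitting it reproduces the old single-step prefix.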
* fix: address Responses review nits (SSE sentinel, dup event, types, max_tokens warn, README)
Five low-severity items from PR #2030 review:
- Drop the `data: [DONE]` sentinel on `/v1/responses` SSE: spec ends on
`response.completed`. Adds an `EndSSEOptions { sentinel?: boolean }`
knob to `endSSE` so chat-completions keeps its existing sentinel and
Responses opts out via `endSSE(res, { sentinel: false })`. E2E flips
the assertion accordingly.
- Drop the duplicate `response.in_progress` event emitted back-to-back
with `response.created` (same payload, no state transition — strict
parsers can choke).
- Tighten `BuildResponseObjectParams.parallelToolCalls` from
`boolean | undefined` to `boolean` (the route already resolves a
default before calling), eliminating a dead `?? true` fallback.
- Warn on `max_tokens` for /v1/responses (spec field is
`max_output_tokens`); still accepted as a fallback so existing clients
don't break, but they get a logger.warn nudge.
- README: add a "serve openai" section listing all routes and a
Responses subsection that documents volatility, the
`X-QVAC-Stub` header, the `store: false` opt-out, and curl examples.
The README previously listed no openai-compat endpoints at all.
Skipped from the review:
- #2 (no client-disconnect handling in streaming): pre-existing gap
shared with /v1/chat/completions, reviewer marked out of scope.
- #7 (per-entry byte-size cap on the in-memory store): reviewer marked
follow-up; `maxEntries` + TTL still bound memory pressure for the
local-first single-user target audience.
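The `endSSE` sentinel knob from the first item above might look roughly like this. The `endSSE` and `EndSSEOptions` names and the `{ sentinel: false }` call come from the message; the function body, the default of `true`, and the narrow `SSEWritable` interface (used here instead of Node's `ServerResponse` to keep the sketch self-contained) are assumptions.

```typescript
// Minimal writable surface assumed for this sketch; a Node ServerResponse
// satisfies it structurally.
interface SSEWritable {
  write(chunk: string): unknown;
  end(): unknown;
}

export interface EndSSEOptions {
  // When true, emit the chat-completions style `data: [DONE]` terminator.
  sentinel?: boolean;
}

// Close an SSE stream, optionally writing the `[DONE]` sentinel first.
export function endSSE(res: SSEWritable, options: EndSSEOptions = {}): void {
  // Assumed default: keep the sentinel so existing callers are unchanged.
  const { sentinel = true } = options;
  if (sentinel) {
    res.write("data: [DONE]\n\n");
  }
  res.end();
}
```

Under these assumptions, a chat-completions handler keeps calling `endSSE(res)`, while the Responses handler calls `endSSE(res, { sentinel: false })` after writing `response.completed`.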
* fix: address Simon review nits (stream error sentinel, input_items after cursor)
Two items surfaced post-rebase:
1. sendError gained an opt-in { sseSentinel: false } so callers inside an
active stream can suppress the trailing `data: [DONE]\n\n` after the
`response.error` SSE event. Responses streaming error path now passes
it, closing the gap that the happy path already handled (response.completed
already used endSSE({ sentinel: false })).
2. GET /v1/responses/:id/input_items now reads the `after` cursor from the
query string in addition to `limit`. Spec-compliant pagination would have
re-fetched page 1 forever; the store already implemented the cursor.
Added a store-level pagination test that walks all pages by `last_id`.
1 parent 93984e0 commit 0741eb5
18 files changed
Lines changed: 2260 additions & 30 deletions
File tree
- packages/cli
- docs
- src/serve
- adapters
- openai
- routes
- core
- test
Lines changed: 127 additions & 0 deletions
Lines changed: 142 additions & 0 deletions