Commit 186898d
feat(completions): implement legacy /v1/completions endpoint (#683)
* feat(completions): implement legacy /v1/completions endpoint
Replace the NOT_IMPLEMENTED stub with a working text-completions handler.
The prompt is translated into a single user chat message, sent through the
existing completion service, and the chat result is reshaped back into the
OpenAI `object: "text_completion"` format. Supports both streaming (SSE
text deltas + [DONE]) and non-streaming, carries the Inference-Id header
and usage passthrough (including cached_tokens), and registers the route
and OpenAPI path. Adds unit tests for the response/chunk transforms.
Note: because the prompt is sent as a chat message the backend applies its
chat template, and the synthesized response is not byte-verifiable against
attestation (unlike /v1/chat/completions).
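A minimal sketch of the translate-to-chat round trip described above, using only the standard library. The struct and function names (`ChatMessage`, `ChatResult`, `TextCompletion`, `prompt_to_messages`, `chat_to_text_completion`, `sse_event`) are hypothetical simplifications for illustration and are not the types from this commit; only the `"user"` role, the `object: "text_completion"` value, and the `[DONE]` sentinel come from the message.

```rust
// Hypothetical, simplified types; the real structs in the commit are not shown.
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
}

#[derive(Debug, Clone, PartialEq)]
struct ChatResult {
    id: String,
    model: String,
    content: String,
    finish_reason: Option<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct TextCompletion {
    id: String,
    object: &'static str,
    model: String,
    text: String,
    finish_reason: Option<String>,
}

/// Wrap a legacy prompt in a single user chat message.
fn prompt_to_messages(prompt: &str) -> Vec<ChatMessage> {
    vec![ChatMessage {
        role: "user".to_string(),
        content: prompt.to_string(),
    }]
}

/// Reshape a chat result into the legacy text-completion shape.
fn chat_to_text_completion(chat: ChatResult) -> TextCompletion {
    TextCompletion {
        id: chat.id,
        object: "text_completion",
        model: chat.model,
        text: chat.content,
        finish_reason: chat.finish_reason,
    }
}

/// Frame one server-sent event; a stream ends with `sse_event("[DONE]")`.
fn sse_event(payload: &str) -> String {
    format!("data: {}\n\n", payload)
}
```

Because the response is rebuilt field by field, the bytes returned to the client differ from what the provider sent, which is why the note above says the result cannot be byte-verified against attestation.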
* fix(completions): address review on /v1/completions
- OpenAPI: response body is CompletionResponse (not ChatCompletionResponse);
derive ToSchema on CompletionResponse/CompletionChoice and register them.
- CompletionRequest.extra was missing #[serde(flatten)], so a normal request
({"model","prompt"}) failed to deserialize and unknown fields were dropped.
- Reject E2E-encryption headers and auto-redact opt-in with 400 instead of
silently bypassing them (the response reshape is incompatible with passing
the provider's encrypted/un-redacted bytes through).
- Warn when the Inference-Id can't be extracted from the first stream chunk,
matching chat_completions.
- Add deserialize tests for the flatten fix.
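The "reject with 400 instead of silently bypassing" behavior from the list above might look roughly like the following guard. This is a sketch under stated assumptions: the header names in `E2E_HEADERS` are placeholders (the commit message does not name the real headers), the auto-redact opt-in is modeled as a plain boolean, and `ApiError` and `check_passthrough_features` are hypothetical names.

```rust
/// Hypothetical error type: an HTTP status and a human-readable message.
#[derive(Debug, PartialEq)]
struct ApiError {
    status: u16,
    message: String,
}

// Placeholder header names, assumed for illustration only.
const E2E_HEADERS: [&str; 2] = ["x-example-e2e-key", "x-example-e2e-algo"];

/// Fail fast when the caller asks for a feature that the response reshape
/// cannot honor, rather than dropping the request for it silently.
fn check_passthrough_features(
    headers: &[(&str, &str)],
    auto_redact_opt_in: bool,
) -> Result<(), ApiError> {
    for (name, _value) in headers {
        // HTTP header names are case-insensitive.
        if E2E_HEADERS.iter().any(|h| name.eq_ignore_ascii_case(h)) {
            return Err(ApiError {
                status: 400,
                message: format!(
                    "header `{}` is not supported on /v1/completions",
                    name
                ),
            });
        }
    }
    if auto_redact_opt_in {
        return Err(ApiError {
            status: 400,
            message: "auto-redact is not supported on /v1/completions".to_string(),
        });
    }
    Ok(())
}
```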
* fix(completions): handle advertised legacy params on /v1/completions
Per review: CompletionRequest advertises echo/logprobs/best_of/
presence_penalty/frequency_penalty but the endpoint silently dropped them.
- Forward presence_penalty/frequency_penalty to the provider via `extra`
(standard sampling params the chat backend accepts; no typed slot on the
service request).
- Reject echo / logprobs / best_of>1 with 400 unsupported_parameter — they
have no equivalent under the translate-to-chat path.
- Add tests for the rejection helper and penalty forwarding.
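The split between forwarded and rejected parameters described above could be sketched as follows. `LegacyParams`, `unsupported_param`, and `forward_penalties` are hypothetical names, and the `extra` map is modeled here as a `BTreeMap<String, f64>` for simplicity. The sketch also assumes that only `echo: true` and any explicitly set `logprobs` count as unsupported; the commit message does not spell out those edge cases.

```rust
use std::collections::BTreeMap;

/// Hypothetical subset of the legacy request fields.
#[derive(Debug, Default)]
struct LegacyParams {
    echo: Option<bool>,
    logprobs: Option<u32>,
    best_of: Option<u32>,
    presence_penalty: Option<f64>,
    frequency_penalty: Option<f64>,
}

/// Return the name of the first parameter that has no equivalent under the
/// translate-to-chat path, so the handler can answer 400 unsupported_parameter.
fn unsupported_param(p: &LegacyParams) -> Option<&'static str> {
    if p.echo == Some(true) {
        return Some("echo");
    }
    if p.logprobs.is_some() {
        return Some("logprobs");
    }
    if matches!(p.best_of, Some(n) if n > 1) {
        return Some("best_of");
    }
    None
}

/// Copy the penalty parameters into the provider-bound `extra` map.
fn forward_penalties(p: &LegacyParams, extra: &mut BTreeMap<String, f64>) {
    if let Some(v) = p.presence_penalty {
        extra.insert("presence_penalty".to_string(), v);
    }
    if let Some(v) = p.frequency_penalty {
        extra.insert("frequency_penalty".to_string(), v);
    }
}
```

Note that `best_of: 1` passes through untouched, since it asks for nothing beyond a single completion.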
* fix(completions): accept OpenAI one-or-many prompt/stop shapes
Per review: {"stop":"\n"} and {"prompt":["a","b"]} failed JSON extraction
(framework 422) before the handler could return a clean error.
- `stop` now accepts string or array (StopSequences), normalized to Vec.
- `prompt` now accepts string / string[] / int[] / int[][] (CompletionPrompt)
so all OpenAI shapes deserialize; single-string is served, batch and
token-id prompts are rejected with 400 unsupported_parameter (no mapping
under translate-to-chat).
- Remove the now-unused (and now type-incompatible) From<CompletionRequest>
for CompletionParams; the native text-completion path it fed is never wired.
- Register the new schemas in OpenAPI; add deserialization tests for the
one-or-many shapes.
1 parent 9575972 commit 186898d
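For the one-or-many prompt/stop shapes in the last commit-message entry, a rough standard-library sketch of the normalization step is shown below. The type names `CompletionPrompt` and `StopSequences` come from the message, but the variant names, `into_vec`, and `single_prompt` are assumptions, and deserialization itself (for example via an untagged serde enum) is left out. The sketch also assumes that every array form is rejected, including a one-element batch, which the message does not confirm.

```rust
/// The four prompt shapes the OpenAI API accepts (variant names are assumed).
#[derive(Debug, Clone, PartialEq)]
enum CompletionPrompt {
    Text(String),
    TextBatch(Vec<String>),
    Tokens(Vec<u32>),
    TokenBatch(Vec<Vec<u32>>),
}

/// `stop` may be a single string or an array of strings.
#[derive(Debug, Clone, PartialEq)]
enum StopSequences {
    One(String),
    Many(Vec<String>),
}

impl StopSequences {
    /// Normalize either shape to a Vec.
    fn into_vec(self) -> Vec<String> {
        match self {
            StopSequences::One(s) => vec![s],
            StopSequences::Many(v) => v,
        }
    }
}

/// Serve only the single-string prompt; every other shape maps to an
/// `unsupported_parameter` error code for a 400 response.
fn single_prompt(prompt: CompletionPrompt) -> Result<String, &'static str> {
    match prompt {
        CompletionPrompt::Text(s) => Ok(s),
        CompletionPrompt::TextBatch(_)
        | CompletionPrompt::Tokens(_)
        | CompletionPrompt::TokenBatch(_) => Err("unsupported_parameter"),
    }
}
```

Accepting all four shapes at the type level and rejecting inside the handler is what lets the endpoint return its own 400 instead of the framework's 422.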
5 files changed
Lines changed: 642 additions & 75 deletions