Commit be5bd94
review(drip-171): batch 2 — litellm + gemini-cli + qwen-code + goose (5 PRs)

- BerriAI/litellm#26764 merge-after-nits: multipart user_config/tags JSON decode
- BerriAI/litellm#26769 request-changes: FuturMix add bundles unrelated deletions
- google-gemini/gemini-cli#26184 merge-after-nits: deleteSession fail-loud on missing file
- QwenLM/qwen-code#3737 merge-after-nits: drop strip-thoughts helpers, preserve reasoning_content
- aaif-goose/goose#8781 needs-discussion: ACP per-connection inline→spawn ordering question
1 parent dfacd42 · 5 files changed · 549 additions, 0 deletions
---
# BerriAI/litellm#26764 `fix(proxy): decode multipart user_config JSON`

- URL: https://github.com/BerriAI/litellm/pull/26764
- Head SHA: `b63721c45f28`
- Author: Kcstring
- Size: +69 / -2 across 2 files
- Fixes: #26707

## Summary

Multipart proxy requests previously only auto-decoded the `metadata` form field as JSON. `user_config` (a per-request override of the router config) and `tags` arrays were left as raw strings, so a client sending the canonical multipart form for image-edit calls hit `TypeError: string indices must be integers` deep in routing. This PR generalizes the field-by-field JSON decode and adds two focused unit tests.

## Specific references

`litellm/proxy/common_utils/http_parsing_utils.py`:

- New helper `_parse_form_json_field(parsed_body, field_name, *, require_json=False)` at lines 13–22:

```py
if not isinstance(value, str):
    return
if require_json or value.lstrip().startswith(("{", "[")):
    parsed_body[field_name] = json.loads(value)
```

- Replacement at lines 49–54 (form branch of `_read_request_body`):

```py
parsed_body = dict(await request.form())
_parse_form_json_field(parsed_body, "metadata", require_json=True)
_parse_form_json_field(parsed_body, "user_config")
_parse_form_json_field(parsed_body, "tags")
```

`tests/test_litellm/proxy/common_utils/test_multipart_json_form_fields.py` (new):

- `test_form_data_with_json_user_config_and_tags` — verifies a full `user_config` model_list payload plus a JSON-array `tags` string both round-trip into native objects.
- `test_form_data_with_plain_string_tags_is_left_unchanged` — pins the back-compat carve-out for clients that send `tags` as a single bare string (`"production"`).

## Design analysis

The `require_json=True` carve-out for `metadata` preserves the old "always parse, raise on bad JSON" behavior, while the new heuristic for `user_config`/`tags` only parses when the value *looks* like JSON (`{` or `[` prefix after stripping). That's the right shape: a client that sends `tags=production` keeps working, and a client that sends `tags=["a","b"]` gets the array.

Two minor concerns about the heuristic:

1. A legitimate `user_config` value that is a JSON *string* (e.g. literally `"foo"`, with quotes) would not start with `{` or `[`, so it stays a string. Probably fine because `user_config` is documented to be an object, but worth a one-line comment in the helper.
2. With `require_json=False`, the parse is gated only by the prefix check. If the prefix matches but the body is malformed JSON (e.g. `tags=[broken`), `json.loads` will still raise. That's actually the correct fail-loud behavior, but the test suite doesn't pin it. A `test_invalid_json_tags_raises` test would document that contract.
## Risks

- Low. The change is additive and the existing `metadata` behavior is preserved verbatim via `require_json=True`.
- Watch for any callers downstream of `_read_request_body` that previously called `json.loads(parsed_body["user_config"])` themselves; they now receive a dict and would crash. The diff doesn't show any such call sites, but a quick `rg 'parsed_body\["user_config"\]'` and `rg 'body\["user_config"\]'` in proxy code is the right confidence check.

## Verdict

`merge-after-nits`

## Suggested nits

- Add a comment on `_parse_form_json_field` explaining the prefix heuristic and the back-compat goal.
- Add a third test asserting malformed JSON raises (pins the fail-loud behavior).
- Run `rg 'parsed_body\["user_config"\]'` to confirm no downstream caller still does its own decode.
---
# BerriAI/litellm#26769 `feat: add FuturMix as named OpenAI-compatible provider`

- URL: https://github.com/BerriAI/litellm/pull/26769
- Head SHA: `6bd3935d3a60`
- Author: Gzhao-redpoint
- Size: +25 / -19 across 2 files

## Summary

Registers `futurmix` as a named OpenAI-compatible provider in `litellm/llms/openai_like/providers.json`, with a corresponding endpoint-support entry in `provider_endpoints_support.json`. The provider declares chat/completions, messages, responses, embeddings, and image generations as supported, with the standard audio/moderations/batches/rerank endpoints disabled.

## Specific references

`litellm/llms/openai_like/providers.json`, lines ~80–90 (after the `gmi` entry):

```json
"futurmix": {
  "base_url": "https://futurmix.ai/v1",
  "api_key_env": "FUTURMIX_API_KEY",
  "api_base_env": "FUTURMIX_API_BASE",
  "param_mappings": {
    "max_completion_tokens": "max_tokens"
  }
}
```

`provider_endpoints_support.json`:

- New `futurmix` entry around line 1063 with the standard endpoint matrix.
- **Unrelated deletions** in the same file:
  - Removed the `"a2a": false` and `"interactions": false` keys from an existing provider entry near lines 471–477 (visible at diff lines 31–34).
  - Removed the entire `charity_engine` provider entry at lines 2553–2570 (visible at diff lines 70–84).

## Concerns

The two deletions noted above are unrelated to "add FuturMix" and should not be in this PR. They look like merge-conflict resolution or a stale local checkout. Without a justification in the PR body, they are silent regressions:

- Removing `charity_engine` entirely deletes a previously supported provider's discoverability metadata. If `charity_engine` was deprecated, that should be its own PR with a changelog entry.
- Removing the `a2a` and `interactions` flags from another provider changes the schema shape for that provider only, causing schema heterogeneity across the file.

The FuturMix-only changes are otherwise straightforward and match the structure of neighboring entries (e.g. `gmi`, `sarvam`).

## Risks

- The `param_mappings: { "max_completion_tokens": "max_tokens" }` mapping is correct for an OpenAI-compatible upstream that uses the legacy `max_tokens` field. Sanity check: confirm FuturMix's public docs actually require `max_tokens`, not `max_completion_tokens`. If they accept both, the mapping is fine but redundant; if they only accept `max_completion_tokens`, this mapping would break the call.
- No model list / pricing entry is added in `model_prices_and_context_window.json`, so cost tracking will fall back to defaults. That's fine for a provider-only PR but worth flagging in the description.
- No tests, but provider-registration PRs in this repo conventionally don't add unit tests.
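How litellm consumes `param_mappings` is not visible in this diff; as a sketch of the intended effect only, a rename pass over outgoing request params would look like this (`apply_param_mappings` and the model name are hypothetical stand-ins, not litellm API):

```python
def apply_param_mappings(params: dict, mappings: dict) -> dict:
    # Rename outgoing request params per the provider's param_mappings
    # entry, leaving everything else untouched.
    out = dict(params)
    for src, dst in mappings.items():
        if src in out:
            out[dst] = out.pop(src)
    return out

mapped = apply_param_mappings(
    {"model": "futurmix/some-model", "max_completion_tokens": 256},  # illustrative params
    {"max_completion_tokens": "max_tokens"},
)
assert mapped == {"model": "futurmix/some-model", "max_tokens": 256}
```

This is also why the "which field does FuturMix accept?" question matters: if the upstream only understands `max_completion_tokens`, a rename like this would strip the one field it needs.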
76+
77+
## Verdict
78+
79+
`request-changes`
80+
81+
Reason: the unrelated deletions in `provider_endpoints_support.json`
82+
(charity_engine entry + `a2a`/`interactions` keys) need to be
83+
reverted from this PR. Submit them separately if intentional.
84+
85+
## Suggested nits
86+
87+
- Confirm in the PR body that FuturMix's API expects `max_tokens`
88+
(not `max_completion_tokens`).
89+
- Optional: add a docs page under `docs/my-website/docs/providers/futurmix.md`
90+
matching the URL declared in `provider_endpoints_support.json`.
---
# QwenLM/qwen-code#3737 `fix(core): preserve reasoning_content in rewind, compression, and merge paths (#3579)`

- URL: https://github.com/QwenLM/qwen-code/pull/3737
- Head SHA: `480f9e71b4c4`
- Author: fyc09
- Size: +48 / -363 across 8 files
- Closes the #3579 series (after #3590, #3682)

## Summary

Three remaining paths still silently dropped DeepSeek-class `reasoning_content` from history:

1. **Rewind / double-ESC** — `AppContainer.tsx` called `geminiClient.stripThoughtsFromHistory()` after truncating to the rewind point.
2. **Chat compression** — `chatCompressionService.ts` built the acknowledgment model turn with no thought part, so converters that key off the presence of *any* thought in a model turn broke the reasoning chain across the compression boundary.
3. **Idle/merge** — the `stripThoughtsFromHistory` and `stripThoughtsFromHistoryKeepRecent` helpers themselves were load-bearing for a use case the project no longer wants to support; this PR retires them outright.

## Specific references

`packages/cli/src/ui/AppContainer.tsx:1739-1745` (post-patch):

```tsx
- // 3. Truncate API history and strip stale thinking blocks
+ // 3. Truncate API history to the target point.
+ // Do NOT strip thought parts — reasoning models (e.g. DeepSeek)
+ // require reasoning_content continuity across all turns ...
  geminiClient.truncateHistory(apiTruncateIndex);
- geminiClient.stripThoughtsFromHistory();
```

`packages/core/src/core/client.ts:204-210` removes the `stripThoughtsFromHistory()` pass-through on `GeminiClient`.

`packages/core/src/core/geminiChat.ts:792-919` (the deletion block) removes both `stripThoughtsFromHistory()` and `stripThoughtsFromHistoryKeepRecent(keepTurns)` from `GeminiChat`, and `geminiChat.test.ts:2039-2227` removes their dedicated tests. This is a real public-API surface reduction.

In `packages/core/src/services/chatCompressionService.ts:265-292`, the new logic detects whether any preserved model turn carries `thought: true` and, if so, appends a sentinel thought part to the compression acknowledgment model turn:

```ts
const hasThoughtContentInKeep = historyToKeep.some(
  (content) =>
    content.role === 'model' &&
    content.parts?.some((part) => 'thought' in part && part.thought),
);

const ackParts: Part[] = [
  { text: 'Got it. Thanks for the additional context!' },
];
if (hasThoughtContentInKeep) {
  ackParts.push({ text: ' ', thought: true });
}
```

## Design analysis

The PR implements a clean policy reversal: instead of stripping thoughts on three control paths (rewind, compression-ack, idle), preserve them and remove the strippers entirely. That matches the framing in #3579 — DeepSeek-class providers require an unbroken `reasoning_content` chain across the *entire* assistant sequence, so any helper that selectively drops thoughts is a loaded gun.

Two design points worth noting:

1. **Sentinel thought part with `text: ' '`.** The acknowledgment only carries a thought part *when* prior turns had one. The chosen sentinel is a single-space text part with `thought: true`. That works for downstream converters that key off "is there a thought part on this turn" — but it's brittle if any consumer ever asserts non-empty thought content. A short comment right above the `ackParts.push` line explaining "sentinel, intentionally minimal" would help future maintainers.
2. **API surface removal vs. deprecation.** Deleting public methods on `GeminiClient` and `GeminiChat` is fine for an internal/CLI-only surface, but if anything in `packages/cli`-adjacent extension code calls them, this is a hard break. A `rg "stripThoughtsFromHistory"` across the whole repo (already implied by the test cleanup), plus a note in the PR body, would close the loop.
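For the compression-ack change, a quick way to see both branches (the same pair a unit test should pin) is this Python transliteration of the TS snippet above, using plain dicts as stand-ins for `Content`/`Part`; it is a sketch, not project code:

```python
def build_ack_parts(history_to_keep: list[dict]) -> list[dict]:
    # Transliteration of the chatCompressionService logic: append a
    # sentinel thought part only when some preserved model turn
    # already carries one.
    has_thought = any(
        content["role"] == "model"
        and any(part.get("thought") for part in content.get("parts", []))
        for content in history_to_keep
    )
    parts = [{"text": "Got it. Thanks for the additional context!"}]
    if has_thought:
        parts.append({"text": " ", "thought": True})  # minimal sentinel
    return parts

# Branch 1: a preserved model turn has a thought part -> sentinel appended.
kept = [{"role": "model", "parts": [{"text": "plan", "thought": True}]}]
assert build_ack_parts(kept)[-1] == {"text": " ", "thought": True}

# Branch 2: no thought parts anywhere -> plain acknowledgment only.
kept = [{"role": "user", "parts": [{"text": "hi"}]}]
assert len(build_ack_parts(kept)) == 1
```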
The test cleanup is mechanical: every `stripThoughtsFromHistory: vi.fn()` mock entry is removed across `client.test.ts`, `config.test.ts`, and `geminiChat.test.ts`. No assertion logic for the new behavior was added in `geminiChat.test.ts` itself; the new behavior is exercised through `chatCompressionService` (presumed to have its own tests, not visible in the diff).

## Risks

- Compression test coverage: the diff doesn't show a new test in `chatCompressionService.test.ts` pinning the `hasThoughtContentInKeep ? thought-part-present : not` branch. This is the only new code path; it should have a direct test.
- The `text: ' '` sentinel is invisible in user-facing logs but may surface in trace dumps. Worth confirming no downstream formatter trims whitespace-only text and drops the thought part with it.
- Prior fixes #3590 and #3682 closed the resume/idle and model-switch paths. After this PR, any *new* control path that wants to mutate history needs to learn this lesson without the guardrail of helpers — the helpers are gone. A short `CHANGELOG`/comment block in `geminiChat.ts` explaining *why* these helpers no longer exist would prevent re-introduction.
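The whitespace-trimming risk is concrete: any formatter that filters out whitespace-only text parts takes the sentinel with it. A minimal illustration (`naive_trim_filter` is invented for this sketch, not taken from the codebase):

```python
def naive_trim_filter(parts: list[dict]) -> list[dict]:
    # Hypothetical downstream formatter that drops whitespace-only text
    # parts. Applied to the ack turn, it silently deletes the sentinel
    # and re-breaks the reasoning chain.
    return [p for p in parts if p.get("text", "").strip()]

ack_parts = [
    {"text": "Got it. Thanks for the additional context!"},
    {"text": " ", "thought": True},  # the PR's sentinel
]
assert naive_trim_filter(ack_parts) == [ack_parts[0]]  # sentinel dropped
```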
## Verdict

`merge-after-nits`

## Suggested nits

- Add at least one test in `chatCompressionService.test.ts` pinning both branches of the `hasThoughtContentInKeep` check.
- Add a short tombstone comment in `geminiChat.ts` (where the methods used to live) saying "intentionally removed in #3737 — reasoning_content must remain contiguous; do not reintroduce".
- Confirm in the PR body that `rg stripThoughtsFromHistory` returns zero hits across the whole repo after this PR.
