fix(openai-shim): restore reasoning support for proxy and localhost providers #1155
JoaoPedroCampanari wants to merge 10 commits
Conversation
jatmn
left a comment
Findings
- [P2] Keep non-empty reasoning_content when coalescing assistant messages
src/services/api/openaiShim.ts:762
The new coalescing logic overwrites an existing non-empty `reasoning_content` whenever the following assistant message has the empty-string fallback. For example, a Moonshot/Kimi history with an assistant tool call that includes a thinking block, followed by another assistant text message before the tool result, is converted into one assistant message whose `reasoning_content` is `""` instead of the original thinking text. That still sends a tool-call assistant message without the reasoning continuity these providers require, so the PR's target proxy/reasoning workflow can still hit the 400 this is meant to fix. Please keep the previous non-empty value when the incoming value is only the fallback, or otherwise merge in a way that does not discard the real reasoning text.
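To make the failure mode concrete, here is a minimal sketch of the history shape described above in OpenAI-compatible wire format; the values and the last-write-wins merge are invented for illustration, not the shim's actual internals:

```ts
// Hypothetical Moonshot/Kimi-style history (illustrative values only).
const assistantToolCall = {
  role: "assistant" as const,
  content: "",
  tool_calls: [
    { id: "call_1", type: "function" as const, function: { name: "lookup", arguments: "{}" } },
  ],
  reasoning_content: "I need to call the lookup tool first.", // real thinking text
};

const assistantText = {
  role: "assistant" as const,
  content: "Running the lookup now.",
  reasoning_content: "", // empty-string fallback injected by the shim
};

// A naive last-write-wins merge keeps the tool call but clobbers the thinking text:
const coalesced = { ...assistantToolCall, ...assistantText };
console.log(coalesced.reasoning_content); // "" (the 400-triggering result)
```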
jatmn
left a comment
Thanks for following up on the earlier review. The requested reasoning_content coalescing fix looks addressed now.
No remaining issues here, LGTM.
techbrewboss
left a comment
Review summary
Thanks for the follow-up here. The dead requireReasoningContentOnAssistantMessages path is now wired into the shim, and the prior coalescing issue looks fixed. I found one remaining scope/correctness gap around the advertised local-proxy provider coverage.
Findings
`src/integrations/runtimeMetadata.ts:218` - Local/proxy inference does not cover advertised MiMo or Z.AI/GLM models.
Impact: The PR says localhost/proxy setups like GRouter work for models such as `mimo-v2.5-pro` and also lists Z.AI, but `inferRemoteModelOpenAIShimConfig()` still only recognizes deepseek, kimi, and moonshot. In a local proxy context, `mimo-v2.5-pro`, `GLM-5.1`, and `zai/glm-5.1` resolve to only `{ maxTokensField: "max_tokens" }`, so they still do not get `preserveReasoningContent`, `requireReasoningContentOnAssistantMessages`, or `reasoningContentFallback`.
Suggested fix: Add model-name inference for the advertised providers, with focused tests for local proxy models, or narrow the PR description/provider claims to DeepSeek and Kimi/Moonshot only.
Validation
- `bun test src/services/api/openaiShim.test.ts` passed: 93 tests.
- `bun test src/integrations/routeMetadata.test.ts src/integrations/compatibility.test.ts src/services/api/providerConfig.local.test.ts` passed: 40 tests.
- `bun run build` passed.
- Manual local-proxy config check showed `deepseek-reasoner` and `kimi-k2.6` infer reasoning config, but `mimo-v2.5-pro`, `GLM-5.1`, and `zai/glm-5.1` do not.
Force-pushed from 80a05e1 to 95f7fee.
@techbrewboss I've addressed your feedback in commit 95f7fee — Changes:
techbrewboss
left a comment
Review summary
Thanks for following up on the local/proxy reasoning-content path. The dead requireReasoningContentOnAssistantMessages flag is now wired into the shim and the MiMo/GLM local-proxy inference gap is addressed, but I found one direct-provider regression that should block merge.
Findings
`src/integrations/runtimeMetadata.ts:166` - MiMo model-name inference overrides the direct Xiaomi MiMo route's token field.
Impact: direct `https://api.xiaomimimo.com/v1` requests for `mimo-v2.5-pro` now send `max_tokens` instead of the descriptor-required `max_completion_tokens`. This breaks the existing `xiaomi mimo route uses api-key auth header and max_completion_tokens` test and contradicts the PR's "No behavior changes" claim for direct providers.
Suggested fix: keep the new MiMo/GLM inference scoped to local/proxy or generic OpenAI-compatible routing, or avoid returning `maxTokensField: 'max_tokens'` from MiMo model-name inference so the Xiaomi descriptor can continue to win.
Validation
- Fetched PR Gitlawb/openclaude#1155 into an isolated worktree at commit 95f7fee.
- Ran `bun test src/services/api/openaiShim.test.ts src/integrations/routeMetadata.test.ts src/integrations/compatibility.test.ts src/services/api/providerConfig.local.test.ts` with the repo's existing `node_modules` symlinked into the temporary worktree.
- Result: 136 pass, 1 fail.
- Failing test: `xiaomi mimo route uses api-key auth header and max_completion_tokens`.
Verdict: Changes Requested. The direct-provider regression must be fixed before merge.
…serve reasoning through local proxies
What changed:
`requireReasoningContentOnAssistantMessages` was defined in descriptors.ts
and set by five vendor configs (deepseek, moonshot, mimo, zai, kimi-code)
but never consumed in the OpenAI shim's message conversion pipeline. This
meant any provider that required reasoning_content on assistant messages
would get a 400 error from the API ("reasoning_content must be passed back
to the API") whenever the flag was the only source of truth.
Additionally, the `treatAsLocal` code path in runtimeMetadata.ts bypassed
model-name-based config inference entirely, so local proxies (e.g. key
routers like grouter) forwarding to remote reasoning models never received
reasoning-related config even when the model name clearly indicated a
reasoning provider.
Why it changed:
- The flag was a dead config — set but never read — causing 400 errors for
all five vendors that rely on it
- Local proxies that forward to reasoning providers need the same config as
direct connections; the model name (e.g. "mimo-v2.5-pro") is sufficient
signal regardless of whether the base URL is localhost or remote
- The empty-string reasoning_content fallback was gated on toolUses.length > 0,
so plain-text assistant messages without tool calls were missing the field
- The coalescing pass silently dropped reasoning_content when merging
consecutive assistant messages
- Synthetic assistant messages injected during tool→user gaps and non-array content assistant messages never received reasoning_content (see the sketch after this list)
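As a rough sketch of how the shim-side fixes above might fit together (the names and message shape here are assumptions for illustration, not the actual openaiShim.ts code):

```ts
interface ShimReasoningConfig {
  requireReasoningContentOnAssistantMessages?: boolean;
  reasoningContentFallback?: string;
}

interface OutgoingAssistantMessage {
  role: "assistant";
  content: string;
  reasoning_content?: string;
}

function ensureReasoningContent(
  msg: OutgoingAssistantMessage,
  config: ShimReasoningConfig,
): OutgoingAssistantMessage {
  // Providers that never set the flag are untouched (defaults stay false/undefined).
  if (!config.requireReasoningContentOnAssistantMessages) return msg;
  // Leave real thinking text alone.
  if (msg.reasoning_content) return msg;
  // Apply the fallback to every assistant message: plain-text ones, synthetic
  // ones injected for tool->user gaps, and non-array content, not just
  // messages that carry tool calls.
  return { ...msg, reasoning_content: config.reasoningContentFallback ?? "" };
}
```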
User/developer impact:
- Fixes 400 errors for DeepSeek, Moonshot/Kimi, MiMo, ZAI, and any future
vendor that sets requireReasoningContentOnAssistantMessages: true
- Enables reasoning models to work through local proxies/key routers without
manual config
- No behavior change for providers that do not set the flag (all defaults
remain false/undefined)
Checks run:
- bun run build — clean
- bun run smoke — pass
- bun test src/services/api/openaiShim — 104 pass, 0 fail
When coalescing consecutive assistant messages, an empty
reasoning_content fallback ("") would overwrite a non-empty
thinking text from a previous message. Now we only overwrite
when the incoming value is non-empty or the existing value
is already empty/undefined.
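A minimal sketch of that guard, assuming a standalone helper (the real coalescing code is inline in the shim and may differ):

```ts
// Decide which reasoning_content survives when two assistant messages merge.
function mergeReasoningContent(
  existing: string | undefined,
  incoming: string | undefined,
): string | undefined {
  if (incoming) return incoming; // non-empty incoming text wins
  if (existing) return existing; // keep prior non-empty thinking text
  return incoming ?? existing;   // both empty/undefined: nothing to preserve
}
```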
inferRemoteModelOpenAIShimConfig() only recognized deepseek, kimi, and moonshot. When using a local proxy (e.g. GRouter) with mimo-v2.5-pro or GLM-5.1, the function returned undefined, so these models got no reasoning config even with the treatAsLocal merge fix. Now mimo and glm patterns are recognized and return the same reasoning config their vendor files define.
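A hedged sketch of what that pattern recognition could look like; the regex and return shape are illustrative assumptions rather than the real inferRemoteModelOpenAIShimConfig() source:

```ts
// Map a model name to reasoning-related shim config (illustrative only).
function inferReasoningConfigSketch(model: string) {
  const m = model.toLowerCase();
  // "mimo-v2.5-pro", "GLM-5.1", and "zai/glm-5.1" all match by substring.
  if (/(deepseek|kimi|moonshot|mimo|glm)/.test(m)) {
    return {
      preserveReasoningContent: true,
      requireReasoningContentOnAssistantMessages: true,
      reasoningContentFallback: "",
    };
  }
  return undefined; // unknown model names infer no reasoning config
}
```

Note that later commits in this thread also have MiMo inference supply maxTokensField and removeBodyFields, with descriptor precedence handled at merge time.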
inferRemoteModelOpenAIShimConfig() for mimo and glm returned maxTokensField and removeBodyFields, which overrode the Xiaomi MiMo vendor descriptor's max_completion_tokens when the direct route was used. Model-name inference should only return reasoning-related fields (preserveReasoningContent, requireReasoningContentOnAssistantMessages, reasoningContentFallback, thinkingRequestFormat) so the descriptor remains the source of truth for transport fields like maxTokensField.
mergeOpenAIShimConfig now checks for a descriptor-level maxTokensField before applying the model-inferred value. This prevents direct providers (e.g. Xiaomi with max_completion_tokens) from being overridden by inference results that default to max_tokens. Gateway routes without a descriptor still receive the inferred value.
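A one-line precedence sketch under assumed shapes (not the actual mergeOpenAIShimConfig signature):

```ts
// Descriptor transport fields beat model-name inference; gateway routes
// without a descriptor fall back to the inferred value, then the default.
function resolveMaxTokensField(
  descriptorField: string | undefined,
  inferredField: string | undefined,
): string {
  return descriptorField ?? inferredField ?? "max_tokens";
}

resolveMaxTokensField("max_completion_tokens", "max_tokens"); // direct Xiaomi: "max_completion_tokens"
resolveMaxTokensField(undefined, "max_completion_tokens");    // gateway route: "max_completion_tokens"
```

Handling the precedence in the merge step, rather than stripping the field from inference, is what lets gateway routes still receive the inferred value.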
The treatAsLocal code path was dropping removeBodyFields from the model-inference config. For DeepSeek/Kimi models, inference returns removeBodyFields: ['store'] which was silently lost for local proxies. Currently mitigated by the isLocal store-stripping check, but this ensures future removeBodyFields entries are also preserved.
Force-pushed from b818715 to 6dc6e84.
… into fix/reasoning-content-dead-flag
MiMo's API expects `max_completion_tokens` (matching its vendor descriptor), but the model-name inference was not setting maxTokensField. When going through a localhost proxy (treatAsLocal path), the default `max_tokens` was applied instead, causing the API to ignore the token limit and terminate responses prematurely. Also adds missing removeBodyFields: ['store'] for both MiMo and GLM to match the pattern used by DeepSeek and Kimi inference configs.
The treatAsLocal code path hardcoded maxTokensField to 'max_tokens', which broke providers like MiMo that require 'max_completion_tokens' even when accessed through a local proxy (e.g. GRouter). Now checks if remoteModelInferredConfig has a maxTokensField before falling back to 'max_tokens', matching the pattern already used for reasoning fields.
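Sketched under assumed shapes (not the actual runtimeMetadata.ts code), the treatAsLocal construction described above might look like:

```ts
interface InferredShimConfig {
  maxTokensField?: string;
  removeBodyFields?: string[];
}

function buildLocalProxyConfig(inferred: InferredShimConfig | undefined) {
  return {
    ...inferred,
    // No longer hardcoded: MiMo's inferred 'max_completion_tokens'
    // survives a local proxy such as GRouter.
    maxTokensField: inferred?.maxTokensField ?? "max_tokens",
    // e.g. ['store'] for DeepSeek/Kimi is preserved instead of silently lost.
    removeBodyFields: inferred?.removeBodyFields ?? [],
  };
}
```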
techbrewboss
left a comment
Review summary
The reasoning-content/runtime fix looks useful and the focused provider tests pass on the current PR head. I found one remaining scope issue: the branch includes release-please version/changelog output unrelated to this fix.
Findings
`package.json:3`, `.release-please-manifest.json:2`, `CHANGELOG.md:3` - Unrelated release-please version/changelog changes are included.
Impact: This reasoning-content PR now also bumps the package to 0.12.1 and adds changelog entries for unrelated PRs #1191 and #1204. That mixes release automation output into a provider/runtime fix and can confuse the release process or make this PR appear to publish unrelated work.
Suggested fix: Drop the release-please commit/files from this branch and keep the PR limited to the shim/runtime changes.
Validation
- Fetched PR head 1348e0a into an isolated worktree.
- `bun test src/services/api/openaiShim.test.ts src/integrations/routeMetadata.test.ts src/integrations/compatibility.test.ts src/services/api/providerConfig.local.test.ts` passed: 141 tests.
- `bun run build` passed.
jatmn
left a comment
Thanks for following up on the earlier reviews. The requested reasoning_content coalescing fix still looks addressed, and the direct Xiaomi MiMo token-field regression is covered now. I found one remaining issue below.
Findings
- [P2] Drop unrelated release-please output from this branch
`package.json:3`, `.release-please-manifest.json:2`, `CHANGELOG.md:3`
This reasoning-content fix now also bumps the package version to 0.12.1 and adds changelog entries for unrelated PRs #1191 and #1204. That mixes release automation output into a provider/runtime fix, which can confuse the release process and make this PR appear to publish unrelated work. Please remove the release-please commit/files from this branch and keep the PR scoped to the shim/runtime changes.
Confirmed @jatmn's and @techbrewboss's point against the branch — they've both landed on the same single remaining issue. The reasoning-content/runtime fix itself looks addressed (both reviewers agree), but the branch also carries release-please output:
Fixes reasoning model compatibility issues for OpenAI-compatible localhost/proxy setups.
Previously, providers that require reasoning_content (DeepSeek, MiMo, Moonshot/Kimi, ZAI, etc.) could fail with:
400 OpenAI API error 400: data:{"error":{"code":"400","message":"Param Incorrect","param":"The reasoning_content in the thinking mode must be passed back to the API.","type":""}}
during multi-turn agent workflows.
This especially affected local proxy/key-router setups such as GRouter.
Root Cause
Two independent issues caused the failures:
The flag existed in provider descriptors and was configured by five vendor configs: deepseek, moonshot, mimo, zai, and kimi-code.
However, the OpenAI shim message conversion pipeline never actually used the flag.
As a result, assistant messages could lose reasoning_content during agent execution.
The treatAsLocal code path skipped model-name-based config inference entirely.
This prevented localhost/proxy endpoints from inheriting reasoning-specific behavior even when the routed model was clearly identifiable, such as mimo-v2.5-pro or GLM-5.1.
Changes
src/services/api/openaiShim.ts
Implemented 5 fixes in convertMessages():
- consume requireReasoningContentOnAssistantMessages instead of leaving it a dead flag
- apply the empty-string reasoning_content fallback to plain-text assistant messages, not only those with tool calls
- preserve non-empty reasoning_content when coalescing consecutive assistant messages
- add reasoning_content to synthetic assistant messages injected during tool→user gaps
- add reasoning_content to assistant messages with non-array content
src/integrations/runtimeMetadata.ts
Provider Impact
Fixes reasoning-agent failures for DeepSeek, Moonshot/Kimi, MiMo, and ZAI, plus any future vendor that sets requireReasoningContentOnAssistantMessages: true.
Also enables reasoning models to work correctly through local proxies and key routers such as GRouter.
No behavior changes for providers that do not use the flag.
Tested
Manual validation:
Verified:
Follow-up
Some reasoning providers (e.g. Qwen/QwQ) are not yet included in inferRemoteModelOpenAIShimConfig(). Once entries are added, the treatAsLocal fix ensures they will automatically work through localhost/proxy setups as well.