fix(llm): invert reasoning default — unknown models skip think/final tags #1952
henrypark133 merged 3 commits into staging from
Conversation
fix(llm): invert reasoning default — unknown models skip <think>/<final> injection

When NEAR AI model="auto" resolves server-side to Qwen 3.5, the system prompt injected <think>/<final> tags because "auto" didn't match any known native-thinking pattern. This caused empty responses:

1. Qwen 3.5's native thinking puts reasoning in a `reasoning` field (not `reasoning_content`) — silently dropped due to the field-name mismatch
2. Content contained only <think> tags or <tool_call> XML, which clean_response() stripped to empty → "I'm not sure how to respond"

Three fixes:

- Invert the default: the new requires_think_final_tags() with an empty allowlist means unknown/alias models get the safe direct-answer prompt
- Add #[serde(alias = "reasoning")] so vLLM's field name is accepted
- Update active_model from the API's response.model so capability checks use the resolved model name after the first call

Confirmed via direct API testing against NEAR AI staging with Qwen/Qwen3.5-122B-A10B.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
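The second failure mode can be sketched with a minimal, dependency-free stand-in — `strip_tag` and this `clean_response` are hypothetical reconstructions for illustration, not the crate's actual implementation. The point is that when the model's entire reply lives inside <think> or <tool_call> tags, stripping those spans leaves an empty string:

```rust
/// Hypothetical sketch: drop every <tag>…</tag> span for one tag name.
fn strip_tag(s: &str, tag: &str) -> String {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let mut out = String::new();
    let mut rest = s;
    while let Some(i) = rest.find(&open) {
        out.push_str(&rest[..i]);
        match rest[i..].find(&close) {
            // Skip past the closing tag and keep scanning.
            Some(j) => rest = &rest[i + j + close.len()..],
            // Unclosed tag: drop the remainder, as a strict cleaner would.
            None => {
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Stand-in for the real clean_response(): strip both tag kinds, then trim.
fn clean_response(s: &str) -> String {
    strip_tag(&strip_tag(s, "think"), "tool_call")
        .trim()
        .to_string()
}

fn main() {
    // A reply that is tag-only collapses to "", which the caller then
    // replaced with the fallback "I'm not sure how to respond".
    assert_eq!(clean_response("<think>only reasoning here</think>"), "");
    // Normal replies survive cleaning.
    assert_eq!(clean_response("<think>hmm</think>Answer: 42"), "Answer: 42");
    println!("tag-only replies clean to empty");
}
```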
Pull request overview
This PR hardens the LLM “reasoning” pipeline by making the safe direct-answer prompt the default (skipping <think>/<final> injection for unknown/alias models), improving compatibility with native-thinking models and NEAR AI’s "auto" alias behavior.
Changes:
- Invert the prompt-injection default via requires_think_final_tags() (empty allowlist) so unknown/alias models use the direct-answer format.
- Accept vLLM/SGLang's reasoning field via #[serde(alias = "reasoning")] when deserializing NEAR AI chat responses.
- Update the NEAR AI provider's active_model from response.model so post-alias capability checks use the resolved model name.
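The inverted default can be sketched as follows — a hypothetical reconstruction of requires_think_final_tags() under the assumption that it is a simple prefix-allowlist match, not the crate's exact code. The allowlist ships empty, so every model, including the NEAR AI "auto" alias, falls through to the safe direct-answer prompt:

```rust
/// Hypothetical allowlist: only models listed here would get
/// <think>/<final> tag injection. It is intentionally empty, matching
/// the PR's conservative default.
const THINK_FINAL_ALLOWLIST: &[&str] = &[];

/// Unknown and alias model names return false, so they receive the
/// direct-answer system prompt instead of injected tags.
fn requires_think_final_tags(model: &str) -> bool {
    THINK_FINAL_ALLOWLIST
        .iter()
        .any(|p| model.starts_with(*p))
}

fn main() {
    // Both the alias and its server-side resolution skip tag injection,
    // which is why resolving the alias client-side became unnecessary.
    assert!(!requires_think_final_tags("auto"));
    assert!(!requires_think_final_tags("Qwen/Qwen3.5-122B-A10B"));
    println!("direct-answer prompt for all current models");
}
```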
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/llm/reasoning.rs | Switches the system-prompt formatting default to direct-answer and updates related tests. |
| src/llm/reasoning_models.rs | Introduces the requires_think_final_tags() allowlist-based decision (currently empty) and adds unit tests. |
| src/llm/nearai_chat.rs | Parses response.model, resolves model aliases into active_model, and adds a serde alias and tests for reasoning. |
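The field-name mismatch fix itself is one serde attribute — `#[serde(alias = "reasoning")]` on the `reasoning_content` field. The dependency-free sketch below (hypothetical `AssistantMessage`, `canonical_key`, and `apply`, not the crate's real types) shows the same mapping by hand: both the OpenAI-style key and vLLM's `reasoning` land in one canonical field.

```rust
/// Hypothetical stand-in for the deserialized assistant message.
struct AssistantMessage {
    reasoning_content: Option<String>,
    content: Option<String>,
}

/// Map an incoming JSON key to its canonical field name — both
/// "reasoning_content" and vLLM's "reasoning" resolve to the same field,
/// which is exactly what the serde alias declares.
fn canonical_key(key: &str) -> Option<&'static str> {
    match key {
        "reasoning_content" | "reasoning" => Some("reasoning_content"),
        "content" => Some("content"),
        _ => None, // unknown keys are ignored rather than erroring
    }
}

fn apply(msg: &mut AssistantMessage, key: &str, value: &str) {
    match canonical_key(key) {
        Some("reasoning_content") => msg.reasoning_content = Some(value.to_string()),
        Some("content") => msg.content = Some(value.to_string()),
        _ => {}
    }
}

fn main() {
    let mut msg = AssistantMessage { reasoning_content: None, content: None };
    // Before the fix, a payload using "reasoning" was silently dropped.
    apply(&mut msg, "reasoning", "step-by-step thoughts");
    apply(&mut msg, "content", "final answer");
    assert_eq!(msg.reasoning_content.as_deref(), Some("step-by-step thoughts"));
    assert_eq!(msg.content.as_deref(), Some("final answer"));
}
```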
if let Some(ref resolved) = response.model
    && *resolved != self.config.model
Same issue as in complete(): comparing response.model to self.config.model will keep re-setting active_model on every request when the configured model is an alias (e.g. "auto"). Compare against the current active_model instead so this is only done when the resolved name changes.
Suggested change:
let active_model = self.active_model_name();
if let Some(ref resolved) = response.model
    && *resolved != active_model
Removed in 112ab13 — same resolution as the complete() instance.
auto should stay as the active model name — no reason to overwrite it with the resolved model since requires_think_final_tags() returns false for both "auto" and the resolved name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review
This pull request enhances model alias resolution and reasoning-field handling for the NEAR AI provider. It introduces a "safe default" for prompt injection, applying <think>/<final> tags only to explicitly allowlisted models to avoid breaking native-thinking LLMs. Feedback suggests optimizing the model-resolution logic to prevent redundant write locks and ensuring that per-request model overrides do not inadvertently update the provider's global state.
src/llm/nearai_chat.rs (491-495)
The current check against self.config.model leads to redundant write lock acquisitions on every request when using an alias like "auto", because the resolved model name will always differ from the original configuration. Comparing against self.active_model_name() avoids this overhead. Furthermore, adding a check for req.model.is_none() prevents per-request model overrides from unexpectedly updating the provider's global active_model state.
if req.model.is_none()
&& let Some(ref resolved) = response.model
&& resolved != &self.active_model_name()
{
let _ = self.set_model(resolved);
}
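The suggested guard can be illustrated with a minimal, hypothetical provider — `Provider` and `maybe_update` below are stand-ins for the real `set_model` path, sketched under the assumption that `active_model` lives behind an `RwLock`. The write lock is taken only when the request carried no per-request override and the resolved name actually differs, so alias resolution settles after the first request:

```rust
use std::sync::RwLock;

/// Hypothetical provider holding the currently active model name.
struct Provider {
    active_model: RwLock<String>,
}

impl Provider {
    fn active_model_name(&self) -> String {
        self.active_model.read().unwrap().clone()
    }

    /// Returns true iff the write lock was taken (the name changed).
    fn maybe_update(&self, req_model: Option<&str>, resolved: &str) -> bool {
        // Skip per-request overrides, and skip when the resolved name
        // already matches — no redundant write lock on later requests.
        if req_model.is_none() && resolved != self.active_model_name() {
            *self.active_model.write().unwrap() = resolved.to_string();
            true
        } else {
            false
        }
    }
}

fn main() {
    let p = Provider { active_model: RwLock::new("auto".to_string()) };
    // First request resolves the alias and updates state once.
    assert!(p.maybe_update(None, "Qwen/Qwen3.5-122B-A10B"));
    // Subsequent identical responses never take the write lock again.
    assert!(!p.maybe_update(None, "Qwen/Qwen3.5-122B-A10B"));
    // A per-request override never touches the provider's global state.
    assert!(!p.maybe_update(Some("other-model"), "different-model"));
    println!("active = {}", p.active_model_name());
}
```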
src/llm/nearai_chat.rs (577-581)
Same issue as the 491-495 instance above: compare against self.active_model_name() instead of self.config.model, and guard on req.model.is_none() so per-request overrides don't update the provider's global active_model.
The direct-answer prompt is now the default for all models, not just native-thinking ones. Remove misleading "handled natively" language. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Review finding from the current PR diff:
The tests added here only verify prompt-string construction and the requires_think_final_tags() decision — the end-to-end response path is not exercised. Suggested fix: either keep a scoped override/allowlist path for any known tag-dependent models, or add an end-to-end test.
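The kind of end-to-end check the reviewer asks for might look like the sketch below. `Reply` and `final_answer` are hypothetical stand-ins for the crate's real response type and parse path, assumed here for illustration only; the assertion that matters is that a native-thinking-shaped payload yields a non-empty user-visible answer:

```rust
/// Hypothetical deserialized reply: content plus a separate reasoning field.
struct Reply {
    content: String,
    reasoning_content: Option<String>,
}

/// Stand-in for the parse-and-clean path: strip one leading
/// <think>…</think> span from content and trim the remainder.
fn final_answer(r: &Reply) -> String {
    match (r.content.find("<think>"), r.content.find("</think>")) {
        (Some(0), Some(j)) => r.content[j + "</think>".len()..].trim().to_string(),
        _ => r.content.trim().to_string(),
    }
}

fn main() {
    // Shape of a native-thinking response after the serde-alias fix:
    // reasoning rides in its own field, content carries the answer.
    let r = Reply {
        content: "<think>internal</think>The capital is Paris.".to_string(),
        reasoning_content: Some("internal".to_string()),
    };
    assert_eq!(r.reasoning_content.as_deref(), Some("internal"));
    let answer = final_answer(&r);
    // The end-to-end property under test: the visible answer is non-empty.
    assert!(!answer.is_empty());
    assert_eq!(answer, "The capital is Paris.");
}
```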
…tags (#1952)

* fix(llm): invert reasoning default — unknown models skip <think>/<final> injection

  When NEAR AI model="auto" resolves server-side to Qwen 3.5, the system prompt injected <think>/<final> tags because "auto" didn't match any known native-thinking pattern. This caused empty responses:

  1. Qwen 3.5's native thinking puts reasoning in a `reasoning` field (not `reasoning_content`) — silently dropped due to the field-name mismatch
  2. Content contained only <think> tags or <tool_call> XML, which clean_response() stripped to empty → "I'm not sure how to respond"

  Three fixes:
  - Invert the default: the new requires_think_final_tags() with an empty allowlist means unknown/alias models get the safe direct-answer prompt
  - Add #[serde(alias = "reasoning")] so vLLM's field name is accepted
  - Update active_model from the API's response.model so capability checks use the resolved model name after the first call

  Confirmed via direct API testing against NEAR AI staging with Qwen/Qwen3.5-122B-A10B.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* remove model alias resolution from nearai_chat

  auto should stay as the active model name — no reason to overwrite it with the resolved model since requires_think_final_tags() returns false for both "auto" and the resolved name.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix wording: remove native-thinking assumption from direct-answer prompt

  The direct-answer prompt is now the default for all models, not just native-thinking ones. Remove misleading "handled natively" language.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- requires_think_final_tags() with an empty allowlist — unknown/alias models (including NEAR AI auto) get the safe direct-answer prompt instead of <think>/<final> injection
- #[serde(alias = "reasoning")] so vLLM/SGLang's reasoning field is accepted (was only reading reasoning_content)
- No active_model update from response.model — requires_think_final_tags() returns false for both alias and resolved names, so resolution is unnecessary

Change Type
Linked Issue
None
Validation
- cargo fmt --all -- --check
- cargo clippy --all --benches --tests --examples --all-features -- -D warnings
- cargo build
- Unit tests cover requires_think_final_tags(), system prompt formatting, and the reasoning serde alias
- cargo test --features integration if database-backed or integration behavior changed
- cargo test --lib — 4174 passed, 0 failed
None
Database Impact
None
Blast Radius
Touches src/llm/reasoning.rs, src/llm/reasoning_models.rs, and src/llm/nearai_chat.rs. Could affect prompt formatting for all models routed through the NEAR AI provider, but the change is conservative — unknown/alias models now default to the simpler direct-answer prompt rather than injecting <think>/<final> tags.

Rollback Plan
Revert commits on this branch. No schema or config changes; rollback is a straight revert.
Review Follow-Through
The alias-resolution change (updating active_model from response.model) was prototyped and removed — requires_think_final_tags() returns false for both "auto" and any resolved name, making the resolution unnecessary.

Review track: B (feature/maintainer-requested refactor)