fix(llm): invert reasoning default — unknown models skip think/final tags #1952
henrypark133 merged 3 commits into staging from
Conversation
fix(llm): invert reasoning default — unknown models skip <think>/<final> injection

When NEAR AI model="auto" resolves server-side to Qwen 3.5, the system prompt injected <think>/<final> tags because "auto" didn't match any known native-thinking pattern. This caused empty responses:

1. Qwen 3.5's native thinking puts reasoning in a `reasoning` field (not `reasoning_content`) — silently dropped due to the field-name mismatch
2. Content contained only <think> tags or <tool_call> XML, which clean_response() stripped to empty → "I'm not sure how to respond"

Three fixes:

- Invert the default: the new requires_think_final_tags() with an empty allowlist means unknown/alias models get the safe direct-answer prompt
- Add #[serde(alias = "reasoning")] so vLLM's field name is accepted
- Update active_model from the API's response.model so capability checks use the resolved model name after the first call

Confirmed via direct API testing against NEAR AI staging with Qwen/Qwen3.5-122B-A10B.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
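The second failure mode can be sketched with a minimal, dependency-free stand-in — `strip_tag` and this `clean_response` are hypothetical reconstructions for illustration, not the crate's actual implementation. The point is that when the model's entire reply lives inside <think> or <tool_call> tags, stripping those spans leaves an empty string:

```rust
/// Hypothetical sketch: drop every <tag>…</tag> span for one tag name.
fn strip_tag(s: &str, tag: &str) -> String {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let mut out = String::new();
    let mut rest = s;
    while let Some(i) = rest.find(&open) {
        out.push_str(&rest[..i]);
        match rest[i..].find(&close) {
            // Skip past the closing tag and keep scanning.
            Some(j) => rest = &rest[i + j + close.len()..],
            // Unclosed tag: drop the remainder, as a strict cleaner would.
            None => {
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out
}

/// Stand-in for the real clean_response(): strip both tag kinds, then trim.
fn clean_response(s: &str) -> String {
    strip_tag(&strip_tag(s, "think"), "tool_call")
        .trim()
        .to_string()
}

fn main() {
    // A reply that is tag-only collapses to "", which the caller then
    // replaced with the fallback "I'm not sure how to respond".
    assert_eq!(clean_response("<think>only reasoning here</think>"), "");
    // Normal replies survive cleaning.
    assert_eq!(clean_response("<think>hmm</think>Answer: 42"), "Answer: 42");
    println!("tag-only replies clean to empty");
}
```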
Pull request overview
This PR hardens the LLM “reasoning” pipeline by making the safe direct-answer prompt the default (skipping <think>/<final> injection for unknown/alias models), improving compatibility with native-thinking models and NEAR AI’s "auto" alias behavior.
Changes:
- Invert the prompt-injection default via requires_think_final_tags() (empty allowlist) so unknown/alias models use the direct-answer format.
- Accept vLLM/SGLang's reasoning field via #[serde(alias = "reasoning")] when deserializing NEAR AI chat responses.
- Update the NEAR AI provider's active_model from response.model so post-alias capability checks use the resolved model name.
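The inverted default can be sketched as follows — a hypothetical reconstruction of requires_think_final_tags() under the assumption that it is a simple prefix-allowlist match, not the crate's exact code. The allowlist ships empty, so every model, including the NEAR AI "auto" alias, falls through to the safe direct-answer prompt:

```rust
/// Hypothetical allowlist: only models listed here would get
/// <think>/<final> tag injection. It is intentionally empty, matching
/// the PR's conservative default.
const THINK_FINAL_ALLOWLIST: &[&str] = &[];

/// Unknown and alias model names return false, so they receive the
/// direct-answer system prompt instead of injected tags.
fn requires_think_final_tags(model: &str) -> bool {
    THINK_FINAL_ALLOWLIST
        .iter()
        .any(|p| model.starts_with(*p))
}

fn main() {
    // Both the alias and its server-side resolution skip tag injection,
    // which is why resolving the alias client-side became unnecessary.
    assert!(!requires_think_final_tags("auto"));
    assert!(!requires_think_final_tags("Qwen/Qwen3.5-122B-A10B"));
    println!("direct-answer prompt for all current models");
}
```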
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/llm/reasoning.rs | Switches the system-prompt formatting default to direct-answer and updates related tests. |
| src/llm/reasoning_models.rs | Introduces the requires_think_final_tags() allowlist-based decision (currently empty) and adds unit tests. |
| src/llm/nearai_chat.rs | Parses response.model, resolves model aliases into active_model, and adds a serde alias and tests for reasoning. |
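The field-name mismatch fix itself is one serde attribute — `#[serde(alias = "reasoning")]` on the `reasoning_content` field. The dependency-free sketch below (hypothetical `AssistantMessage`, `canonical_key`, and `apply`, not the crate's real types) shows the same mapping by hand: both the OpenAI-style key and vLLM's `reasoning` land in one canonical field.

```rust
/// Hypothetical stand-in for the deserialized assistant message.
struct AssistantMessage {
    reasoning_content: Option<String>,
    content: Option<String>,
}

/// Map an incoming JSON key to its canonical field name — both
/// "reasoning_content" and vLLM's "reasoning" resolve to the same field,
/// which is exactly what the serde alias declares.
fn canonical_key(key: &str) -> Option<&'static str> {
    match key {
        "reasoning_content" | "reasoning" => Some("reasoning_content"),
        "content" => Some("content"),
        _ => None, // unknown keys are ignored rather than erroring
    }
}

fn apply(msg: &mut AssistantMessage, key: &str, value: &str) {
    match canonical_key(key) {
        Some("reasoning_content") => msg.reasoning_content = Some(value.to_string()),
        Some("content") => msg.content = Some(value.to_string()),
        _ => {}
    }
}

fn main() {
    let mut msg = AssistantMessage { reasoning_content: None, content: None };
    // Before the fix, a payload using "reasoning" was silently dropped.
    apply(&mut msg, "reasoning", "step-by-step thoughts");
    apply(&mut msg, "content", "final answer");
    assert_eq!(msg.reasoning_content.as_deref(), Some("step-by-step thoughts"));
    assert_eq!(msg.content.as_deref(), Some("final answer"));
}
```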
if let Some(ref resolved) = response.model
    && *resolved != self.config.model
Same issue as in complete(): comparing response.model to self.config.model will keep re-setting active_model on every request when the configured model is an alias (e.g. "auto"). Compare against the current active_model instead so this is only done when the resolved name changes.
Suggested change:
let active_model = self.active_model_name();
if let Some(ref resolved) = response.model
    && *resolved != active_model
Removed in 112ab13 — same resolution as the complete() instance.
auto should stay as the active model name — no reason to overwrite it with the resolved model since requires_think_final_tags() returns false for both "auto" and the resolved name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review
This pull request enhances model alias resolution and reasoning-field handling for the NEAR AI provider. It introduces a "safe default" for prompt injection, applying <think>/<final> tags only to explicitly allowlisted models to avoid breaking native-thinking LLMs. Feedback suggests optimizing the model-resolution logic to prevent redundant write locks and ensuring that per-request model overrides do not inadvertently update the provider's global state.
src/llm/nearai_chat.rs (491-495)
The current check against self.config.model leads to redundant write lock acquisitions on every request when using an alias like "auto", because the resolved model name will always differ from the original configuration. Comparing against self.active_model_name() avoids this overhead. Furthermore, adding a check for req.model.is_none() prevents per-request model overrides from unexpectedly updating the provider's global active_model state.
if req.model.is_none()
&& let Some(ref resolved) = response.model
&& resolved != &self.active_model_name()
{
let _ = self.set_model(resolved);
}
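The suggested guard can be illustrated with a minimal, hypothetical provider — `Provider` and `maybe_update` below are stand-ins for the real `set_model` path, sketched under the assumption that `active_model` lives behind an `RwLock`. The write lock is taken only when the request carried no per-request override and the resolved name actually differs, so alias resolution settles after the first request:

```rust
use std::sync::RwLock;

/// Hypothetical provider holding the currently active model name.
struct Provider {
    active_model: RwLock<String>,
}

impl Provider {
    fn active_model_name(&self) -> String {
        self.active_model.read().unwrap().clone()
    }

    /// Returns true iff the write lock was taken (the name changed).
    fn maybe_update(&self, req_model: Option<&str>, resolved: &str) -> bool {
        // Skip per-request overrides, and skip when the resolved name
        // already matches — no redundant write lock on later requests.
        if req_model.is_none() && resolved != self.active_model_name() {
            *self.active_model.write().unwrap() = resolved.to_string();
            true
        } else {
            false
        }
    }
}

fn main() {
    let p = Provider { active_model: RwLock::new("auto".to_string()) };
    // First request resolves the alias and updates state once.
    assert!(p.maybe_update(None, "Qwen/Qwen3.5-122B-A10B"));
    // Subsequent identical responses never take the write lock again.
    assert!(!p.maybe_update(None, "Qwen/Qwen3.5-122B-A10B"));
    // A per-request override never touches the provider's global state.
    assert!(!p.maybe_update(Some("other-model"), "different-model"));
    println!("active = {}", p.active_model_name());
}
```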
src/llm/nearai_chat.rs (577-581)
Same issue as the 491-495 instance above: compare against self.active_model_name() instead of self.config.model, and guard on req.model.is_none() so per-request overrides don't update the provider's global active_model.
The direct-answer prompt is now the default for all models, not just native-thinking ones. Remove misleading "handled natively" language. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Review finding from the current PR diff:
The tests added here only verify prompt-string construction and the requires_think_final_tags() decision — the end-to-end response path is not exercised. Suggested fix: either keep a scoped override/allowlist path for any known tag-dependent models, or add an end-to-end test.
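The kind of end-to-end check the reviewer asks for might look like the sketch below. `Reply` and `final_answer` are hypothetical stand-ins for the crate's real response type and parse path, assumed here for illustration only; the assertion that matters is that a native-thinking-shaped payload yields a non-empty user-visible answer:

```rust
/// Hypothetical deserialized reply: content plus a separate reasoning field.
struct Reply {
    content: String,
    reasoning_content: Option<String>,
}

/// Stand-in for the parse-and-clean path: strip one leading
/// <think>…</think> span from content and trim the remainder.
fn final_answer(r: &Reply) -> String {
    match (r.content.find("<think>"), r.content.find("</think>")) {
        (Some(0), Some(j)) => r.content[j + "</think>".len()..].trim().to_string(),
        _ => r.content.trim().to_string(),
    }
}

fn main() {
    // Shape of a native-thinking response after the serde-alias fix:
    // reasoning rides in its own field, content carries the answer.
    let r = Reply {
        content: "<think>internal</think>The capital is Paris.".to_string(),
        reasoning_content: Some("internal".to_string()),
    };
    assert_eq!(r.reasoning_content.as_deref(), Some("internal"));
    let answer = final_answer(&r);
    // The end-to-end property under test: the visible answer is non-empty.
    assert!(!answer.is_empty());
    assert_eq!(answer, "The capital is Paris.");
}
```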
…tags (#1952)

* fix(llm): invert reasoning default — unknown models skip <think>/<final> injection

  When NEAR AI model="auto" resolves server-side to Qwen 3.5, the system prompt injected <think>/<final> tags because "auto" didn't match any known native-thinking pattern. This caused empty responses:

  1. Qwen 3.5's native thinking puts reasoning in a `reasoning` field (not `reasoning_content`) — silently dropped due to the field-name mismatch
  2. Content contained only <think> tags or <tool_call> XML, which clean_response() stripped to empty → "I'm not sure how to respond"

  Three fixes:
  - Invert the default: the new requires_think_final_tags() with an empty allowlist means unknown/alias models get the safe direct-answer prompt
  - Add #[serde(alias = "reasoning")] so vLLM's field name is accepted
  - Update active_model from the API's response.model so capability checks use the resolved model name after the first call

  Confirmed via direct API testing against NEAR AI staging with Qwen/Qwen3.5-122B-A10B.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* remove model alias resolution from nearai_chat

  auto should stay as the active model name — no reason to overwrite it with the resolved model since requires_think_final_tags() returns false for both "auto" and the resolved name.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix wording: remove native-thinking assumption from direct-answer prompt

  The direct-answer prompt is now the default for all models, not just native-thinking ones. Remove misleading "handled natively" language.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- requires_think_final_tags() with an empty allowlist — unknown/alias models (including NEAR AI auto) get the safe direct-answer prompt instead of <think>/<final> injection
- #[serde(alias = "reasoning")] so vLLM/SGLang's reasoning field is accepted (was only reading reasoning_content)
- No active_model update from response.model — requires_think_final_tags() returns false for both alias and resolved names, so resolution is unnecessary

Change Type
Linked Issue
None
Validation
- cargo fmt --all -- --check
- cargo clippy --all --benches --tests --examples --all-features -- -D warnings
- cargo build
- Unit tests cover requires_think_final_tags(), system prompt formatting, and the reasoning serde alias
- cargo test --features integration if database-backed or integration behavior changed
- cargo test --lib — 4174 passed, 0 failed
None
Database Impact
None
Blast Radius
Touches src/llm/reasoning.rs, src/llm/reasoning_models.rs, and src/llm/nearai_chat.rs. Could affect prompt formatting for all models routed through the NEAR AI provider, but the change is conservative — unknown/alias models now default to the simpler direct-answer prompt rather than injecting <think>/<final> tags.

Rollback Plan
Revert commits on this branch. No schema or config changes; rollback is a straight revert.
Review Follow-Through
The alias-resolution change (updating active_model from response.model) was prototyped and removed — requires_think_final_tags() returns false for both "auto" and any resolved name, making the resolution unnecessary.

Review track: B (feature/maintainer-requested refactor)