
fix(llm): invert reasoning default — unknown models skip think/final tags#1952

Merged
henrypark133 merged 3 commits into staging from fix/nearai-reasoning-default on Apr 4, 2026
Conversation

henrypark133 (Collaborator) commented Apr 3, 2026

Summary

  • Invert reasoning default: New requires_think_final_tags() with empty allowlist — unknown/alias models (including NEAR AI auto) get the safe direct-answer prompt instead of <think>/<final> injection
  • Fix field name mismatch: Add #[serde(alias = "reasoning")] so vLLM/SGLang's reasoning field is accepted (was only reading reasoning_content)
  • Fix direct-answer prompt wording: Removed misleading "native reasoning" reference; prompt now reads "Respond directly with your final answer. Do not wrap your response in any special tags." — applicable to all model types
  • Remove model alias resolution: Dropped the active_model update from response.model, since requires_think_final_tags() returns false for both alias and resolved names, making resolution unnecessary
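The inverted default can be sketched as an allowlist check. This is a minimal stdlib-only sketch: the function name matches the PR, but the matching logic and the allowlist contents are illustrative assumptions.

```rust
// Sketch of the inverted reasoning default described above.
// An empty allowlist means every unknown or alias model name
// (including NEAR AI's "auto") falls through to `false`, i.e.
// the direct-answer prompt, instead of <think>/<final> injection.
// The matching logic here is an illustrative assumption.

/// Models that still require explicit <think>/<final> tag framing.
/// Intentionally empty: the safe default is no tag injection.
const THINK_FINAL_ALLOWLIST: &[&str] = &[];

fn requires_think_final_tags(model: &str) -> bool {
    THINK_FINAL_ALLOWLIST
        .iter()
        .any(|pattern| model.eq_ignore_ascii_case(pattern))
}

fn main() {
    // Unknown/alias models skip tag injection under the new default.
    assert!(!requires_think_final_tags("auto"));
    assert!(!requires_think_final_tags("Qwen/Qwen3.5-122B-A10B"));
    println!("unknown models skip think/final tags");
}
```

Because the allowlist is empty, adding a tag-dependent model later is a one-line change rather than a new code path.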

Change Type

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • CI/Infrastructure
  • Security
  • Dependencies

Linked Issue

None

Validation

  • cargo fmt --all -- --check
  • cargo clippy --all --benches --tests --examples --all-features -- -D warnings
  • cargo build
  • Relevant tests pass: unit tests for requires_think_final_tags(), system prompt formatting, reasoning serde alias
  • cargo test --features integration if database-backed or integration behavior changed
  • Manual testing: confirmed via direct curl testing against NEAR AI staging API; cargo test --lib — 4174 passed, 0 failed

Security Impact

None

Database Impact

None

Blast Radius

Touches src/llm/reasoning.rs, src/llm/reasoning_models.rs, src/llm/nearai_chat.rs. Could affect prompt formatting for all models routed through the NEAR AI provider, but the change is conservative — unknown/alias models now default to the simpler direct-answer prompt rather than injecting <think>/<final> tags.

Rollback Plan

Revert commits on this branch. No schema or config changes; rollback is a straight revert.

Review Follow-Through

  • Model alias resolution (active_model from response.model) was prototyped and removed — requires_think_final_tags() returns false for both "auto" and any resolved name, making the resolution unnecessary.
  • Direct-answer system prompt wording was updated to be model-agnostic per reviewer feedback.

Review track: B (feature/maintainer-requested refactor)

fix(llm): invert reasoning default — unknown models skip <think>/<final> injection

When NEAR AI model="auto" resolves server-side to Qwen 3.5, the system
prompt injected <think>/<final> tags because "auto" didn't match any
known native-thinking pattern. This caused empty responses:

1. Qwen 3.5's native thinking puts reasoning in a `reasoning` field
   (not `reasoning_content`) — silently dropped due to field name mismatch
2. Content contained only <think> tags or <tool_call> XML, which
   clean_response() stripped to empty → "I'm not sure how to respond"

Three fixes:
- Invert the default: new requires_think_final_tags() with empty allowlist
  means unknown/alias models get the safe direct-answer prompt
- Add #[serde(alias = "reasoning")] so vLLM's field name is accepted
- Update active_model from API response.model so capability checks
  use the resolved model name after the first call

Confirmed via direct API testing against NEAR AI staging with
Qwen/Qwen3.5-122B-A10B.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
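The field-name mismatch in fix 2 can be illustrated without serde: deserialization should accept either reasoning_content or its reasoning alias. Below is a stdlib-only sketch of that fallback; the function name and map-based lookup are illustrative stand-ins, since the real fix is a single #[serde(alias = "reasoning")] attribute on the response struct.

```rust
use std::collections::HashMap;

// Stdlib-only sketch of the `reasoning` alias behavior: given the
// fields of a parsed chat-completion message, prefer the canonical
// `reasoning_content` key but fall back to vLLM/SGLang's `reasoning`.
// In the real code this is one `#[serde(alias = "reasoning")]`
// attribute; the lookup below is an illustrative stand-in.
fn extract_reasoning(fields: &HashMap<String, String>) -> Option<String> {
    fields
        .get("reasoning_content")
        .or_else(|| fields.get("reasoning"))
        .cloned()
}

fn main() {
    // vLLM/SGLang-style response: only the `reasoning` key is present.
    let mut vllm = HashMap::new();
    vllm.insert("reasoning".to_string(), "chain of thought".to_string());

    // Before the alias fix this value was silently dropped.
    assert_eq!(extract_reasoning(&vllm).as_deref(), Some("chain of thought"));
    println!("reasoning field accepted");
}
```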
Copilot AI review requested due to automatic review settings April 3, 2026 04:44
@github-actions github-actions bot added labels scope: llm (LLM integration), size: L (200-499 changed lines), risk: low (changes to docs, tests, or low-risk modules), contributor: core (20+ merged PRs) on Apr 3, 2026
Contributor

Copilot AI left a comment
Pull request overview

This PR hardens the LLM “reasoning” pipeline by making the safe direct-answer prompt the default (skipping <think>/<final> injection for unknown/alias models), improving compatibility with native-thinking models and NEAR AI’s "auto" alias behavior.

Changes:

  • Invert the prompt-injection default via requires_think_final_tags() (empty allowlist) so unknown/alias models use the direct-answer format.
  • Accept vLLM/SGLang’s reasoning field via #[serde(alias = "reasoning")] when deserializing NEAR AI chat responses.
  • Update NEAR AI provider active_model from response.model so post-alias capability checks use the resolved model name.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/llm/reasoning.rs Switches system-prompt formatting default to direct-answer and updates related tests.
src/llm/reasoning_models.rs Introduces requires_think_final_tags() allowlist-based decision (currently empty) and adds unit tests.
src/llm/nearai_chat.rs Parses response.model, resolves model aliases into active_model, and adds serde alias/tests for reasoning.


Comment thread src/llm/nearai_chat.rs Outdated
Comment on lines +577 to +578
if let Some(ref resolved) = response.model
&& *resolved != self.config.model
Copy link

Copilot AI Apr 3, 2026


Same issue as in complete(): comparing response.model to self.config.model will keep re-setting active_model on every request when the configured model is an alias (e.g. "auto"). Compare against the current active_model instead so this is only done when the resolved name changes.

Suggested change:

    // before
    if let Some(ref resolved) = response.model
        && *resolved != self.config.model

    // after
    let active_model = self.active_model_name();
    if let Some(ref resolved) = response.model
        && *resolved != active_model
Collaborator Author

Same — removed in 112ab13.

Removed in 112ab13 — same resolution as the complete() instance.

Comment thread src/llm/reasoning.rs Outdated
remove model alias resolution from nearai_chat

auto should stay as the active model name — no reason to overwrite it
with the resolved model since requires_think_final_tags() returns false
for both "auto" and the resolved name.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enhances model alias resolution and reasoning field handling for the NEAR AI provider. It introduces a "safe default" for prompt injection, only applying <think>/<final> tags to explicitly allowlisted models to avoid breaking native-thinking LLMs. Feedback suggests optimizing the model resolution logic to prevent redundant write locks and ensuring that per-request model overrides do not inadvertently update the provider's global state.

I am having trouble creating individual review comments; my feedback is inline below.

src/llm/nearai_chat.rs (491-495)

medium

The current check against self.config.model leads to redundant write lock acquisitions on every request when using an alias like "auto", because the resolved model name will always differ from the original configuration. Comparing against self.active_model_name() avoids this overhead. Furthermore, adding a check for req.model.is_none() prevents per-request model overrides from unexpectedly updating the provider's global active_model state.

        if req.model.is_none()
            && let Some(ref resolved) = response.model
            && resolved != &self.active_model_name()
        {
            let _ = self.set_model(resolved);
        }

src/llm/nearai_chat.rs (577-581)

medium

The current check against self.config.model leads to redundant write lock acquisitions on every request when using an alias like "auto", because the resolved model name will always differ from the original configuration. Comparing against self.active_model_name() avoids this overhead. Furthermore, adding a check for req.model.is_none() prevents per-request model overrides from unexpectedly updating the provider's global active_model state.

        if req.model.is_none()
            && let Some(ref resolved) = response.model
            && resolved != &self.active_model_name()
        {
            let _ = self.set_model(resolved);
        }

fix wording: remove native-thinking assumption from direct-answer prompt

The direct-answer prompt is now the default for all models, not just
native-thinking ones. Remove misleading "handled natively" language.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 3, 2026 04:55
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.



Comment thread src/llm/nearai_chat.rs
Copilot AI changed the title from "fix(llm): invert reasoning default — unknown models skip <think>/<final> injection" to "fix(llm): invert reasoning default — unknown models skip think/final tags" on Apr 3, 2026
Collaborator

Review finding from the current PR diff:

  1. Medium - the prompt-behavior change is global, but the new coverage only proves local string/deserialization behavior.

Reasoning::build_system_prompt_with_tools() now sends every model down the direct-answer path unless requires_think_final_tags() returns true, and that allowlist is currently empty. In practice this is broader than the NEAR AI "auto" / unknown-model fix: it removes the old <think>/<final> prompt path for every provider/model, with no runtime escape hatch if a non-native model still depends on that framing.

The tests added here only verify prompt-string construction and the reasoning serde alias in nearai_chat.rs. There still isn’t an end-to-end regression test showing that a regular non-native model preserves correct final-answer / tool-call behavior under the new shared default.

Suggested fix: either keep a scoped override/allowlist path for any known tag-dependent models, or add an end-to-end respond_with_tools() regression test that exercises a normal non-native model under the new default so this global behavior change is actually covered.
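The suggested regression test could take roughly this shape. This is a stdlib-only sketch: build_system_prompt_with_tools is a simplified stand-in for the real Reasoning method, and only the quoted prompt wording and the empty-allowlist default come from this PR.

```rust
// Sketch of the regression test suggested above: a normal,
// non-native model must get the direct-answer prompt under the
// new default. `build_system_prompt_with_tools` is a simplified
// stand-in for the real method; the tag-injection branch wording
// is an illustrative assumption.

const THINK_FINAL_ALLOWLIST: &[&str] = &[];

fn requires_think_final_tags(model: &str) -> bool {
    THINK_FINAL_ALLOWLIST.contains(&model)
}

fn build_system_prompt_with_tools(model: &str, base: &str) -> String {
    if requires_think_final_tags(model) {
        format!("{base}\n\nWrap reasoning in <think> and the answer in <final>.")
    } else {
        format!(
            "{base}\n\nRespond directly with your final answer. \
             Do not wrap your response in any special tags."
        )
    }
}

fn main() {
    // Regression: a regular non-native model gets the direct-answer
    // framing, with no <think>/<final> injection.
    let prompt = build_system_prompt_with_tools("gpt-4o", "You are a helpful agent.");
    assert!(prompt.contains("Respond directly with your final answer"));
    assert!(!prompt.contains("<think>"));
    println!("direct-answer default covered");
}
```

An end-to-end respond_with_tools() version of this would additionally assert that tool calls still parse under the direct-answer framing, which is the gap the reviewer points out.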

@henrypark133 henrypark133 merged commit 0588dd1 into staging Apr 4, 2026
19 checks passed
@henrypark133 henrypark133 deleted the fix/nearai-reasoning-default branch April 4, 2026 00:51
serrrfirat pushed a commit that referenced this pull request Apr 5, 2026
This was referenced Apr 6, 2026
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
@ironclaw-ci ironclaw-ci bot mentioned this pull request Apr 10, 2026
JZKK720 pushed a commit to JZKK720/ironclaw that referenced this pull request Apr 13, 2026
(cherry picked from commit 0588dd1)
