fix: don't discard model output past a literal reasoning close tag#5395
Open
nolanchic wants to merge 1 commit into
Open
fix: don't discard model output past a literal reasoning close tag#5395nolanchic wants to merge 1 commit into
nolanchic wants to merge 1 commit into
Conversation
remove_reasoning_content strips <tag>...</tag> reasoning blocks, then has a
fallback that drops everything before a remaining </tag> (for the case where
the opening tag was emitted in a prior chunk, e.g. a length-limited
continuation). That fallback fired for *any* remaining </tag>, so when the
model's actual answer contained a literal </tag> (inside code, markup, or
prose) AND the response also had a real reasoning block, the real answer up
to that literal </tag> was silently discarded.
Example (reasoning_tag="think"):
in: <think>r</think>Here is code: print('</think>')
was: ')
now: Here is code: print('</think>')
This runs on every assistant response for reasoning-tag models and on every
commit-message / summarization call, so the truncation was silent and
affected downstream edit parsing.
Only treat a remaining </tag> as an orphan close when the response had no
opening tag at all. When it did, any </tag> left after the balanced-block
substitution is a literal occurrence and is preserved.
Adds regression tests for both the literal case and the orphan case.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
For models with a
reasoning_tag(e.g. DeepSeek-R1 →think),remove_reasoning_contentsilently discarded part of the model's real answer whenever the answer contained a literal closing tag (e.g.</think>inside code, markup, or prose) and the response also contained a real reasoning block.What happens: the function first strips balanced
<think>...</think>blocks, then has a fallback that drops everything before any remaining</think>(intended for the continuation case where the opening tag was emitted in a prior chunk). That fallback fired unconditionally, so a literal</think>in the answer was mistaken for an orphan reasoning close.This function runs on every assistant response for reasoning-tag models (
Coder.remove_reasoning_content) and on every commit-message / chat-summarization call (Model.simple_send_with_retries), so the truncation was silent and the shortened text was what reached edit parsing / commit-message generation.Fix
aider/reasoning_tags.py— only treat a remaining</tag>as an orphan close when the response contained no opening tag at all (the continuation case). When an opening tag was present, any</tag>left after the balanced-block substitution is a literal occurrence in the answer and is preserved.The legitimate orphan-close behaviour (continuation whose
<tag>was in a prior chunk) is unchanged.Test plan
test_remove_reasoning_content_literal_closing_tag— reproduces the truncation onmain(returns')'), passes with the fix.test_remove_reasoning_content_orphan_closing_tag_stripped— locks in that a genuine orphan close (no opening tag) is still stripped.pytest tests/basic/test_reasoning.py tests/basic/test_models.py— 34 passed (incl. the existingtest_remove_reasoning_contentand the integration tests that drivecoder.remove_reasoning_content()with mocked streaming).pre-commit run --files aider/reasoning_tags.py tests/basic/test_reasoning.py— isort / black / flake8 / codespell all pass.