QVAC-19908 fix: recover Qwen hybrid tool-call frames #2677
Merged
simon-iribarren merged 4 commits on Jun 18, 2026
Document why repairFunctionEqualsJson exists: Qwen3.5/3.6 can fuse its
XML and JSON tool templates into a single `{"function=NAME","arguments":{...}}`
frame, and the repair is intentionally scoped to that exact shape so
well-formed JSON frames are never rewritten.
arun-mani-j
approved these changes
Jun 18, 2026
opaninakuffo
approved these changes
Jun 18, 2026
simon-iribarren
added a commit
to simon-iribarren/qvac
that referenced
this pull request
Jun 18, 2026
* fix: recover Qwen hybrid tool-call frames (QVAC-19908)
* QVAC-19908 doc: explain Qwen hybrid tool-call frame origin
(cherry picked from commit 4b3ac99)
simon-iribarren
added a commit
to simon-iribarren/qvac
that referenced
this pull request
Jun 18, 2026
Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog.
simon-iribarren
added a commit
to simon-iribarren/qvac
that referenced
this pull request
Jun 18, 2026
Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog. (cherry picked from commit 92da258)
simon-iribarren
added a commit
to simon-iribarren/qvac
that referenced
this pull request
Jun 18, 2026
Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog. (cherry picked from commit 8fbb222)
simon-iribarren
added a commit
that referenced
this pull request
Jun 18, 2026
Fix-only patch release: recover malformed Qwen hybrid tool-call frames (#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog.
simon-iribarren
added a commit
that referenced
this pull request
Jun 18, 2026
🎯 What problem does this PR solve?
Qwen3.5/3.6 can emit a malformed hybrid tool-call frame that fuses its two
tool templates into one. The model normally emits either:
`<tool_call><function=NAME><parameter=KEY>VALUE</parameter></function></tool_call>`

or

`<tool_call>{"name":NAME,"arguments":{...}}</tool_call>`

When it blends them, it drops the XML `<function=NAME>` token verbatim into a
JSON envelope, producing an invalid frame whose object key is the bare string
`"function=NAME"`:

`{"function=NAME","arguments":{...}}`

This is not valid JSON (a string key with no value), so `JSON.parse` throws.
The frame then surfaces as a `PARSE_ERROR` and no structured `toolCall` event
is emitted: callers see the raw `<tool_call>` markup as assistant text and
the tool is never invoked.
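The failure can be reproduced in isolation. The snippet below is a minimal sketch, not the SDK's code: the tool name `webfetch` and the argument values are placeholders, and `tryParse` is a hypothetical helper that only shows where `JSON.parse` gives up.

```typescript
// Frame payloads modelled on the shapes described above. The tool name and
// arguments are illustrative placeholders, not captured model output.
const canonicalFrame = '{"name":"webfetch","arguments":{"url":"https://example.com"}}';
const hybridFrame = '{"function=webfetch","arguments":{"url":"https://example.com"}}';

// Hypothetical helper: returns the parsed value, or null when JSON.parse
// rejects the payload with a SyntaxError.
function tryParse(payload: string): unknown {
  try {
    return JSON.parse(payload);
  } catch (err) {
    if (err instanceof SyntaxError) return null;
    throw err;
  }
}

// The canonical frame parses to an object.
const canonicalResult = tryParse(canonicalFrame);

// The hybrid frame is rejected: after the key "function=webfetch" the parser
// expects ":" but finds ",".
const hybridResult = tryParse(hybridFrame);
```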
📝 How does it solve it?
* Adds a narrow repair step in the Hermes parser
  (`packages/sdk/server/utils/tools/parsers/hermes.ts`) for the observed
  `"function=NAME"` object-key shape, rewriting it to a canonical
  `{"name":NAME,"arguments":{...}}` frame before validation.
* The repair sits in the Hermes parser because that is where hybrid frames
  already route: for `qwen35`, the parser chain is
  `[parseQwen35Format, parseHermesFormat]`. `parseQwen35Format` only claims a
  frame when it contains the `<function=` angle-bracket token; the hybrid has
  `function=` without the bracket and starts with `{`, so it is treated as a
  JSON-looking frame and deferred to the Hermes parser. The Qwen XML parser
  behaviour is unchanged.
* The repair also runs on the incomplete-frame (cutoff/abort) recovery path,
  and works per-frame so multiple tool calls in one message are each recovered.
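The rewrite and the routing rule described in the list above can be sketched as follows. This is an illustration under assumptions, not the contents of `hermes.ts`: the regular expression, `repairHybridFrame`, and `claimedBy` are hypothetical names, and the sketch operates on the payload between the `<tool_call>` tags.

```typescript
// Hypothetical anchored pattern: the payload must START with
// {"function=NAME", and be followed by an "arguments" key. Canonical
// {"name":...} frames can never match it.
const HYBRID_PREFIX = /^\s*\{\s*"function=([^"\\]+)"\s*,(?=\s*"arguments"\s*:)/;

// Sketch of the repair: swap the hybrid prefix for {"name":"NAME", and leave
// the rest of the payload (including the arguments) byte-for-byte untouched.
function repairHybridFrame(payload: string): string {
  const match = HYBRID_PREFIX.exec(payload);
  if (match === null) return payload; // not the hybrid shape: no rewrite
  const serializedName = JSON.stringify(match[1] ?? ""); // valid JSON string
  return `{"name":${serializedName},` + payload.slice(match[0].length);
}

// Sketch of the routing rule: the XML parser only claims frames that contain
// the "<function=" token; everything else falls through to the JSON parser.
function claimedBy(payload: string): "qwen35-xml" | "hermes-json" {
  return payload.includes("<function=") ? "qwen35-xml" : "hermes-json";
}

const hybridPayload = '{"function=webfetch","arguments":{"url":"https://example.com"}}';
const canonicalPayload = '{"name":"webfetch","arguments":{"url":"https://example.com"}}';

const route = claimedBy(hybridPayload); // "hermes-json": no angle bracket
const repaired = repairHybridFrame(hybridPayload); // equals canonicalPayload
const untouched = repairHybridFrame(canonicalPayload); // returned unchanged
```

Because the rewrite only replaces the prefix, a frame whose `arguments` payload is itself malformed still fails in `JSON.parse` afterwards, which matches the narrow-scope intent described in this PR.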
Why the repair is intentionally narrow
The rewrite is anchored to the exact observed shape and re-serializes the name
with `JSON.stringify`, so well-formed JSON frames are never touched and a
malformed `arguments` payload still surfaces as a clean `PARSE_ERROR` rather
than crashing the parser. The following neighbouring shapes are deliberately
out of scope (no evidence the model emits them, and broadening would risk
accepting genuinely malformed output):
* `{"function":"NAME","arguments":{...}}`: valid JSON, wrong key
* `{"function":{"name":NAME,"arguments":{...}}}`: nested OpenAI envelope
* `{"function=NAME","parameters":{...}}`: `parameters` instead of `arguments`

If those start appearing in the wild, they can be added as additional anchored
cases.
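To make the scoping concrete, the sketch below checks one hypothetical anchored pattern against the in-scope frame, a canonical frame, and the three out-of-scope shapes listed above. The pattern and the `isHybridShape` helper are assumptions for illustration; the actual expression used by the repair may differ.

```typescript
// Hypothetical anchored pattern: "function=NAME" as the first key, followed
// directly by an "arguments" key.
const HYBRID_SHAPE = /^\s*\{\s*"function=[^"\\]+"\s*,\s*"arguments"\s*:/;

function isHybridShape(payload: string): boolean {
  return HYBRID_SHAPE.test(payload);
}

// Label/payload pairs; the tool name and arguments are placeholders.
const samples: Array<[string, string]> = [
  ["hybrid", '{"function=webfetch","arguments":{}}'],
  ["canonical", '{"name":"webfetch","arguments":{}}'],
  ["wrongKey", '{"function":"webfetch","arguments":{}}'],
  ["nestedOpenAi", '{"function":{"name":"webfetch","arguments":{}}}'],
  ["parametersKey", '{"function=webfetch","parameters":{}}'],
];

// Only the "hybrid" sample matches; the other four are left for the normal
// parse-and-validate path to accept or reject.
const matchedLabels: string[] = samples
  .filter(([, payload]) => isHybridShape(payload))
  .map(([label]) => label);
```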
🧪 How was it tested?
* A regression test covers `parseToolCalls(..., "qwen35")` recovering a
  `webfetch` call from the `"function=webfetch"` hybrid frame.
* The test fails before the fix (`errors.length === 1`,
  `toolCalls.length === 0`) and passes after it.
* `bun run test/unit/tool-parser.test.ts` in `packages/sdk`: 85/85 pass.
* Scope behaves as intended: the hybrid frame, multi-frame messages, and the
  incomplete/cutoff frame all recover; the out-of-scope shapes listed above
  remain unrecovered; and canonical `{"name":...}` frames are unaffected.

📚 Background: why these frames appear (model specificity + runtime)
This is not a one-off corruption — it is a well-documented, model-specific
behaviour of the Qwen3.5/3.6 family, and the recovery here is the same class of
mitigation other local-LLM ecosystems have adopted.
* The native tool-call format is XML
  (`<tool_call><function=NAME><parameter=KEY>VALUE</parameter></function></tool_call>`),
  not JSON. The Qwen team confirms this is by design and that framework
  parsers are expected to convert the XML into structured JSON
  (QwenLM/Qwen3.6#125).
* [...] most stable, 9B the least, 35B-A3B in between; the model also
  intermittently emits the tool call inside the reasoning/thinking channel,
  leaving `finish_reason="stop"` with empty `tool_calls`. The maintainers
  state a model-side fix was not available for 3.5 and recommend handling
  thought-state / parsing at the application layer (it is reportedly fixed in
  Qwen3.6).
* The `{"function=NAME","arguments":{...}}` frame this PR recovers is the
  cross-contamination of those two native formats (the XML `<function=NAME>`
  token bleeding into the JSON envelope), so it sits squarely in the
  documented failure space rather than being an isolated glitch.
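As a rough illustration of the XML-to-JSON conversion that framework parsers are expected to perform, the sketch below turns one XML frame into a structured call. It is not `parseQwen35Format`: the `XmlToolCall` type and `parseXmlToolCall` function are hypothetical, parameter values are kept as plain strings, and nested or truncated frames are not handled.

```typescript
// Hypothetical structured form of a tool call; values stay as raw strings.
interface XmlToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Converts <function=NAME><parameter=KEY>VALUE</parameter></function> into
// { name, arguments }. Returns null when no <function=...> block is present.
function parseXmlToolCall(frame: string): XmlToolCall | null {
  const fn = /<function=([^>\s]+)>([\s\S]*?)<\/function>/.exec(frame);
  if (fn === null) return null;

  const args: Record<string, string> = {};
  const paramPattern = /<parameter=([^>\s]+)>([\s\S]*?)<\/parameter>/g;
  let param: RegExpExecArray | null;
  while ((param = paramPattern.exec(fn[2] ?? "")) !== null) {
    const key = param[1] ?? "";
    args[key] = (param[2] ?? "").trim();
  }
  return { name: fn[1] ?? "", arguments: args };
}

const xmlFrame =
  "<tool_call><function=webfetch><parameter=url>https://example.com</parameter></function></tool_call>";
const structured = parseXmlToolCall(xmlFrame);
// structured is { name: "webfetch", arguments: { url: "https://example.com" } }

// A JSON-looking frame has no <function=...> block, so this sketch declines it.
const declined = parseXmlToolCall('<tool_call>{"name":"webfetch","arguments":{}}</tool_call>');
```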
Other ecosystems hit and patched the exact same thing:
* [...] loops and ships a `recover_garbled_tool_json()` recovery plus retries
  and a lower sampling temperature (reliability notes).
* [...] (`qwen3.5-enhanced.jinja`) plus the `qwen3_xml` parser, which defers
  parsing until the full parameter block arrives and auto-heals missing
  closing tags, over `qwen3_coder`, which corrupts on streamed nested JSON /
  truncation (allanchan339/vLLM-Qwen3-3.5-3.6-chat-template-fix,
  NVIDIA DGX forum, r/LocalLLaMA thread).
🧭 Path forward for addon teams (lower-level handling)
This SDK-level repair is a deliberate, model-agnostic safety net. The more
robust long-term fixes belong below the SDK, in the inference addons / runtime,
and are worth tracking for the llamacpp addon team:
* Grammar-constrained decoding, so that the model can only emit well-formed
  tool frames in the first place. This is exactly the direction of the
  in-progress llama.cpp "autoparser" work, which derives both a parser and a
  lazy grammar from the model's chat template and already classifies the
  relevant shapes (`JSON_NATIVE`, `TAG_WITH_JSON`, `TAG_WITH_TAGGED`, plus a
  `fun_name_is_key` flag) (ggml-org/llama.cpp pwilkin:autoparser).
* A chat template that keeps the model on the XML tool format and keeps
  reasoning separate, so tool calls don't leak into the thinking channel (the
  `qwen3.5-enhanced.jinja` lesson above).
* Parsing deferred until the full frame arrives (the `qwen3_xml` approach)
  rather than streaming JSON parsing that corrupts on partial frames.
* A lower sampling temperature, which improves format consistency
  (claude-code-local uses 0.2).
Until the runtime/grammar path lands, keeping this recovery as defense-in-depth
prevents silent tool-call drops for the affected Qwen builds.
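The deferred-parsing idea mentioned above can be sketched as a small buffer that releases a frame only once its closing tag has arrived. This is a hypothetical illustration, not the `qwen3_xml` parser or any addon code: `ToolFrameBuffer` is an invented name, it does not auto-heal missing closing tags, and it discards any text that precedes a completed frame.

```typescript
// Hypothetical buffer: accumulates streamed text and only hands back the body
// of a <tool_call> frame once the matching </tool_call> has been received.
class ToolFrameBuffer {
  private buffer = "";

  // Appends a streamed chunk and returns every frame body completed so far.
  push(chunk: string): string[] {
    const OPEN = "<tool_call>";
    const CLOSE = "</tool_call>";
    this.buffer += chunk;

    const frames: string[] = [];
    for (;;) {
      const start = this.buffer.indexOf(OPEN);
      if (start === -1) break;
      const end = this.buffer.indexOf(CLOSE, start + OPEN.length);
      if (end === -1) break; // frame still incomplete: keep buffering
      frames.push(this.buffer.slice(start + OPEN.length, end));
      this.buffer = this.buffer.slice(end + CLOSE.length);
    }
    return frames;
  }
}

const frameBuffer = new ToolFrameBuffer();
// The first chunk ends mid-frame, so nothing is released yet.
const afterFirstChunk = frameBuffer.push('<tool_call>{"name":"web');
// The second chunk completes the frame, so its body is released whole.
const afterSecondChunk = frameBuffer.push('fetch","arguments":{}}</tool_call>');
```

Parsing only the released frame bodies means a JSON parser never sees a partial frame, which is the property the list above attributes to the deferred approach.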