QVAC-19908 fix: recover Qwen hybrid tool-call frames by simon-iribarren · Pull Request #2677 · tetherto/qvac

simon-iribarren · 2026-06-17T19:52:44Z

🎯 What problem does this PR solve?

Qwen3.5/3.6 can emit a malformed hybrid tool-call frame that fuses its two
tool templates into one. The model normally emits either:

- Pythonic-XML: <tool_call><function=NAME><parameter=KEY>VALUE</parameter></function></tool_call>
- Hermes JSON: <tool_call>{"name":NAME,"arguments":{...}}</tool_call>

When it blends them, it drops the XML <function=NAME> token verbatim into a
JSON envelope, producing an invalid frame whose object key is the bare string
"function=NAME":

<tool_call>
{"function=webfetch","arguments":{"url":"https://docs.opencode.ai","format":"markdown"}}
</tool_call>

This is not valid JSON: the string "function=webfetch" sits in key position
but is followed by a comma instead of a colon and a value, so JSON.parse
throws. The frame then surfaces as a PARSE_ERROR and no structured toolCall
event is emitted. Callers see the raw <tool_call> markup as assistant text,
and the tool is never invoked.
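
The failure is easy to reproduce in isolation. The sketch below feeds the frame body from the example above, and its canonical equivalent, to JSON.parse. `tryParseFrame` and `ParseResult` are hypothetical names used only for illustration; they are not SDK functions.

```typescript
// Hypothetical helper for illustration: wraps JSON.parse so a failure is
// reported as a value instead of an exception.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; reason: string };

function tryParseFrame(body: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(body) };
  } catch (err) {
    return { ok: false, reason: err instanceof Error ? err.message : String(err) };
  }
}

// Frame body exactly as shown in the PR description.
const hybridBody =
  '{"function=webfetch","arguments":{"url":"https://docs.opencode.ai","format":"markdown"}}';
// The canonical Hermes JSON shape for the same call.
const canonicalBody =
  '{"name":"webfetch","arguments":{"url":"https://docs.opencode.ai","format":"markdown"}}';

const hybridResult = tryParseFrame(hybridBody); // fails: the key has no ":" and value
const canonicalResult = tryParseFrame(canonicalBody); // parses to an object
```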

📝 How does it solve it?

- Adds a narrow recovery path in the Hermes JSON-frame parser
  (packages/sdk/server/utils/tools/parsers/hermes.ts) for the observed
  "function=NAME" object-key shape, rewriting it to a canonical
  {"name":NAME,"arguments":{...}} frame before validation.
- The recovery lives at the JSON-frame layer because that is where these frames
  already route: for qwen35, the parser chain is
  [parseQwen35Format, parseHermesFormat]. parseQwen35Format only claims a
  frame when it contains the <function= angle-bracket token; the hybrid
  has function= without the bracket and starts with {, so it is treated as a
  JSON-looking frame and deferred to the Hermes parser. The Qwen XML parser
  behaviour is unchanged.
- The repair is applied to both the complete-frame path and the
  incomplete-frame (cutoff/abort) recovery path, and works per-frame so multiple
  tool calls in one message are each recovered.
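
As a rough illustration of the rewrite described above, the sketch below repairs each frame body before parsing it. It is not the code in hermes.ts: the regular expression, `repairHybridFrame`, `parseFrames`, and the `read` tool in the sample message are all assumptions made for this example, and only the complete-frame path is shown.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Assumed pattern: a body that opens with {"function=NAME", immediately
// followed by the "arguments" key. The lookahead leaves "arguments" in place.
const HYBRID_PREFIX = /^\s*\{\s*"function=([^"\\]+)"\s*,\s*(?="arguments"\s*:)/;

// Rewrites the hybrid prefix to {"name":"NAME", and leaves every other body
// untouched (String.prototype.replace returns the input when nothing matches).
function repairHybridFrame(body: string): string {
  return body.replace(
    HYBRID_PREFIX,
    (_whole: string, name: string) => `{"name":${JSON.stringify(name)},`,
  );
}

// Repairs and parses every <tool_call> frame in a message independently.
function parseFrames(text: string): { toolCalls: ToolCall[]; errors: string[] } {
  const toolCalls: ToolCall[] = [];
  const errors: string[] = [];
  const frame = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let m: RegExpExecArray | null;
  while ((m = frame.exec(text)) !== null) {
    const body = m[1] ?? "";
    try {
      const parsed = JSON.parse(repairHybridFrame(body)) as Partial<ToolCall>;
      if (
        typeof parsed.name === "string" &&
        typeof parsed.arguments === "object" &&
        parsed.arguments !== null
      ) {
        toolCalls.push({ name: parsed.name, arguments: parsed.arguments });
      } else {
        errors.push("PARSE_ERROR");
      }
    } catch {
      // A malformed arguments payload still ends up here after the rewrite.
      errors.push("PARSE_ERROR");
    }
  }
  return { toolCalls, errors };
}

const message = [
  "<tool_call>",
  '{"function=webfetch","arguments":{"url":"https://docs.opencode.ai","format":"markdown"}}',
  "</tool_call>",
  '<tool_call>{"name":"read","arguments":{"path":"README.md"}}</tool_call>',
].join("\n");

const result = parseFrames(message);
```

Under these assumptions, `result.toolCalls` holds both calls: the hybrid frame is recovered as `webfetch`, and the canonical frame passes through unchanged.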

Why the repair is intentionally narrow

The rewrite is anchored to the exact observed shape and re-serializes the name
with JSON.stringify, so well-formed JSON frames are never touched and a
malformed arguments payload still surfaces as a clean PARSE_ERROR rather
than crashing the parser. The following neighbouring shapes are deliberately
out of scope (no evidence the model emits them, and broadening would risk
accepting genuinely malformed output):

- {"function":"NAME","arguments":{...}} (valid JSON, wrong key)
- {"function":{"name":NAME,"arguments":{...}}} (nested OpenAI envelope)
- {"function=NAME","parameters":{...}} (parameters instead of arguments)
- reversed key order

If those start appearing in the wild, they can be added as additional anchored
cases.
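
To make the scope concrete, the sketch below checks each shape discussed above against an anchored pattern. The pattern is an assumption written for this example, not the expression used in hermes.ts, and the frame bodies are shortened stand-ins.

```typescript
// Assumed anchor: {"function=NAME" followed directly by the "arguments" key.
const HYBRID_SHAPE = /^\s*\{\s*"function=[^"\\]+"\s*,\s*"arguments"\s*:/;

const shapes: Array<{ label: string; body: string }> = [
  { label: "hybrid", body: '{"function=webfetch","arguments":{"url":"u"}}' },
  { label: "wrongKey", body: '{"function":"webfetch","arguments":{"url":"u"}}' },
  { label: "nestedEnvelope", body: '{"function":{"name":"webfetch","arguments":{"url":"u"}}}' },
  { label: "parametersKey", body: '{"function=webfetch","parameters":{"url":"u"}}' },
  { label: "reversedOrder", body: '{"arguments":{"url":"u"},"function=webfetch"}' },
  { label: "canonical", body: '{"name":"webfetch","arguments":{"url":"u"}}' },
];

// Only shapes matched by the anchor would be rewritten; the rest fall through
// to normal parsing and validation.
const inScope = shapes
  .filter((shape) => HYBRID_SHAPE.test(shape.body))
  .map((shape) => shape.label);
```

With this pattern only the `hybrid` entry is selected, so supporting a further shape would mean adding a second anchored pattern rather than loosening this one.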

🧪 How was it tested?

- Added a regression test for parseToolCalls(..., "qwen35") recovering a
  webfetch call from the "function=webfetch" hybrid frame.
- Verified the test fails before the parser change (errors.length === 1,
  toolCalls.length === 0) and passes after it.
- bun run test/unit/tool-parser.test.ts in packages/sdk: 85/85 pass.
- Manually exercised neighbouring shapes to confirm the recovery is scoped as
  intended: the hybrid frame, multi-frame messages, and the incomplete/cutoff
  frame all recover; the out-of-scope shapes listed above remain unrecovered;
  and canonical {"name":...} frames are unaffected.

📚 Background: why these frames appear (model specificity + runtime)

This is not a one-off corruption — it is a well-documented, model-specific
behaviour of the Qwen3.5/3.6 family, and the recovery here is the same class of
mitigation other local-LLM ecosystems have adopted.

- Qwen3.5 natively emits an XML-style tool-call format (<tool_call><function=NAME><parameter=KEY>VALUE</parameter></function></tool_call>),
  not JSON. The Qwen team confirms this is by design and that
  framework parsers are expected to convert the XML into structured JSON
  (QwenLM/Qwen3.6#125).
- Reliability is size/quant-dependent: in that same issue, Qwen3.5-4B is the
  most stable, 9B the least, 35B-A3B in between; the model also intermittently
  emits the tool call inside the reasoning/thinking channel, leaving
  finish_reason="stop" with empty tool_calls. The maintainers state a
  model-side fix was not available for 3.5 and recommend handling thought-state
  / parsing at the application layer (it is reportedly fixed in Qwen3.6).
- The {"function=NAME","arguments":{...}} frame this PR recovers is the
  cross-contamination of those two native formats (the XML <function=NAME>
  token bleeding into the JSON envelope), so it sits squarely in the
  documented failure space rather than being an isolated glitch.
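
For context, the XML-to-JSON conversion that framework parsers are expected to perform can be sketched as follows. This is a simplified illustration, not parseQwen35Format: `parseXmlToolCall` is a hypothetical function, parameter values are kept as trimmed strings without type coercion, and nested or malformed tags are not handled.

```typescript
interface XmlToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Converts one <function=NAME>...</function> block into a structured call.
// Returns null when the frame does not contain the XML function token.
function parseXmlToolCall(frame: string): XmlToolCall | null {
  const fn = /<function=([^>\s]+)>([\s\S]*?)<\/function>/.exec(frame);
  if (fn === null) return null;

  const inner = fn[2] ?? "";
  const args: Record<string, string> = {};
  const param = /<parameter=([^>\s]+)>([\s\S]*?)<\/parameter>/g;
  let m: RegExpExecArray | null;
  while ((m = param.exec(inner)) !== null) {
    // Each <parameter=KEY>VALUE</parameter> pair becomes one argument entry.
    args[m[1] ?? ""] = (m[2] ?? "").trim();
  }
  return { name: fn[1] ?? "", arguments: args };
}

const xmlFrame =
  "<tool_call><function=webfetch>" +
  "<parameter=url>https://docs.opencode.ai</parameter>" +
  "<parameter=format>markdown</parameter>" +
  "</function></tool_call>";

const xmlCall = parseXmlToolCall(xmlFrame);
```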

Other ecosystems hit and patched the exact same thing:

- claude-code-local documents identical XML-in-JSON hybrids causing re-prompt
  loops and ships a recover_garbled_tool_json() recovery plus retries and a
  lower sampling temperature
  (reliability notes).
- The vLLM/SGLang community converged on a corrected chat template
  (qwen3.5-enhanced.jinja) plus the qwen3_xml parser, which defers parsing
  until the full parameter block arrives and auto-heals missing closing tags,
  in preference to qwen3_coder, which corrupts on streamed nested JSON / truncation
  (allanchan339/vLLM-Qwen3-3.5-3.6-chat-template-fix,
  NVIDIA DGX forum,
  r/LocalLLaMA thread).

🧭 Path forward for addon teams (lower-level handling)

This SDK-level repair is a deliberate, model-agnostic safety net. The more
robust long-term fixes belong below the SDK, in the inference addons / runtime,
and are worth tracking for the llamacpp addon team:

- Grammar-constrained decoding. Constrain generation with a GBNF grammar so
  the model can only emit well-formed tool frames in the first place. This is
  exactly the direction of the in-progress llama.cpp "autoparser" work, which
  derives both a parser and a lazy grammar from the model's chat template and
  already classifies the relevant shapes (JSON_NATIVE, TAG_WITH_JSON,
  TAG_WITH_TAGGED, plus a fun_name_is_key flag)
  (ggml-org/llama.cpp pwilkin:autoparser).
- Template correctness. Ship a chat template that matches the model's native
  XML tool format and keeps reasoning separate, so tool calls don't leak into
  the thinking channel (the qwen3.5-enhanced.jinja lesson above).
- Deferred / auto-healing parsing at the runtime (the qwen3_xml approach)
  rather than streaming JSON parsing that corrupts on partial frames.
- Sampling for tool turns. Lower temperature for tool-heavy steps measurably
  improves format consistency (claude-code-local uses 0.2).

Until the runtime/grammar path lands, keeping this recovery as defense-in-depth
prevents silent tool-call drops for the affected Qwen builds.
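
The deferred-parsing idea from the list above can be illustrated with a small buffer that withholds a frame until its closing tag has arrived. This is a sketch of the general technique only, not the qwen3_xml parser or any runtime API: `DeferredFrameBuffer` is a hypothetical class, it does not auto-heal missing closing tags, and it discards text outside frames for brevity.

```typescript
const OPEN_TAG = "<tool_call>";
const CLOSE_TAG = "</tool_call>";

// Accumulates streamed chunks and releases a frame body only once the
// closing tag has been seen, so no partial frame is ever parsed.
class DeferredFrameBuffer {
  private buffer = "";

  // Returns the bodies of all frames completed by this chunk (possibly none).
  push(chunk: string): string[] {
    this.buffer += chunk;
    const complete: string[] = [];
    let end = this.buffer.indexOf(CLOSE_TAG);
    while (end !== -1) {
      // Find the opening tag that precedes this closing tag.
      const start = this.buffer.lastIndexOf(OPEN_TAG, end);
      if (start !== -1) {
        complete.push(this.buffer.slice(start + OPEN_TAG.length, end).trim());
      }
      // Drop everything up to and including the closing tag.
      this.buffer = this.buffer.slice(end + CLOSE_TAG.length);
      end = this.buffer.indexOf(CLOSE_TAG);
    }
    return complete;
  }
}

const frames = new DeferredFrameBuffer();
const afterFirstChunk = frames.push('<tool_call>{"name":"web');
const afterSecondChunk = frames.push('fetch","arguments":{}}</tool_call>');
```

Here `afterFirstChunk` is empty because the frame is still open, and `afterSecondChunk` contains the single completed body, which can then be handed to a frame parser in one piece.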

Document why repairFunctionEqualsJson exists: Qwen3.5/3.6 can fuse its XML and JSON tool templates into a single `{"function=NAME","arguments":{...}}` frame, and the repair is intentionally scoped to that exact shape so well-formed JSON frames are never rewritten.

simon-iribarren · 2026-06-18T08:01:05Z

/review

github-actions · 2026-06-18T08:01:39Z

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

* fix: recover Qwen hybrid tool-call frames (QVAC-19908)
* QVAC-19908 doc: explain Qwen hybrid tool-call frame origin

Document why repairFunctionEqualsJson exists: Qwen3.5/3.6 can fuse its XML and JSON tool templates into a single `{"function=NAME","arguments":{...}}` frame, and the repair is intentionally scoped to that exact shape so well-formed JSON frames are never rewritten.

(cherry picked from commit 4b3ac99)

Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog.

Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog. (cherry picked from commit 92da258)

Fix-only patch release: recover malformed Qwen hybrid tool-call frames (tetherto#2677). Bumps @qvac/sdk and @qvac/bare-sdk to 0.13.4 and adds the 0.13.4 changelog. (cherry picked from commit 8fbb222)

fix: recover Qwen hybrid tool-call frames (QVAC-19908)

c9525de

simon-iribarren requested review from a team as code owners June 17, 2026 19:52

simon-iribarren added the tier1 and verified (Authorize secrets / label-gate in PR workflows) labels Jun 17, 2026

simon-iribarren mentioned this pull request Jun 17, 2026

QVAC-19908 chore: release opencode-plugin 0.1.0 #2676

Merged

simon-iribarren temporarily deployed to release June 17, 2026 19:53 — with GitHub Actions Inactive

simon-iribarren temporarily deployed to release June 17, 2026 19:54 — with GitHub Actions Inactive

Merge branch 'main' into fix/qvac-19908-qwen-hybrid-tool-call

d0350d2

simon-iribarren temporarily deployed to release June 17, 2026 20:06 — with GitHub Actions Inactive

simon-iribarren temporarily deployed to release June 17, 2026 20:24 — with GitHub Actions Inactive

simon-iribarren temporarily deployed to release June 17, 2026 20:25 — with GitHub Actions Inactive

arun-mani-j approved these changes Jun 18, 2026

View reviewed changes

opaninakuffo approved these changes Jun 18, 2026

View reviewed changes

Merge branch 'main' into fix/qvac-19908-qwen-hybrid-tool-call

6405ccd

simon-iribarren temporarily deployed to release June 18, 2026 07:59 — with GitHub Actions Inactive

simon-iribarren merged commit 4b3ac99 into tetherto:main Jun 18, 2026
18 checks passed

simon-iribarren temporarily deployed to release June 18, 2026 08:03 — with GitHub Actions Inactive

simon-iribarren mentioned this pull request Jun 18, 2026

QVAC-19908 chore: release sdk + bare-sdk 0.13.4 — Qwen tool-call recovery, subscribeServerLogs, validation errors #2679

Closed

simon-iribarren mentioned this pull request Jun 18, 2026

QVAC-19908 chore: release sdk + bare-sdk 0.13.4 — recover Qwen hybrid tool-call frames #2681

Merged

simon-iribarren mentioned this pull request Jun 18, 2026

QVAC-19908 chore[skiplog]: backmerge release-sdk-0.13.4 — version bump + changelog #2680

Merged
