QVAC-18717 feat[api]: add Qwen3.5, Gemma4 tool-call dialects and reasoning_budget param #1974
Merged
gianni-cor merged 19 commits on May 12, 2026
…t param
- Extend toolDialectSchema with 'qwen35' and 'gemma4' values
- Add Qwen3.5 Pythonic-XML parser (qwen35.ts): <tool_call><function=NAME>
<parameter=KEY>VALUE</parameter></function></tool_call>; string values are
raw text, arrays/objects are JSON; type coercion from tool schema
- Add Gemma4 native parser (gemma4native.ts): <|tool_call>call:NAME{...}<tool_call|>;
JS-literal args with <|"|> quote tokens, split-then-transliterate approach
to safely quote bare keys without corrupting string values containing ', key:'
- Wire both parsers into parser.ts dispatch and the default catch-all chain
- Add dialect specs to completion-normalizer.ts: qwen35 reuses <tool_call>
framing; gemma4 has asymmetric <|tool_call>/<tool_call|> + thinking frames
- Auto-detect qwen35/gemma4 from model name/path in dialect.ts with guards
against Gemma3+Q4 quant suffix and Qwen3 5B parameter-count collisions
- Add reasoning_budget (-1 | 0) to LlmConfig (load-time) and GenerationParams
(per-request); passes through transformLlmConfig unchanged (snake_case key
bypasses camelCase regex, number-to-string conversion handles the value)
- Mirror reasoning_budget in CLI SDKGenerationParams type
- Add tests-qvac completion tests for reasoning_budget passthrough
- Add tool-calling examples for qwen35 and gemma4 in examples/tools/
- Bump @qvac/llm-llamacpp to ^0.20.0 (adds reasoning_budget and new model
support shipped in fabric-8189)
…udget to completion-executor
llamacpp 8189+ (in @qvac/llm-llamacpp@0.20.0) removed --system-prompt from its CLI argument parser. The SDK was forwarding system_prompt through transformLlmConfig, causing all model loads to fail with 'invalid argument: --system-prompt'.
system_prompt is JS-only: completion-stream.ts reads it to seed the conversation history. It has no meaning at the C++ level and must be excluded alongside modelType.
Also mirrors reasoning_budget in completion-executor.ts GenerationParams so the new tests-qvac reasoning_budget tests type-check correctly.
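The exclusion described in this commit can be sketched as follows. This is a minimal illustration, not the code in the SDK: transformLlmConfigSketch, JS_ONLY_KEYS, camelToSnake and the ctxSize key are hypothetical, and the camelCase-to-snake_case rewrite is an assumption about what the key regex does.

```typescript
// Hypothetical sketch; the real transformLlmConfig may differ.
type LlmConfigValue = string | number | boolean | undefined;

// Keys consumed on the JS side only and never forwarded to the C++ layer.
const JS_ONLY_KEYS = new Set(["system_prompt", "modelType"]);

// Assumption: camelCase keys are rewritten to snake_case for the native side.
// A key that is already snake_case has no uppercase letters and passes through.
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
}

function transformLlmConfigSketch(
  config: Record<string, LlmConfigValue>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(config)) {
    // Skip JS-only keys and unset values.
    if (JS_ONLY_KEYS.has(key) || value === undefined) continue;
    // Native arguments are passed as strings, so numbers are stringified.
    out[camelToSnake(key)] = String(value);
  }
  return out;
}
```

With this shape, reasoning_budget: 0 would reach the native side as the string "0", while system_prompt and modelType would be dropped before the argument list is built.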
…on tests
- Drop the over-broad qwen.*3\.5 alternative from the qwen35 regex and tighten the lookahead to (?![a-z0-9]) so qwen3-50b-instruct no longer false-matches as qwen35
- Tighten gemma4 lookahead to (?=[^a-z0-9]|$) so gemma-40b no longer false-matches as gemma4
- Extract transformLlmConfig to transform.ts (no addon imports) so it can be unit-tested without the native addon loading
- Add llm-plugin-transform.test.ts pinning that system_prompt and modelType are never forwarded to C++ and that reasoning_budget survives
- Add negative test cases for qwen3-50b and gemma-40b to tool-parser.test.ts
- Fix stale default-chain comment in parser.ts (was 'Harmony first', actual order is Gemma4 first)
- Add inline justification for qwen35/gemma4 fallback asymmetry
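To illustrate why the two lookaheads matter, here is a small sketch. Only the lookahead fragments (?![a-z0-9]) and (?=[^a-z0-9]|$) come from the commit message; the rest of each pattern, and the names QWEN35_RE, GEMMA4_RE and detectDialectSketch, are assumptions rather than the patterns used in dialect.ts. The sketch covers the 3.5 spelling only.

```typescript
// Hypothetical patterns; only the lookahead guards are taken from the commit message.
const QWEN35_RE = /qwen[-_ ]?3[._-]?5(?![a-z0-9])/i;
const GEMMA4_RE = /gemma[-_ ]?4(?=[^a-z0-9]|$)/i;

type DetectedDialect = "qwen35" | "gemma4" | undefined;

function detectDialectSketch(modelNameOrPath: string): DetectedDialect {
  // The digit must not be followed by another letter or digit, so
  // "qwen3-50b" (50B parameters) and "gemma-40b" are not treated as 3.5 / 4.
  if (QWEN35_RE.test(modelNameOrPath)) return "qwen35";
  if (GEMMA4_RE.test(modelNameOrPath)) return "gemma4";
  return undefined;
}
```

Under these assumed patterns, a name such as Qwen3.5-9B-Q4_K_M.gguf resolves to qwen35, while qwen3-50b-instruct, qwen3-5b and gemma-40b resolve to no dialect and fall through to the default parser chain.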
…teToolsTest
ToolsExecutor.generic now reads toolDialect (forwarded to completion()) and resourceKey (selects which loaded model to use) from test params. The createToolsTest helper accepts both as optional options, so dialect-specific e2e test definitions can be added once the model constants are available from update-models.
…ool names, add qwen35 to default parser chain
- coerceParamValue: reject empty/whitespace-only numeric params before Number() for both
number and integer types; Number("") === 0 caused silent semantic corruption
- gemma4native callRegex and bare-key quoting regex: broaden [A-Za-z_]\w* to
[A-Za-z_][\w-]* so hyphenated tool names (and param keys) are matched instead
of returning matched=false and leaking raw frame markers as contentDelta
- pickFormatParsers default chain: insert parseQwen35Format ahead of parseHermesFormat
so raw Qwen XML payloads are recovered when the model-name heuristic misses
- regression tests for all three cases
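The numeric guard from the first bullet can be sketched like this. It is a minimal illustration and not the actual coerceParamValue: the name coerceNumericSketch, the result shape and the error strings are hypothetical, and rejecting non-finite values is an added assumption.

```typescript
// Hypothetical sketch of the numeric branch only.
type NumericType = "number" | "integer";
type Coerced = { ok: true; value: number } | { ok: false; error: string };

function coerceNumericSketch(raw: string, type: NumericType): Coerced {
  // Number("") and Number("   ") both evaluate to 0, so blank input
  // must be rejected before Number() is called.
  if (raw.trim() === "") {
    return { ok: false, error: `empty value for ${type} parameter` };
  }
  const value = Number(raw);
  // Assumption: NaN and Infinity are treated as parse errors.
  if (!Number.isFinite(value)) {
    return { ok: false, error: `not a ${type}: ${raw}` };
  }
  // Integer-typed parameters reject non-integer floats such as "3.5".
  if (type === "integer" && !Number.isInteger(value)) {
    return { ok: false, error: `not an integer: ${raw}` };
  }
  return { ok: true, value };
}
```

Returning a tagged result instead of throwing keeps the caller free to map the failure onto whatever error shape the parser already uses.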
simon-iribarren
previously approved these changes
May 12, 2026
NamelsKing
previously approved these changes
May 12, 2026
Contributor (Author): /review
32791a3
Contributor: /review
opaninakuffo
approved these changes
May 12, 2026
NamelsKing
approved these changes
May 12, 2026
Contributor: /review
opaninakuffo
added a commit
that referenced
this pull request
May 14, 2026
Bump @qvac/cli to 0.4.0 and add the v0.4.0 changelog set. Includes all 5 cli-scoped PRs landed on release-cli-0.4.0 since cli-v0.3.0:
- QVAC-18677 feat[api]: qvac verify deps (#1969)
- QVAC-18717 feat[api]: Qwen3.5 / Gemma4 tool-call dialects + reasoning_budget (#1974)
- QVAC-18678 feat[api]: qvac verify bundle (#1984)
- QVAC-18730 feat[api]: POST /v1/images/generations on qvac serve (#2008)
- chore: consolidate PR templates and hide style note in HTML comment (#1924)
PR #1924's title lacked a ticket or [notask], so the changelog generator's strict validator dropped it. It is added manually under the Chores section to keep the changelog truthful to what shipped on release-cli-0.4.0.
opaninakuffo
added a commit
that referenced
this pull request
May 14, 2026
Bump @qvac/cli to 0.4.0 and add the v0.4.0 changelog set. Includes all 5 cli-scoped PRs landed on release-cli-0.4.0 since cli-v0.3.0:
- QVAC-18677 feat[api]: qvac verify deps (#1969)
- QVAC-18717 feat[api]: Qwen3.5 / Gemma4 tool-call dialects + reasoning_budget (#1974)
- QVAC-18678 feat[api]: qvac verify bundle (#1984)
- QVAC-18730 feat[api]: POST /v1/images/generations on qvac serve (#2008)
- chore: consolidate PR templates and hide style note in HTML comment (#1924)
PR #1924's title lacked a ticket or [notask], so the changelog generator's strict validator dropped it. It is added manually under the Chores section to keep the changelog truthful to what shipped on release-cli-0.4.0.
(cherry picked from commit 22462c8)
🎯 What problem does this PR solve?
- @qvac/llm-llamacpp@0.20.0 (llamacpp 8189+) broke all model loads: system_prompt from LlmConfig was forwarded to the C++ arg parser as --system-prompt, which was removed in that release.
- … reasoning_budget parameter introduced in @qvac/llm-llamacpp@0.20.0.
📝 How does it solve it?
- Qwen3.5 parser: <tool_call><function=NAME><parameter=KEY>VALUE</parameter></function></tool_call>. String values are raw text; arrays/objects are JSON-parsed; integers reject non-integer floats. Errors surface as PARSE_ERROR (matches hermes/pythonic pattern).
- Gemma4 parser: <|tool_call>call:NAME{key:<|"|>val<|"|>,...}<tool_call|>. Splits on the <|"|> delimiter and quotes bare keys only in structural parts, so ', key:' patterns inside string values are never misquoted as object keys.
- Wire both parsers into parser.ts.
- Add dialect specs to completion-normalizer.ts: qwen35 reuses <tool_call> framing; gemma4 uses asymmetric <|tool_call>/<tool_call|> + thinking channel frames.
- Auto-detect qwen35/gemma4 from model name/path in dialect.ts with guards against Q4_K_M/5b quantization/size suffix collision and Qwen3 5B parameter-count collision.
- Add reasoning_budget: -1 | 0 to LlmConfig (load-time) and GenerationParams (per-request). Passes through transformLlmConfig unchanged.
- Mirror reasoning_budget as boolean in the CLI SDKGenerationParams interface (true → -1, false → 0); extractGenerationParams parses it from the request body.
- Fix system_prompt being forwarded to the C++ arg parser: system_prompt is JS-only (used by completion-stream.ts to seed conversation history). It is now excluded from transformLlmConfig alongside modelType.
- Add completion-reasoning-budget-disabled and completion-reasoning-budget-unrestricted to tests-qvac.
- Add tool-calling examples in examples/tools/.
- Pass toolDialect and resourceKey through ToolsExecutor and createToolsTest so dialect-specific e2e tests can be added once model constants are available.
- Bump @qvac/llm-llamacpp to ^0.20.0.
🧪 How was it tested?
- tool-parser.test.ts: includes regression tests for integer rejection, array/object PARSE_ERROR propagation, and all dialect negative-case coverage.
- translate.test.ts: includes reasoning_budget boolean extraction tests.
- completion-reasoning-budget-disabled and completion-reasoning-budget-unrestricted added to e2e suite.
- llamacpp-tools-qwen35.ts and llamacpp-tools-gemma4.ts verified locally with Bare runtime.
🔌 API Changes
Qwen3.5 / Qwen3.6 — dialect auto-detected from model name/path:
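For orientation, the payload a Qwen3.5-family model emits under this dialect could be parsed roughly as follows. This is a minimal sketch and not the code in qwen35.ts: parseQwen35Sketch, ParamType, ParsedCall and the get_weather tool are hypothetical, only string and array/object parameters are handled, and trimming whitespace around values is an assumption.

```typescript
// Hypothetical types and names; not the actual qwen35.ts implementation.
type ParamType = "string" | "array" | "object";

interface ParsedCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Matches one <tool_call><function=NAME>...</function></tool_call> frame.
const CALL_RE =
  /<tool_call>\s*<function=([^>\s]+)>([\s\S]*?)<\/function>\s*<\/tool_call>/g;
// Matches one <parameter=KEY>VALUE</parameter> pair inside a frame.
const PARAM_RE = /<parameter=([^>\s]+)>([\s\S]*?)<\/parameter>/g;

function parseQwen35Sketch(
  text: string,
  paramTypes: Record<string, ParamType> = {},
): ParsedCall[] {
  const calls: ParsedCall[] = [];
  for (const call of text.matchAll(CALL_RE)) {
    const args: Record<string, unknown> = {};
    for (const param of call[2].matchAll(PARAM_RE)) {
      const key = param[1];
      const raw = param[2].trim(); // assumption: surrounding whitespace is not significant
      const type = paramTypes[key] ?? "string";
      // Strings stay raw text; arrays/objects are JSON-parsed (throws on invalid JSON).
      args[key] = type === "string" ? raw : JSON.parse(raw);
    }
    calls.push({ name: call[1], arguments: args });
  }
  return calls;
}

// Usage with a hypothetical tool whose "days" parameter is declared as an array.
const qwenCalls = parseQwen35Sketch(
  "<tool_call><function=get_weather><parameter=city>Paris</parameter>" +
    "<parameter=days>[1, 2]</parameter></function></tool_call>",
  { days: "array" },
);
```

The schema lookup is what lets Paris stay an unquoted string while [1, 2] becomes a real array; without type information the two cannot be told apart reliably.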
Gemma4 — dialect auto-detected from model name/path:
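The split-then-transliterate idea behind the Gemma4 frame can be sketched as below. This is an illustration under assumptions, not the code in gemma4native.ts: parseGemma4Sketch, Gemma4Call and the get-weather tool are hypothetical, and edge cases such as a frame terminator inside a string value are ignored.

```typescript
// Hypothetical sketch; not the actual gemma4native.ts implementation.
interface Gemma4Call {
  name: string;
  arguments: Record<string, unknown>;
}

const QUOTE = '<|"|>';
// Tool names and keys may contain hyphens, hence [\w-]* rather than \w*.
const FRAME_RE =
  /<\|tool_call>call:([A-Za-z_][\w-]*)(\{[\s\S]*?\})<tool_call\|>/g;
const BARE_KEY_RE = /([{,]\s*)([A-Za-z_][\w-]*)(\s*:)/g;

function parseGemma4Sketch(text: string): Gemma4Call[] {
  const calls: Gemma4Call[] = [];
  for (const frame of text.matchAll(FRAME_RE)) {
    // Splitting on the quote token alternates structural text (even index)
    // with string bodies (odd index).
    const json = frame[2]
      .split(QUOTE)
      .map((part, i) =>
        i % 2 === 0
          ? part.replace(BARE_KEY_RE, '$1"$2"$3') // structural: quote bare keys
          : JSON.stringify(part), // string body: quoted verbatim, never rewritten
      )
      .join("");
    calls.push({ name: frame[1], arguments: JSON.parse(json) });
  }
  return calls;
}

// The string value deliberately contains ", key:" to show it is left untouched.
const gemmaCalls = parseGemma4Sketch(
  '<|tool_call>call:get-weather{city:<|"|>Paris, key: x<|"|>,days:3}<tool_call|>',
);
```

Because key quoting is applied only to the even-indexed parts, text inside a string value never passes through the bare-key regex, which is the property the PR description calls out.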
reasoning_budget— load-time default and per-request override:
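A rough sketch of how the two levels and the CLI boolean could fit together, assuming the per-request value wins over the load-time default when both are set. The names ReasoningBudget, toReasoningBudget, extractReasoningBudgetSketch and resolveReasoningBudget are hypothetical and are not SDK exports; the meaning of the two values follows the test names (unrestricted and disabled).

```typescript
// -1: reasoning unrestricted, 0: reasoning disabled.
type ReasoningBudget = -1 | 0;

// CLI-side mapping described in the PR: true -> -1, false -> 0.
function toReasoningBudget(flag: boolean): ReasoningBudget {
  return flag ? -1 : 0;
}

// Hypothetical stand-in for the request-body parsing step.
function extractReasoningBudgetSketch(
  body: Record<string, unknown>,
): ReasoningBudget | undefined {
  const flag = body["reasoning_budget"];
  return typeof flag === "boolean" ? toReasoningBudget(flag) : undefined;
}

// Assumption: a per-request value overrides the load-time default.
function resolveReasoningBudget(
  loadTimeDefault: ReasoningBudget | undefined,
  perRequest: ReasoningBudget | undefined,
): ReasoningBudget | undefined {
  // ?? rather than || so that an explicit 0 is not mistaken for "unset".
  return perRequest ?? loadTimeDefault;
}
```

The nullish-coalescing operator matters here: with ||, a request that explicitly disables reasoning (0) would silently fall back to the load-time default.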