[TRTLLM-10077][feat] Add 'auto' option for tool and reasoning parsers#12104

Open
JunyiXu-nv wants to merge 1 commit into NVIDIA:main from JunyiXu-nv:dev-junyix-feat-serve-auto-parser

Conversation

@JunyiXu-nv
Collaborator

@JunyiXu-nv JunyiXu-nv commented Mar 11, 2026

Add automatic parser selection for --tool_parser and --reasoning_parser based on the model's HF config model_type.

Tool parser auto-detection mapping:

  • qwen2/qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
  • deepseek_v3 -> deepseek_v3
  • deepseek_v32 -> deepseek_v32
  • kimi_k2/kimi_k25 -> kimi_k2
  • glm4 -> glm4

Reasoning parser auto-detection mapping:

  • qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
  • deepseek_v3/deepseek_v32 -> deepseek-r1 (only if model name contains 'R1')
  • nemotron_h -> nano-v3

For unrecognized models, a clear error is displayed listing supported model types and available parsers.
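The mappings above suggest an implementation along these lines. This is a hedged sketch reconstructed from the description, not the merged code; the real tables and resolver functions live in tool_parser_factory.py and reasoning_parser.py, and the helper name `_model_type` is illustrative:

```python
import json
from pathlib import Path
from typing import Optional

# Mappings as described in the PR text above (sketch only).
MODEL_TYPE_TO_TOOL_PARSER = {
    "qwen2": "qwen3", "qwen3": "qwen3", "qwen3_moe": "qwen3",
    "qwen3_5": "qwen3", "qwen3_5_moe": "qwen3", "qwen3_next": "qwen3",
    "deepseek_v3": "deepseek_v3",
    "deepseek_v32": "deepseek_v32",
    "kimi_k2": "kimi_k2", "kimi_k25": "kimi_k2",
    "glm4": "glm4",
}

MODEL_TYPE_TO_REASONING_PARSER = {
    "qwen3": "qwen3", "qwen3_moe": "qwen3", "qwen3_5": "qwen3",
    "qwen3_5_moe": "qwen3", "qwen3_next": "qwen3",
    "deepseek_v3": "deepseek-r1", "deepseek_v32": "deepseek-r1",
    "nemotron_h": "nano-v3",
}

def _model_type(model: str) -> Optional[str]:
    """Read model_type from <model>/config.json, or None if it is missing."""
    config_path = Path(model) / "config.json"
    if not config_path.exists():
        return None
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("model_type")

def resolve_auto_tool_parser(model: str) -> Optional[str]:
    model_type = _model_type(model)
    return MODEL_TYPE_TO_TOOL_PARSER.get(model_type) if model_type else None

def resolve_auto_reasoning_parser(model: str) -> Optional[str]:
    model_type = _model_type(model)
    # deepseek-r1 parser only applies when the model path names an R1 checkpoint.
    if model_type in ("deepseek_v3", "deepseek_v32") and "r1" not in model.lower():
        return None
    return MODEL_TYPE_TO_REASONING_PARSER.get(model_type) if model_type else None
```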

Made-with: Cursor

Summary by CodeRabbit

  • New Features
    • Enhanced the serve command with automatic detection capabilities for tool parser and reasoning parser configuration. Users can now set these options to "auto" to enable intelligent, model-aware parser selection without manual specification.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Add automatic parser selection for --tool_parser and --reasoning_parser
based on the model's HF config model_type.

Tool parser auto-detection mapping:
- qwen2/qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
- deepseek_v3 -> deepseek_v3
- deepseek_v32 -> deepseek_v32
- kimi_k2/kimi_k25 -> kimi_k2
- glm4 -> glm4

Reasoning parser auto-detection mapping:
- qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
- deepseek_v3/deepseek_v32 -> deepseek-r1 (only if model name contains 'R1')
- nemotron_h -> nano-v3

For unrecognized models, a clear error is displayed listing supported
model types and available parsers.

Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Made-with: Cursor
@JunyiXu-nv JunyiXu-nv requested review from LinPoly and arysef March 11, 2026 07:39
@JunyiXu-nv JunyiXu-nv requested a review from a team as a code owner March 11, 2026 07:39
@JunyiXu-nv JunyiXu-nv requested review from QiJune and syuoni March 11, 2026 07:39
@coderabbitai
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

The changes introduce auto-detection support for tool and reasoning parsers in the serve command by adding resolver functions that read model configuration files, map model types to appropriate parsers, and integrate these results into the CLI workflow with "auto" as a new option.

Changes

Serve Command Auto-Detection (tensorrt_llm/commands/serve.py)
  Adds auto-detection wiring for the tool_parser and reasoning_parser CLI options. Imports the resolver functions and extends the CLI choices to include "auto". When "auto" is selected, calls the respective resolver to detect an appropriate parser and logs the result, raising BadParameter if detection fails.
Reasoning Parser Resolution (tensorrt_llm/llmapi/reasoning_parser.py)
  Introduces the MODEL_TYPE_TO_REASONING_PARSER mapping and resolve_auto_reasoning_parser(). Reads the model's config.json and maps model_type to a parser, with DeepSeek-specific logic requiring "r1" in the path for v3/v32 models. Updates ReasoningParserFactory.parsers to include "nano-v3" and "qwen3" mappings.
Tool Parser Resolution (tensorrt_llm/serve/tool_parser/tool_parser_factory.py)
  Introduces the MODEL_TYPE_TO_TOOL_PARSER mapping and resolve_auto_tool_parser(). Reads config.json from the model path, extracts model_type, and returns the corresponding tool parser, or None if config.json is missing.
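The serve.py wiring summarized above can be sketched with click as follows. The registry and resolver here are simplified stand-ins for the real ToolParserFactory and resolve_auto_tool_parser, so treat the details as an approximation:

```python
import click

# Stand-in parser registry; the real code uses ToolParserFactory.parsers.
TOOL_PARSERS = {"qwen3": object, "deepseek_v3": object, "kimi_k2": object, "glm4": object}

def resolve_auto_tool_parser(model: str):
    # Sketch only: the real resolver reads model_type from the model's
    # config.json. Here, pretend qwen-named models resolve and others fail.
    return "qwen3" if "qwen" in model.lower() else None

@click.command()
@click.argument("model")
@click.option("--tool_parser",
              type=click.Choice(["auto", *TOOL_PARSERS.keys()]),
              default=None)
def serve(model: str, tool_parser: str):
    if tool_parser == "auto":
        detected = resolve_auto_tool_parser(model)
        if detected is None:
            # Detection failure surfaces as a CLI error, per the walkthrough.
            raise click.BadParameter(
                f"Cannot auto-detect tool parser for model '{model}'. "
                f"Please specify a parser explicitly: {list(TOOL_PARSERS)}",
                param_hint="--tool_parser")
        click.echo(f"Auto-detected tool parser: {detected}")
        tool_parser = detected
    click.echo(f"Serving {model} with tool_parser={tool_parser}")
```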

Sequence Diagram

sequenceDiagram
    actor User
    participant CLI as Serve Command<br/>(serve.py)
    participant ReasoningResolver as Reasoning Resolver<br/>(reasoning_parser.py)
    participant ToolResolver as Tool Resolver<br/>(tool_parser_factory.py)
    participant ConfigReader as Model Config
    participant Server as Server

    User->>CLI: Serve with reasoning_parser="auto"
    CLI->>ReasoningResolver: Call resolve_auto_reasoning_parser(model)
    ReasoningResolver->>ConfigReader: Read config.json
    ConfigReader-->>ReasoningResolver: model_type value
    ReasoningResolver->>ReasoningResolver: Map model_type to parser<br/>(with DeepSeek checks)
    ReasoningResolver-->>CLI: Return parser name or None
    CLI->>CLI: Log auto-detection result

    User->>CLI: Serve with tool_parser="auto"
    CLI->>ToolResolver: Call resolve_auto_tool_parser(model)
    ToolResolver->>ConfigReader: Read config.json
    ConfigReader-->>ToolResolver: model_type value
    ToolResolver->>ToolResolver: Map model_type to parser
    ToolResolver-->>CLI: Return parser name or None
    CLI->>CLI: Log auto-detection result

    CLI->>Server: Configure with detected parsers
    Server-->>User: Server ready

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 75.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Description check (❓ Inconclusive): The PR description details the auto-detection mappings but lacks the required title format, detailed test coverage, and explicit checklist validation. Resolution: add a title in the [JIRA/Issue][type] format, document test cases for the auto-detection logic, and verify the checklist items, especially test coverage and coding guidelines.

✅ Passed checks (1 passed)

  • Title check (✅ Passed): The title accurately and concisely summarizes the main feature: adding an 'auto' option for tool and reasoning parsers based on model type.

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (5)
tensorrt_llm/serve/tool_parser/tool_parser_factory.py (2)

29-39: Consider extracting shared config-loading logic.

Both resolve_auto_tool_parser and resolve_auto_reasoning_parser (in reasoning_parser.py) share identical config-loading logic. Consider extracting a shared helper function (e.g., get_model_type_from_config(model: str) -> Optional[str]) to reduce duplication.
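The suggested helper might look like this. It is a sketch of the reviewer's proposal (name and behavior taken from the suggestion, not the merged code), folding in the explicit encoding and narrow JSON error handling raised in the other comments:

```python
import json
from pathlib import Path
from typing import Optional

def get_model_type_from_config(model: str) -> Optional[str]:
    """Return model_type from <model>/config.json, or None if the file is
    missing or malformed. Centralizes the file IO shared by
    resolve_auto_tool_parser and resolve_auto_reasoning_parser."""
    config_path = Path(model) / "config.json"
    if not config_path.exists():
        return None
    with open(config_path, encoding="utf-8") as f:
        try:
            config = json.load(f)
        except json.JSONDecodeError:
            return None
    return config.get("model_type", "")
```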


35-36: Add explicit encoding and handle JSON parsing errors.

Same suggestion as for resolve_auto_reasoning_parser: specify encoding="utf-8" and handle json.JSONDecodeError.

♻️ Proposed fix
-    with open(config_path) as f:
-        config = json.load(f)
+    with open(config_path, encoding="utf-8") as f:
+        try:
+            config = json.load(f)
+        except json.JSONDecodeError:
+            return None
tensorrt_llm/llmapi/reasoning_parser.py (1)

132-133: Add explicit encoding and handle JSON parsing errors.

The open() call should specify encoding="utf-8" for consistency across platforms. Additionally, consider handling json.JSONDecodeError to provide a clearer error message if the config file is malformed.

♻️ Proposed fix
-    with open(config_path) as f:
-        config = json.load(f)
+    with open(config_path, encoding="utf-8") as f:
+        try:
+            config = json.load(f)
+        except json.JSONDecodeError:
+            return None

As per coding guidelines: "When using try-except blocks in Python, limit the except to the smallest set of errors possible."

tensorrt_llm/commands/serve.py (2)

770-796: Error message lists may drift from source of truth.

The supported model types listed in the error messages are hardcoded strings. If MODEL_TYPE_TO_TOOL_PARSER or MODEL_TYPE_TO_REASONING_PARSER are updated later, these error messages could become stale.

Consider generating the list dynamically from the mapping keys:

♻️ Proposed fix for tool_parser error message
+from tensorrt_llm.serve.tool_parser.tool_parser_factory import MODEL_TYPE_TO_TOOL_PARSER
+from tensorrt_llm.llmapi.reasoning_parser import MODEL_TYPE_TO_REASONING_PARSER

Then in the error handling:

         raise click.BadParameter(
             f"Cannot auto-detect tool parser for model '{model}'. "
-            f"Supported model types for auto-detection: qwen2, qwen3, "
-            f"qwen3_moe, qwen3_5, qwen3_5_moe, qwen3_next, deepseek_v3, "
-            f"deepseek_v32, kimi_k2, kimi_k25, glm4. "
+            f"Supported model types for auto-detection: {', '.join(MODEL_TYPE_TO_TOOL_PARSER.keys())}. "
             f"Please specify a parser explicitly: "
             f"{list(ToolParserFactory.parsers.keys())}",
             param_hint="--tool_parser")
         raise click.BadParameter(
             f"Cannot auto-detect reasoning parser for model '{model}'. "
-            f"Supported model types for auto-detection: qwen3, qwen3_moe, "
-            f"qwen3_5, qwen3_5_moe, qwen3_next, deepseek_v3 (R1 only), "
-            f"deepseek_v32 (R1 only), nemotron_h. "
+            f"Supported model types for auto-detection: {', '.join(MODEL_TYPE_TO_REASONING_PARSER.keys())} "
+            f"(deepseek_v3/v32 require 'R1' in model name). "
             f"Please specify a parser explicitly: "
             f"{list(ReasoningParserFactory.parsers.keys())}",
             param_hint="--reasoning_parser")

665-665: Consider using iterable unpacking for cleaner syntax.

Static analysis suggests using iterable unpacking instead of list concatenation.

♻️ Proposed fix
 @click.option(
     "--reasoning_parser",
-    type=click.Choice(["auto"] + list(ReasoningParserFactory.parsers.keys())),
+    type=click.Choice(["auto", *ReasoningParserFactory.parsers.keys()]),
     default=None,
 @click.option(
     "--tool_parser",
-    type=click.Choice(["auto"] + list(ToolParserFactory.parsers.keys())),
+    type=click.Choice(["auto", *ToolParserFactory.parsers.keys()]),
     default=None,

Also applies to: 673-673


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6b8d21a9-58e3-4d1a-a0d4-bb79fb2a7b0b

📥 Commits

Reviewing files that changed from the base of the PR and between f7255e0 and 4144968.

📒 Files selected for processing (3)
  • tensorrt_llm/commands/serve.py
  • tensorrt_llm/llmapi/reasoning_parser.py
  • tensorrt_llm/serve/tool_parser/tool_parser_factory.py

@JunyiXu-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #38561 [ run ] triggered by Bot. Commit: 4144968 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #38561 [ run ] completed with state SUCCESS. Commit: 4144968
/LLM/main/L0_MergeRequest_PR pipeline #29903 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

