[TRTLLM-10077][feat] Add 'auto' option for tool and reasoning parsers (#12104)
JunyiXu-nv wants to merge 1 commit into NVIDIA:main
Conversation
Add automatic parser selection for --tool_parser and --reasoning_parser based on the model's HF config model_type.

Tool parser auto-detection mapping:
- qwen2/qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
- deepseek_v3 -> deepseek_v3
- deepseek_v32 -> deepseek_v32
- kimi_k2/kimi_k25 -> kimi_k2
- glm4 -> glm4

Reasoning parser auto-detection mapping:
- qwen3/qwen3_moe/qwen3_5/qwen3_5_moe/qwen3_next -> qwen3
- deepseek_v3/deepseek_v32 -> deepseek-r1 (only if model name contains 'R1')
- nemotron_h -> nano-v3

For unrecognized models, a clear error is displayed listing the supported model types and available parsers.

Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Made-with: Cursor
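The two mappings above can be read as plain dictionaries keyed by the HF `model_type`. The sketch below mirrors the description (the dictionary names `MODEL_TYPE_TO_TOOL_PARSER` and `MODEL_TYPE_TO_REASONING_PARSER` appear in the review comments further down, but the code itself is illustrative, not the PR's implementation):

```python
from typing import Optional

# Illustrative mapping tables, transcribed from the PR description.
MODEL_TYPE_TO_TOOL_PARSER = {
    "qwen2": "qwen3", "qwen3": "qwen3", "qwen3_moe": "qwen3",
    "qwen3_5": "qwen3", "qwen3_5_moe": "qwen3", "qwen3_next": "qwen3",
    "deepseek_v3": "deepseek_v3", "deepseek_v32": "deepseek_v32",
    "kimi_k2": "kimi_k2", "kimi_k25": "kimi_k2",
    "glm4": "glm4",
}

MODEL_TYPE_TO_REASONING_PARSER = {
    "qwen3": "qwen3", "qwen3_moe": "qwen3", "qwen3_5": "qwen3",
    "qwen3_5_moe": "qwen3", "qwen3_next": "qwen3",
    "deepseek_v3": "deepseek-r1", "deepseek_v32": "deepseek-r1",
    "nemotron_h": "nano-v3",
}

def resolve_reasoning_parser(model_type: str, model_name: str) -> Optional[str]:
    """Illustrative resolver: look up the parser, applying the DeepSeek rule."""
    parser = MODEL_TYPE_TO_REASONING_PARSER.get(model_type)
    # Per the description, DeepSeek V3/V3.2 map to deepseek-r1 only when the
    # model name contains 'R1' (matched case-insensitively here).
    if parser == "deepseek-r1" and "r1" not in model_name.lower():
        return None
    return parser

print(resolve_reasoning_parser("deepseek_v3", "DeepSeek-R1"))  # deepseek-r1
```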
📝 Walkthrough

The changes introduce auto-detection support for tool and reasoning parsers in the serve command by adding resolver functions that read model configuration files, map model types to appropriate parsers, and integrate these results into the CLI workflow with "auto" as a new option.
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant CLI as Serve Command<br/>(serve.py)
    participant ReasoningResolver as Reasoning Resolver<br/>(reasoning_parser.py)
    participant ToolResolver as Tool Resolver<br/>(tool_parser_factory.py)
    participant ConfigReader as Model Config
    participant Server as Server
    User->>CLI: Serve with reasoning_parser="auto"
    CLI->>ReasoningResolver: Call resolve_auto_reasoning_parser(model)
    ReasoningResolver->>ConfigReader: Read config.json
    ConfigReader-->>ReasoningResolver: model_type value
    ReasoningResolver->>ReasoningResolver: Map model_type to parser<br/>(with DeepSeek checks)
    ReasoningResolver-->>CLI: Return parser name or None
    CLI->>CLI: Log auto-detection result
    User->>CLI: Serve with tool_parser="auto"
    CLI->>ToolResolver: Call resolve_auto_tool_parser(model)
    ToolResolver->>ConfigReader: Read config.json
    ConfigReader-->>ToolResolver: model_type value
    ToolResolver->>ToolResolver: Map model_type to parser
    ToolResolver-->>CLI: Return parser name or None
    CLI->>CLI: Log auto-detection result
    CLI->>Server: Configure with detected parsers
    Server-->>User: Server ready
```
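The control flow in the diagram can be sketched in a few lines. The resolver bodies below are placeholder string checks, not the PR's config.json-based logic; only the function names (`resolve_auto_tool_parser`, `resolve_auto_reasoning_parser`) come from the diagram:

```python
from typing import Optional, Tuple

# Placeholder resolvers: the real ones read model_type from config.json.
def resolve_auto_reasoning_parser(model: str) -> Optional[str]:
    return "qwen3" if "qwen3" in model.lower() else None

def resolve_auto_tool_parser(model: str) -> Optional[str]:
    return "qwen3" if "qwen3" in model.lower() else None

def configure_parsers(model: str, tool_parser: Optional[str],
                      reasoning_parser: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    # Mirrors the diagram: "auto" is resolved to a concrete parser name
    # (or None) and logged before the server is configured.
    if reasoning_parser == "auto":
        reasoning_parser = resolve_auto_reasoning_parser(model)
        print(f"Auto-detected reasoning parser: {reasoning_parser}")
    if tool_parser == "auto":
        tool_parser = resolve_auto_tool_parser(model)
        print(f"Auto-detected tool parser: {tool_parser}")
    return tool_parser, reasoning_parser

print(configure_parsers("Qwen3-8B", "auto", "auto"))  # ('qwen3', 'qwen3')
```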
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
🧹 Nitpick comments (5)
tensorrt_llm/serve/tool_parser/tool_parser_factory.py (2)
29-39: Consider extracting shared config-loading logic.

Both `resolve_auto_tool_parser` and `resolve_auto_reasoning_parser` (in `reasoning_parser.py`) share identical config-loading logic. Consider extracting a shared helper function (e.g., `get_model_type_from_config(model: str) -> Optional[str]`) to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/serve/tool_parser/tool_parser_factory.py` around lines 29 - 39, Both resolve_auto_tool_parser and resolve_auto_reasoning_parser duplicate the same config-loading logic; extract that into a shared helper (e.g., get_model_type_from_config(model: str) -> Optional[str]) that opens Path(model)/"config.json", returns None if missing or on parse errors, and returns config.get("model_type",""); then update resolve_auto_tool_parser and resolve_auto_reasoning_parser to call get_model_type_from_config and use MODEL_TYPE_TO_TOOL_PARSER.get(model_type) (or the equivalent lookup) so the file-IO and JSON parsing are centralized.
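The suggested helper might look like the sketch below. It follows the assumptions in the prompt above (`get_model_type_from_config` is the reviewer's proposed name; the behavior — return None on a missing or malformed config — is taken from the prompt, not from the PR's code):

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def get_model_type_from_config(model: str) -> Optional[str]:
    """Return model_type from a local checkpoint's config.json, else None."""
    config_path = Path(model) / "config.json"
    if not config_path.is_file():
        return None
    try:
        with open(config_path, encoding="utf-8") as f:
            return json.load(f).get("model_type", "")
    except json.JSONDecodeError:
        return None

# Both resolvers could then reduce to a dictionary lookup, e.g.
# MODEL_TYPE_TO_TOOL_PARSER.get(get_model_type_from_config(model) or "")
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "config.json").write_text('{"model_type": "qwen3"}', encoding="utf-8")
    print(get_model_type_from_config(d))  # qwen3
```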
35-36: Add explicit encoding and handle JSON parsing errors.

Same suggestion as for `resolve_auto_reasoning_parser`: specify `encoding="utf-8"` and handle `json.JSONDecodeError`.

♻️ Proposed fix

```diff
-        with open(config_path) as f:
-            config = json.load(f)
+        with open(config_path, encoding="utf-8") as f:
+            try:
+                config = json.load(f)
+            except json.JSONDecodeError:
+                return None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/serve/tool_parser/tool_parser_factory.py` around lines 35 - 36, The code that reads the JSON config using open(config_path) and json.load lacks explicit encoding and JSON parsing error handling; update the file-reading in the function that loads the tool parser (the block using config_path and json.load) to open(config_path, encoding="utf-8") and wrap the json.load call in a try/except catching json.JSONDecodeError, then raise or log a clearer error (including config_path and the decode error) so callers (or resolve_auto_reasoning_parser) get actionable feedback.

tensorrt_llm/llmapi/reasoning_parser.py (1)
132-133: Add explicit encoding and handle JSON parsing errors.

The `open()` call should specify `encoding="utf-8"` for consistency across platforms. Additionally, consider handling `json.JSONDecodeError` to provide a clearer error message if the config file is malformed.

♻️ Proposed fix

```diff
-    with open(config_path) as f:
-        config = json.load(f)
+    with open(config_path, encoding="utf-8") as f:
+        try:
+            config = json.load(f)
+        except json.JSONDecodeError:
+            return None
```

As per coding guidelines: "When using try-except blocks in Python, limit the except to the smallest set of errors possible."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/llmapi/reasoning_parser.py` around lines 132 - 133, Open the config file with an explicit encoding by changing the open call that uses config_path to open(config_path, encoding="utf-8"), and wrap the json.load(f) call in a try/except that catches json.JSONDecodeError (and optionally FileNotFoundError) to log or raise a clearer error about a malformed or missing config; update the error handling near the json.load usage so the exception message references config_path for context and does not blanket-catch broad exceptions.

tensorrt_llm/commands/serve.py (2)
770-796: Error message lists may drift from source of truth.

The supported model types listed in the error messages are hardcoded strings. If `MODEL_TYPE_TO_TOOL_PARSER` or `MODEL_TYPE_TO_REASONING_PARSER` are updated later, these error messages could become stale. Consider generating the list dynamically from the mapping keys:
♻️ Proposed fix for tool_parser error message

```diff
+from tensorrt_llm.serve.tool_parser.tool_parser_factory import MODEL_TYPE_TO_TOOL_PARSER
+from tensorrt_llm.llmapi.reasoning_parser import MODEL_TYPE_TO_REASONING_PARSER
```

Then in the error handling:

```diff
     raise click.BadParameter(
         f"Cannot auto-detect tool parser for model '{model}'. "
-        f"Supported model types for auto-detection: qwen2, qwen3, "
-        f"qwen3_moe, qwen3_5, qwen3_5_moe, qwen3_next, deepseek_v3, "
-        f"deepseek_v32, kimi_k2, kimi_k25, glm4. "
+        f"Supported model types for auto-detection: {', '.join(MODEL_TYPE_TO_TOOL_PARSER.keys())}. "
         f"Please specify a parser explicitly: "
         f"{list(ToolParserFactory.parsers.keys())}",
         param_hint="--tool_parser")
```

```diff
     raise click.BadParameter(
         f"Cannot auto-detect reasoning parser for model '{model}'. "
-        f"Supported model types for auto-detection: qwen3, qwen3_moe, "
-        f"qwen3_5, qwen3_5_moe, qwen3_next, deepseek_v3 (R1 only), "
-        f"deepseek_v32 (R1 only), nemotron_h. "
+        f"Supported model types for auto-detection: {', '.join(MODEL_TYPE_TO_REASONING_PARSER.keys())} "
+        f"(deepseek_v3/v32 require 'R1' in model name). "
         f"Please specify a parser explicitly: "
         f"{list(ReasoningParserFactory.parsers.keys())}",
         param_hint="--reasoning_parser")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/commands/serve.py` around lines 770 - 796, The error strings list hardcoded supported model types and can drift; update the two bad-parameter messages to generate the supported-model list dynamically from the canonical mappings instead of hardcoded text: for the tool parser error use the keys from MODEL_TYPE_TO_TOOL_PARSER (or fall back to list(ToolParserFactory.parsers.keys())) when building the message in the resolve_auto_tool_parser failure branch (function/section around resolve_auto_tool_parser and ToolParserFactory.parsers), and similarly for the reasoning parser error use MODEL_TYPE_TO_REASONING_PARSER keys (or fall back to list(ReasoningParserFactory.parsers.keys())) in the resolve_auto_reasoning_parser failure branch (function/section around resolve_auto_reasoning_parser and ReasoningParserFactory.parsers); keep the rest of the exception text the same but replace the hardcoded lists with these dynamically generated lists.
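A pure-Python sketch of the dynamic-message idea suggested above. The mapping here is a trimmed placeholder, and the real code would raise `click.BadParameter` with this text rather than return it:

```python
# Trimmed placeholder mapping for illustration only.
MODEL_TYPE_TO_TOOL_PARSER = {"qwen3": "qwen3", "deepseek_v3": "deepseek_v3", "glm4": "glm4"}

def unsupported_model_message(model: str) -> str:
    # Building the list from the mapping keys keeps the message in sync
    # with the single source of truth, so it cannot go stale.
    supported = ", ".join(MODEL_TYPE_TO_TOOL_PARSER.keys())
    return (f"Cannot auto-detect tool parser for model '{model}'. "
            f"Supported model types for auto-detection: {supported}.")

print(unsupported_model_message("my-model"))
```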
665-665: Consider using iterable unpacking for cleaner syntax.

Static analysis suggests using iterable unpacking instead of list concatenation.

♻️ Proposed fix

```diff
 @click.option(
     "--reasoning_parser",
-    type=click.Choice(["auto"] + list(ReasoningParserFactory.parsers.keys())),
+    type=click.Choice(["auto", *ReasoningParserFactory.parsers.keys()]),
     default=None,
```

```diff
 @click.option(
     "--tool_parser",
-    type=click.Choice(["auto"] + list(ToolParserFactory.parsers.keys())),
+    type=click.Choice(["auto", *ToolParserFactory.parsers.keys()]),
     default=None,
```

Also applies to: 673-673
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/commands/serve.py` at line 665, Replace the list concatenation used in the click.Choice call with iterable unpacking: instead of building choices via ["auto"] + list(ReasoningParserFactory.parsers.keys()), pass a choices iterable that prefixes the literal "auto" and then unpacks ReasoningParserFactory.parsers.keys() into the Choice constructor (apply the same change to the other occurrence around the code marked at lines 673); update the expression where type=click.Choice(...) is used so it constructs the choices via unpacking rather than concatenation.
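The two spellings are equivalent; a quick check with a placeholder registry (the dict here stands in for the factory's `parsers` attribute):

```python
# Placeholder registry standing in for ReasoningParserFactory.parsers.
parsers = {"qwen3": None, "deepseek_v3": None}

# List concatenation vs. iterable unpacking: same result, but unpacking
# avoids the intermediate list() call.
choices_concat = ["auto"] + list(parsers.keys())
choices_unpack = ["auto", *parsers.keys()]
assert choices_concat == choices_unpack
print(choices_unpack)  # ['auto', 'qwen3', 'deepseek_v3']
```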
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b8d21a9-58e3-4d1a-a0d4-bb79fb2a7b0b
📒 Files selected for processing (3)
- tensorrt_llm/commands/serve.py
- tensorrt_llm/llmapi/reasoning_parser.py
- tensorrt_llm/serve/tool_parser/tool_parser_factory.py
/bot run

PR_Github #38561 [ run ] triggered by Bot. Commit:

PR_Github #38561 [ run ] completed with state