Open
Labels: enhancement (New feature or request), frontend (`python -m dynamo.frontend` and `dynamo-run in=http|text|grpc`)
Description
Summary
Add tool-call parsing and reasoning parsing for the GLM-4.7 model to the Rust frontend component.
Background
GLM-4.7 (zai-org/GLM-4.7) requires specific parsers for tool calling and reasoning that are not currently implemented in Dynamo's Rust frontend. vLLM and SGLang support this model with --tool-call-parser glm47 and --reasoning-parser glm45.
Tool Call Parser (glm47)
Format (from tokenizer_config.json):
<tool_call>function_name
<arg_key>param1</arg_key>
<arg_value>value1</arg_value>
<arg_key>param2</arg_key>
<arg_value>value2</arg_value>
</tool_call>
Special tokens:
- <tool_call> / </tool_call> - Tool call boundaries
- <arg_key> / </arg_key> - Argument key wrapper
- <arg_value> / </arg_value> - Argument value wrapper
- <tool_response> / </tool_response> - Tool response wrapper (input side)
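To make the expected behavior concrete, here is a minimal, non-streaming sketch of extracting calls from the format above. It is only an illustration under stated assumptions: `ToolCall`, `between`, and `parse_glm47_tool_calls` are hypothetical names and not existing Dynamo APIs. Argument values are kept as raw strings, with no JSON or type coercion. A real parser would also need to handle streaming and partial tokens.

```rust
/// Hypothetical container for one parsed tool call (not a Dynamo type).
#[derive(Debug, PartialEq)]
struct ToolCall {
    name: String,
    /// (key, value) pairs in the order they appeared; values are raw strings.
    args: Vec<(String, String)>,
}

/// Find the first `open ... close` pair in `s`.
/// Returns the text between the tags and the remainder after `close`.
fn between<'a>(s: &'a str, open: &str, close: &str) -> Option<(&'a str, &'a str)> {
    let start = s.find(open)? + open.len();
    let end = start + s[start..].find(close)?;
    Some((&s[start..end], &s[end + close.len()..]))
}

/// Parse every complete <tool_call>...</tool_call> block in `text`.
/// Assumption: the full model output is available (no streaming).
fn parse_glm47_tool_calls(text: &str) -> Vec<ToolCall> {
    let mut calls = Vec::new();
    let mut rest = text;
    while let Some((body, after)) = between(rest, "<tool_call>", "</tool_call>") {
        // The function name is everything before the first <arg_key>.
        let name_end = body.find("<arg_key>").unwrap_or(body.len());
        let name = body[..name_end].trim().to_string();

        let mut args = Vec::new();
        let mut arg_rest = &body[name_end..];
        while let Some((key, after_key)) = between(arg_rest, "<arg_key>", "</arg_key>") {
            // Each key is expected to be followed by exactly one value.
            let Some((value, after_value)) =
                between(after_key, "<arg_value>", "</arg_value>")
            else {
                break; // key without a value: stop parsing this call's arguments
            };
            args.push((key.trim().to_string(), value.to_string()));
            arg_rest = after_value;
        }

        calls.push(ToolCall { name, args });
        rest = after;
    }
    calls
}
```

Text outside the `<tool_call>` blocks is ignored by this sketch. A frontend implementation would presumably return it as normal content.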
Reasoning Parser (glm45)
Format (from chat_template.jinja):
<|assistant|><think>reasoning content</think>actual response
Uses standard <think>...</think> tags for reasoning blocks.
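As a rough illustration of the split this parser needs to perform, here is a minimal sketch for a complete (non-streamed) output. `split_reasoning` is a hypothetical name, not an existing Dynamo function. Treating an unterminated `<think>` block as all reasoning is an assumption of this sketch, not something the chat template specifies.

```rust
/// Split a complete model output into (reasoning, response).
/// Hypothetical helper for illustration only.
fn split_reasoning(output: &str) -> (Option<&str>, &str) {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    // No <think> tag: the whole output is the response.
    let Some(open_at) = output.find(OPEN) else {
        return (None, output);
    };
    let start = open_at + OPEN.len();

    match output[start..].find(CLOSE) {
        Some(rel) => {
            let end = start + rel;
            // Anything before <think> (e.g. a role marker) is dropped here.
            (Some(&output[start..end]), &output[end + CLOSE.len()..])
        }
        // Assumption: an unterminated block is treated as reasoning only.
        None => (Some(&output[start..]), ""),
    }
}
```

For the template example above, this yields `Some("reasoning content")` as the reasoning and `"actual response"` as the response.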
References
- HuggingFace Model: https://huggingface.co/zai-org/GLM-4.7
- Chat Template: https://huggingface.co/zai-org/GLM-4.7/blob/main/chat_template.jinja
- vLLM Tool Parser PR: GLM-4.7 Tool Parser and Doc Update vllm-project/vllm#30876
- vLLM GLM Guide: https://docs.vllm.ai/projects/recipes/en/latest/GLM/GLM.html