[Feature] 支持动态配置工具调用格式提示词 / Support dynamic configuration of tool call format prompts

**Title / 标题**  
[Feature] 支持动态配置工具调用格式提示词，以测试模型在不同提示下的工具调用能力  
[Feature] Support dynamic configuration of tool call format prompts to test model’s tool-calling ability under different prompt variations  

**Description / 描述**  
Currently, the tool call format prompt for LLMs is hardcoded. To systematically evaluate how different prompt formulations affect a model’s ability to correctly invoke tools, we need a system that allows dynamic prompt configuration. This feature will support testing various tool-calling prompt templates without code changes.  
当前大模型工具调用的格式提示词是硬编码的。为了系统性地评估不同提示词写法对模型正确调用工具能力的影响，需要一个支持动态配置提示词的系统。该功能将支持在不修改代码的情况下测试不同的工具调用提示模板。  

**Motivation / 动机**  
Different prompt structures (e.g., XML-style, JSON-style, natural language) can significantly impact whether a model correctly calls tools with required parameters. A configurable approach enables efficient A/B testing, prompt engineering, and regression testing across model versions.  
不同的提示词结构（如 XML 风格、JSON 风格、自然语言）会显著影响模型是否能正确调用工具并携带所需参数。可配置的方式能够支持高效的 A/B 测试、提示词工程以及跨模型版本的回归测试。  

**Proposed Solution / 建议方案**  
- Add a configuration interface to input/select tool call format prompts as a string template.  
- Support placeholders like `{{tools}}`, `{{tool_name}}`, `{{parameters}}` for dynamic content injection.  
- The system will inject the configured prompt into the model’s system/user message before inference.  
- Store prompt templates in configuration files or a simple admin UI for non‑code changes.  
- 增加一个配置接口，用于输入/选择工具调用格式提示词（字符串模板）。  
- 支持占位符如 `{{tools}}`、`{{tool_name}}`、`{{parameters}}` 用于动态内容注入。  
- 系统在推理前会将配置好的提示词注入到模型的 system/user 消息中。  
- 将提示词模板保存在配置文件或一个简单的管理界面中，无需改代码。  

**Acceptance Criteria / 验收标准**  
- The system reads tool‑call prompt template from a configurable source (e.g., env, JSON, DB).  
- Changing the template dynamically changes model tool‑calling behavior without restart.  
- Provide at least two example templates (e.g., structured JSON, plain instruction) for testing.  
- Logs show exactly which prompt template was used for each call.  
- 系统从可配置来源（如环境变量、JSON、数据库）读取工具调用提示词模板。  
- 动态修改模板后，无需重启即可改变模型的工具调用行为。  
- 提供至少两个示例模板（如结构化 JSON、纯文本指令）用于测试。  
- 日志中明确记录每次调用使用了哪个提示词模板。  

**Testing Scope / 测试范围**  
- Verify correct tool invocation (name + parameters) under different prompt templates.  
- Check edge cases: malformed template, empty template, missing placeholders.  
- 验证在不同提示词模板下工具调用（名称+参数）的正确性。  
- 检查边界情况：格式错误的模板、空模板、缺少占位符。  

**Additional Context / 补充说明**  
This is a foundational feature for automated evaluation pipelines. Not a bug – the current hardcoded prompts work but lack extensibility for experimentation.  
这是自动化评估流水线的基础功能。不是缺陷——当前硬编码的提示词可以工作，但缺少用于实验的可扩展性。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] 支持动态配置工具调用格式提示词 / Support dynamic configuration of tool call format prompts #102

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[Feature] 支持动态配置工具调用格式提示词 / Support dynamic configuration of tool call format prompts #102

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions