[Feature] Robust Function Call Parser #100

@SunYanbox

Current State
The current parse_func_call supports only a limited format wrapped in <func_call>, and its parsing logic is interleaved with warning and deprecation messages. It cannot handle the malformed call formats that LLMs commonly produce (for example JSON style, Python dict style, Markdown code blocks, nested tags, and attribute/value pairs).

Goal
Extract parameter parsing into an independent system that robustly handles multiple input formats, improving both the success rate of LLM function calls and the maintainability of the parser.

Supported Formats (Proposed)

  1. Standard XML Tags
    <func_name>get_weather</func_name><param name="city">北京</param>

  2. Attribute Shortcut
    <func_call name="get_weather"><param name="city" value="北京"/></func_call>

  3. JSON Object
    {"function": "get_weather", "arguments": {"city": "北京"}}

  4. Python Dict Literal
    get_weather(city='北京')
    {"city": "北京"} (attached in a code block)

  5. Markdown Code Block Wrapped
    {"name": "get_weather", "params": {"city": "北京"}} (inside a fenced code block)

  6. Free-Text Extraction
    Guess the function name and arguments from natural language (last-resort fallback strategy).
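
To make the proposed formats concrete, here is a minimal sketch of three strategy functions covering formats 1 to 5. The function names (parse_xml_call, parse_json_call, parse_python_call, strip_code_fence) and the convention of returning None on a miss are assumptions for illustration, not part of the existing code base. The regexes only handle the exact shapes listed above.

```python
import ast
import json
import re
from typing import Any, Dict, Optional, Tuple

Parsed = Optional[Tuple[str, Dict[str, Any]]]

# Built by repetition so the literal fence never appears inside this snippet.
FENCE = "`" * 3
_FENCE_RE = re.compile(FENCE + r"[\w-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)
_NAME_RE = re.compile(r"<func_name>\s*(.*?)\s*</func_name>", re.DOTALL)
_CALL_NAME_RE = re.compile(r'<func_call\s+name="([^"]+)"')
_PARAM_BODY_RE = re.compile(r'<param\s+name="([^"]+)"\s*>(.*?)</param>', re.DOTALL)
_PARAM_ATTR_RE = re.compile(r'<param\s+name="([^"]+)"\s+value="([^"]*)"\s*/>')


def strip_code_fence(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    match = _FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()


def parse_xml_call(text: str) -> Parsed:
    """Formats 1 and 2: <func_name> tags or <func_call name="..."> attributes."""
    match = _NAME_RE.search(text) or _CALL_NAME_RE.search(text)
    if match is None:
        return None
    kwargs: Dict[str, Any] = dict(_PARAM_BODY_RE.findall(text))
    kwargs.update(_PARAM_ATTR_RE.findall(text))
    return match.group(1), kwargs


def parse_json_call(text: str) -> Parsed:
    """Formats 3 and 5: a JSON object, optionally wrapped in a code fence."""
    try:
        data = json.loads(strip_code_fence(text))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Accept both key spellings shown in the proposal.
    name = data.get("function") or data.get("name")
    args = data.get("arguments", data.get("params", {}))
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    return name, args


def parse_python_call(text: str) -> Parsed:
    """Format 4: a Python-style call with literal keyword arguments."""
    try:
        node = ast.parse(strip_code_fence(text), mode="eval").body
    except (SyntaxError, ValueError):
        return None
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        return None
    if node.args:  # positional arguments are not handled in this sketch
        return None
    kwargs: Dict[str, Any] = {}
    for keyword in node.keywords:
        if keyword.arg is None:  # "**mapping" expansion
            return None
        try:
            # literal_eval never executes code; it only accepts literals.
            kwargs[keyword.arg] = ast.literal_eval(keyword.value)
        except ValueError:
            return None
    return node.func.id, kwargs
```

Using ast.literal_eval rather than eval keeps the Python-style path safe on untrusted model output. A production version would likely need a real XML parser and more lenient JSON handling.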

Design Highlights

  • Standalone RobustCallParser class with pluggable strategies that can be registered and removed
  • Strategies are tried in order; when one fails, the parser falls back to the next
  • Unified return value (func_name, kwargs, warnings); exceptions raised inside the parser are not propagated to the caller
  • Full warning collection to make LLM output easier to debug
  • No breaking change to the existing interface; the internal implementation can be replaced gradually
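
One way the design points above could fit together is sketched below. The issue fixes only the class name RobustCallParser and the (func_name, kwargs, warnings) return shape. The register and unregister method names, the Strategy signature, and the two toy strategies are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple

# A strategy returns (func_name, kwargs) on success or None when it does not match.
Strategy = Callable[[str], Optional[Tuple[str, Dict[str, Any]]]]
ParseResult = Tuple[Optional[str], Dict[str, Any], List[str]]


class RobustCallParser:
    """Tries registered strategies in order and never raises to the caller."""

    def __init__(self) -> None:
        self._strategies: List[Tuple[str, Strategy]] = []

    def register(self, name: str, strategy: Strategy) -> None:
        self._strategies.append((name, strategy))

    def unregister(self, name: str) -> None:
        self._strategies = [(n, s) for n, s in self._strategies if n != name]

    def parse(self, content: str) -> ParseResult:
        warnings: List[str] = []
        for name, strategy in self._strategies:
            try:
                result = strategy(content)
                if result is None:
                    warnings.append(f"{name}: no match")
                    continue
                func_name, kwargs = result
            except Exception as exc:  # contain strategy bugs as warnings
                warnings.append(f"{name}: raised {type(exc).__name__}: {exc}")
                continue
            return func_name, kwargs, warnings
        return None, {}, warnings


def json_strategy(text: str) -> Optional[Tuple[str, Dict[str, Any]]]:
    # Deliberately unguarded: json.loads may raise, and the parser records it.
    data = json.loads(text)
    return data["function"], data.get("arguments", {})


def bare_name_strategy(text: str) -> Optional[Tuple[str, Dict[str, Any]]]:
    # Toy strategy: treat a lone identifier as a call with no arguments.
    word = text.strip()
    return (word, {}) if word.isidentifier() else None
```

The warnings list records every strategy that was skipped before the successful one, so a caller can see why an earlier format did not apply even when parsing succeeds.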

Usage Example

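# content is the raw text returned by the LLM
parser = RobustCallParser()
func_name, kwargs, warns = parser.parse(content)
if not func_name:
    # No function name recovered: retry through the fallback path
    func_name, kwargs, warns = parser.parse_with_fallback(content)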
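
The issue does not define what parse_with_fallback does. One possible reading is that it ends with the free-text extraction listed as format 6. The sketch below shows what such a last-resort step might look like. The function name guess_call_from_text, the known_functions argument, and the key=value heuristic are all assumptions.

```python
import re
from typing import Any, Dict, Iterable, List, Optional, Tuple

# key=value or key: value, where value is quoted or a bare token.
_PAIR_RE = re.compile(r"""(\w+)\s*[=:]\s*(?:"([^"]*)"|'([^']*)'|([\w.-]*\w))""")


def guess_call_from_text(
    text: str, known_functions: Iterable[str]
) -> Tuple[Optional[str], Dict[str, Any], List[str]]:
    """Guess a call from prose: first known function name plus key/value pairs."""
    func_name = next(
        (name for name in known_functions
         if re.search(rf"\b{re.escape(name)}\b", text)),
        None,
    )
    if func_name is None:
        return None, {}, ["free_text: no known function name found"]

    kwargs: Dict[str, Any] = {}
    for key, double_quoted, single_quoted, bare in _PAIR_RE.findall(text):
        # All values stay strings; type coercion is out of scope for this sketch.
        kwargs[key] = double_quoted or single_quoted or bare

    warnings = [f"free_text: guessed call to {func_name!r}; verify before executing"]
    return func_name, kwargs, warnings
```

Restricting the guess to a list of known function names keeps the heuristic from inventing calls. The warning marks the result as a guess so the caller can decide whether to trust it.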

Open Questions

  • Should the parser handle function names and arguments that are split across lines or broken by stray newlines?
  • Which format takes priority when several are present in the same output (for example, both JSON and XML)?
  • Is a configuration option needed so users can choose their preferred parsing strategy?
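
As one possible answer to the last two questions (not a decision), priority could be expressed as a strategy order that users may override. The strategy names and the resolve_order helper below are hypothetical.

```python
from typing import List, Sequence

# Hypothetical default priority: structured formats first, guessing last.
DEFAULT_ORDER = ("xml", "json", "python_call", "free_text")


def resolve_order(
    preferred: Sequence[str], available: Sequence[str] = DEFAULT_ORDER
) -> List[str]:
    """Put the user's preferred strategies first; keep the rest in default order."""
    known = set(available)
    head: List[str] = []
    for name in preferred:
        # Silently skip unknown names and duplicates.
        if name in known and name not in head:
            head.append(name)
    tail = [name for name in available if name not in head]
    return head + tail
```

With this shape, a conflict such as JSON and XML appearing together is resolved by whichever strategy comes first in the resolved order.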

Implementation Phases

  1. Design the parsing strategy interface and registration mechanism
  2. Implement parsers for common malformed formats
  3. Add fallback and compositional parsing logic
  4. Write unit tests covering a variety of real-world LLM outputs
  5. Smoothly migrate the original parse_func_call so that it calls the new system
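
Phase 5 could be done with a thin wrapper around the new parser. The sketch below assumes the legacy parse_func_call returns a (func_name, kwargs) pair, which the issue does not state. It also assumes the collected warnings are forwarded to the logging module. The factory name make_legacy_parse_func_call is hypothetical.

```python
import logging
from typing import Any, Callable, Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)

ParseResult = Tuple[Optional[str], Dict[str, Any], List[str]]
LegacyResult = Tuple[Optional[str], Dict[str, Any]]


def make_legacy_parse_func_call(
    parse: Callable[[str], ParseResult],
) -> Callable[[str], LegacyResult]:
    """Wrap a new-style parse callable behind the assumed legacy signature."""

    def parse_func_call(content: str) -> LegacyResult:
        func_name, kwargs, warns = parse(content)
        # Assumed policy: surface parser warnings through logging.
        for message in warns:
            logger.warning("parse_func_call: %s", message)
        return func_name, kwargs

    return parse_func_call
```

Existing callers keep calling parse_func_call unchanged, while new code can call the parser directly and inspect the warnings itself.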

Labels: enhancement
