🚀 The feature, motivation and pitch
When using tool calling with LLMs, each tool invocation requires a separate model turn, leading to:
- High latency for multi-tool workflows
- Increased token consumption, since the growing conversation context is re-sent on every turn
- Limited ability to apply complex logic (filtering, aggregation, conditionals) across tool results
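The token cost above can be sketched with a back-of-envelope calculation. All numbers below are illustrative assumptions, not measurements; the point is only that prompt tokens grow superlinearly when each tool call takes its own turn and the full conversation is re-sent each time:

```python
# Back-of-envelope sketch of per-turn tool calling cost.
# All constants are assumed values for illustration only.
BASE_CONTEXT = 4000   # assumed tokens in the initial prompt
RESULT_TOKENS = 500   # assumed tokens per tool result appended to history
NUM_CALLS = 10        # number of sequential tool calls

# Turn i must re-send the base prompt plus all i prior tool results,
# so total prompt tokens across the workflow grow superlinearly.
prompt_tokens = sum(
    BASE_CONTEXT + i * RESULT_TOKENS
    for i in range(NUM_CALLS)
)
print(prompt_tokens)
```

With these assumed numbers, ten sequential calls consume 62,500 prompt tokens, versus 4,000 if the orchestration happened outside the model context.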
xref:
- Anthropic's "Advanced tool use": https://www.anthropic.com/engineering/advanced-tool-use
- Anthropic's "Code execution with MCP": https://www.anthropic.com/engineering/code-execution-with-mcp
- MCP Server with Code Execution https://github.com/harche/ProDisco
Proposed Feature: Programmatic Tool Calling (PTC) allows models to write Python code that orchestrates multiple tool calls within a secure sandbox. Instead of returning individual tool calls, the model generates code that:
- Calls multiple tools programmatically
- Processes results with Python logic (loops, filters, aggregations)
- Returns only the final result to the model context
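As a rough illustration, the code a model might emit under PTC could look like the sketch below. The `call_tool` helper and the `list_orders` tool are hypothetical names invented for this example; a real sandbox would provide its own bridge to the MCP/tool server:

```python
# Hypothetical sketch of model-generated PTC code.
# `call_tool` is an assumed sandbox-provided helper, not a real API;
# here it is stubbed with canned data so the sketch is self-contained.
def call_tool(name, **kwargs):
    """Stand-in for the sandbox's tool-invocation bridge."""
    fake_results = {
        "list_orders": [
            {"id": 1, "total": 120.0, "region": "EU"},
            {"id": 2, "total": 80.0, "region": "US"},
            {"id": 3, "total": 200.0, "region": "EU"},
        ],
    }
    return fake_results.get(name, [])

# Orchestrate tool calls and reduce the results locally with plain
# Python, so only the final aggregate re-enters the model context.
orders = call_tool("list_orders")
eu_total = sum(o["total"] for o in orders if o["region"] == "EU")
print(eu_total)  # only this single value is returned to the model
```

The intermediate tool results (here, the full order list) stay inside the sandbox; the model only sees the final number, which is where the latency and token savings come from.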
Alternatives
Multiple sequential tool calls (the current approach) work, but incur high latency and token cost for complex workflows.
(copied from vllm-project/vllm#29573)