feat: prompt-based tool calling for non-DeepSeek models via custom AP…#205

Open
liuliyucomputer wants to merge 1 commit into usewhale:main from liuliyucomputer:feat/prompt-based-tool-calling

Conversation

@liuliyucomputer

Summary

This PR adds prompt-based tool calling support for non-DeepSeek models (e.g., GPT series) connected through custom API endpoints or relay services. When a model does not support the native tools parameter, Whale automatically switches to instructing the model to output tool calls as JSON code blocks, then parses them locally.
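As a rough illustration of the idea above, the following Go sketch builds a tool-instruction message and places it ahead of the existing messages. It is not the code in this PR: the `ToolSpec` and `Message` types, the function names `buildToolPromptSketch` and `prependToolPrompt`, and the exact prompt wording are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolSpec is a hypothetical, simplified tool description for this sketch.
type ToolSpec struct {
	Name        string
	Description string
}

// Message is a hypothetical chat message with a role and text content.
type Message struct {
	Role    string
	Content string
}

// buildToolPromptSketch renders instructions asking the model to emit tool
// calls as JSON code blocks. The wording here is an assumption.
func buildToolPromptSketch(tools []ToolSpec) string {
	fence := strings.Repeat("`", 3) // avoids a literal code fence in source
	var b strings.Builder
	b.WriteString("You can call tools. To call one, reply with a JSON code block:\n")
	b.WriteString(fence + "json\n")
	b.WriteString(`{"tool_call": {"name": "<tool name>", "arguments": {}}}` + "\n")
	b.WriteString(fence + "\n")
	b.WriteString("Available tools:\n")
	for _, t := range tools {
		fmt.Fprintf(&b, "- %s: %s\n", t.Name, t.Description)
	}
	return b.String()
}

// prependToolPrompt returns a new slice whose first element is a system
// message carrying the tool instructions. With no tools, msgs is returned
// unchanged.
func prependToolPrompt(msgs []Message, tools []ToolSpec) []Message {
	if len(tools) == 0 {
		return msgs
	}
	out := make([]Message, 0, len(msgs)+1)
	out = append(out, Message{Role: "system", Content: buildToolPromptSketch(tools)})
	return append(out, msgs...)
}
```

Calling `prependToolPrompt` with an agent system prompt and a user message would yield three messages, with the tool instructions first and the original order preserved after them.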

Motivation

Currently, Whale is DeepSeek-only. While custom API endpoints are configurable via [api].base_url, non-DeepSeek models (like those behind API relays such as tokenshengsheng.com) fail to call tools because they don't support the DeepSeek-specific tool call protocol. This PR bridges that gap, allowing Whale to work with any /chat/completions-compatible endpoint regardless of native tool support.

Changes

internal/llm/deepseek/client.go (core changes)

  • buildToolSystemPrompt() — Generates a standalone system message instructing the model to output tool calls as JSON code blocks ({"tool_call": {"name": "...", "arguments": {...}}}). This message is prepended as the FIRST system message for maximum priority.

  • parseTextToolCalls() — Three-tier JSON parser that extracts tool calls from model text output:

    1. Standard format: {"tool_call": {"name": "...", "arguments": {...}}} (inside ```json blocks or inline)
    2. Alternative formats: {"command": "..."}, {"name": "...", "arguments": {...}}, {"tool": "...", "args": {...}}
    3. Smart extraction (extractAnyToolJSON): Character-by-character bracket-depth-tracking fallback parser
  • injectToolPrompt() — Injects tool instructions as a standalone first system message, giving it higher priority than the agent's conversational system prompt.

  • toTextToolMessages() — Formats conversation history for non-native-tool models, converting tool calls/results into natural language user messages to preserve full conversational context.
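The bracket-depth fallback and the standard `tool_call` format described above can be sketched roughly as follows. This is a minimal illustration and not the PR's `parseTextToolCalls()` or `extractAnyToolJSON`: the `ToolCall` type and the function names `extractJSONObjects` and `parseToolCallsSketch` are hypothetical, and only the first tier (the standard format) is handled.

```go
package main

import (
	"encoding/json"
)

// ToolCall is a hypothetical type for this sketch.
type ToolCall struct {
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}

// extractJSONObjects walks the text byte by byte, tracking brace depth, and
// returns every balanced top-level {...} span. Braces inside JSON string
// literals are ignored so a value like "a}b" does not end the object early.
func extractJSONObjects(text string) []string {
	var out []string
	depth, start := 0, -1
	inString, escaped := false, false
	for i := 0; i < len(text); i++ {
		c := text[i]
		if inString {
			switch {
			case escaped:
				escaped = false
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			// Quotes in surrounding prose (depth 0) are not JSON strings.
			if depth > 0 {
				inString = true
			}
		case '{':
			if depth == 0 {
				start = i
			}
			depth++
		case '}':
			if depth > 0 {
				depth--
				if depth == 0 {
					out = append(out, text[start:i+1])
				}
			}
		}
	}
	return out
}

// parseToolCallsSketch keeps only candidates that decode as
// {"tool_call": {"name": ..., "arguments": ...}} with a non-empty name.
func parseToolCallsSketch(text string) []ToolCall {
	var calls []ToolCall
	for _, candidate := range extractJSONObjects(text) {
		var wrapper struct {
			ToolCall *ToolCall `json:"tool_call"`
		}
		if err := json.Unmarshal([]byte(candidate), &wrapper); err != nil {
			continue // not valid JSON, e.g. braces used in ordinary prose
		}
		if wrapper.ToolCall != nil && wrapper.ToolCall.Name != "" {
			calls = append(calls, *wrapper.ToolCall)
		}
	}
	return calls
}
```

Scanning for balanced braces before decoding means the same code path covers both fenced and inline JSON, since the fence markers themselves are simply skipped. The alternative shapes listed in tier 2 would need additional decode attempts per candidate, which this sketch omits.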

internal/defaults/defaults.go

  • IsSupportedModel() now returns true for any custom model name (not just the hardcoded DeepSeek list), enabling custom endpoint usage.
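One way the relaxed check could look is sketched below. The contents of the hardcoded list in `internal/defaults/defaults.go` are not shown in this description, so the two model names here are placeholders, and rejecting an empty name is an assumption of the sketch rather than documented behavior. The function names are hypothetical.

```go
package main

import "strings"

// knownDeepSeekModels is an assumed stand-in for the hardcoded list; the real
// entries are not shown in this PR description.
var knownDeepSeekModels = map[string]bool{
	"deepseek-chat":     true,
	"deepseek-reasoner": true,
}

// isKnownDeepSeekModel reports whether name is in the assumed built-in list.
func isKnownDeepSeekModel(name string) bool {
	return knownDeepSeekModels[strings.ToLower(strings.TrimSpace(name))]
}

// isSupportedModelSketch accepts any non-empty model name, so custom
// endpoints can pass their own identifiers through.
func isSupportedModelSketch(name string) bool {
	return strings.TrimSpace(name) != ""
}
```

Keeping a separate known-model check would let callers still distinguish built-in DeepSeek models from custom names, for example when deciding whether to use native tool calling.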

Documentation

  • README.zh.md — Added API relay configuration example and prompt-based tool calling section
  • docs/configuration.md — Added "Custom API Endpoints and Non-DeepSeek Models" chapter

How It Works

…I endpoints

- Add buildToolSystemPrompt() to generate standalone tool instruction as first system message
- Add three-tier JSON parser (standard/alt/smart extraction) for model text output
- Add toTextToolMessages() to preserve conversation context for non-native-tool models
- Allow custom model names in IsSupportedModel() for API relay endpoints
- Document API relay configuration and prompt-based tool calling in README and docs
@liuliyucomputer
Author

My email is 1605010842@qq.com

@shayne-snap
Contributor

Hi @liuliyucomputer. Thanks for the PR.

For now, Whale will remain DeepSeek-only.

This is an intentional product and architecture boundary: Whale is optimized around DeepSeek’s cache behavior, native tool calling, reasoning controls, and cost model. See the README section here.

For third-party or unofficial DeepSeek-compatible endpoints, Whale already supports custom /chat/completions base URLs. The documented entry point is here.

So I do not think we should add prompt-based tool calling for non-DeepSeek models right now. If there are compatibility issues with DeepSeek-compatible endpoints, we should handle those within the existing custom endpoint path instead of turning Whale into a generic multi-model wrapper.


2 participants