feat: prompt-based tool calling for non-DeepSeek models via custom AP…#205
Open
liuliyucomputer wants to merge 1 commit into
…I endpoints
- Add `buildToolSystemPrompt()` to generate standalone tool instruction as first system message
- Add three-tier JSON parser (standard/alt/smart extraction) for model text output
- Add `toTextToolMessages()` to preserve conversation context for non-native-tool models
- Allow custom model names in `IsSupportedModel()` for API relay endpoints
- Document API relay configuration and prompt-based tool calling in README and docs
Hi @liuliyucomputer. Thanks for the PR.
For now, Whale will remain DeepSeek-only. This is an intentional product and architecture boundary, mainly because Whale is optimized around DeepSeek’s cache behavior, native tool calling, reasoning controls, and cost model. See the relevant README section.
For third-party or unofficial DeepSeek-compatible endpoints, Whale already supports custom endpoint configuration.
So I do not think we should add prompt-based tool calling for non-DeepSeek models right now. If there are compatibility issues with DeepSeek-compatible endpoints, we should handle those within the existing custom endpoint path instead of turning Whale into a generic multi-model wrapper.
Summary
This PR adds prompt-based tool calling support for non-DeepSeek models (e.g., GPT series) connected through custom API endpoints or relay services. When a model does not support the native `tools` parameter, Whale automatically switches to instructing the model to output tool calls as JSON code blocks, then parses them locally.
Motivation
Currently, Whale is DeepSeek-only. While custom API endpoints are configurable via `[api].base_url`, non-DeepSeek models (like those behind API relays such as tokenshengsheng.com) fail to call tools because they don't support the DeepSeek-specific tool call protocol. This PR bridges that gap, allowing Whale to work with any `/chat/completions`-compatible endpoint regardless of native tool support.
Changes
`internal/llm/deepseek/client.go` (core changes)
- `buildToolSystemPrompt()`: Generates a standalone system message instructing the model to output tool calls as JSON code blocks (`{"tool_call": {"name": "...", "arguments": {...}}}`). This message is prepended as the first system message for maximum priority.
- `parseTextToolCalls()`: Three-tier JSON parser that extracts tool calls from model text output:
  - Standard format: `{"tool_call": {"name": "...", "arguments": {...}}}` (inside ```json blocks or inline)
  - Alternative formats: `{"command": "..."}`, `{"name": "...", "arguments": {...}}`, `{"tool": "...", "args": {...}}`
  - Smart extraction (`extractAnyToolJSON`): character-by-character, bracket-depth-tracking fallback parser
- `injectToolPrompt()`: Injects tool instructions as a standalone first system message, giving it higher priority than the agent's conversational system prompt.
- `toTextToolMessages()`: Formats conversation history for non-native-tool models, converting tool calls and results into natural-language user messages to preserve full conversational context.
`internal/defaults/defaults.go`
- `IsSupportedModel()` now returns `true` for any custom model name (not just the hardcoded DeepSeek list), enabling custom endpoint usage.
Documentation
- `README.zh.md`: Added API relay configuration example and prompt-based tool calling section
- `docs/configuration.md`: Added "Custom API Endpoints and Non-DeepSeek Models" chapter
How It Works
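
The three-tier parsing idea listed under Changes could look roughly like the Go sketch below. It is an illustration only, not the code from this PR: `ToolCall`, `decodeCandidate`, `balancedObjects`, and `parseTextToolCallsSketch` are hypothetical names. The `{"command": "..."}` shape is left out because the description does not say which tool it maps to.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// ToolCall is a hypothetical normalized result; Whale's real type may differ.
type ToolCall struct {
	Name      string
	Arguments json.RawMessage
}

// fencedJSON captures the body of each ```json fenced block.
var fencedJSON = regexp.MustCompile("(?s)```json\\s*(.*?)```")

// decodeCandidate tries the standard {"tool_call": ...} envelope first,
// then the alternative {"name","arguments"} and {"tool","args"} shapes.
func decodeCandidate(raw string) (ToolCall, bool) {
	var std struct {
		ToolCall *struct {
			Name      string          `json:"name"`
			Arguments json.RawMessage `json:"arguments"`
		} `json:"tool_call"`
	}
	if json.Unmarshal([]byte(raw), &std) == nil && std.ToolCall != nil && std.ToolCall.Name != "" {
		return ToolCall{Name: std.ToolCall.Name, Arguments: std.ToolCall.Arguments}, true
	}
	var alt struct {
		Name      string          `json:"name"`
		Arguments json.RawMessage `json:"arguments"`
		Tool      string          `json:"tool"`
		Args      json.RawMessage `json:"args"`
	}
	if json.Unmarshal([]byte(raw), &alt) != nil {
		return ToolCall{}, false
	}
	switch {
	case alt.Name != "":
		return ToolCall{Name: alt.Name, Arguments: alt.Arguments}, true
	case alt.Tool != "":
		return ToolCall{Name: alt.Tool, Arguments: alt.Args}, true
	}
	return ToolCall{}, false
}

// balancedObjects walks the text byte by byte and returns every top-level
// {...} span, tracking brace depth and ignoring braces inside JSON strings.
func balancedObjects(text string) []string {
	var out []string
	depth, start := 0, -1
	inString, escaped := false, false
	for i := 0; i < len(text); i++ {
		c := text[i]
		if inString {
			switch {
			case escaped:
				escaped = false
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			if depth > 0 { // quotes in surrounding prose are not JSON strings
				inString = true
			}
		case '{':
			if depth == 0 {
				start = i
			}
			depth++
		case '}':
			if depth > 0 {
				depth--
				if depth == 0 {
					out = append(out, text[start:i+1])
				}
			}
		}
	}
	return out
}

// parseTextToolCallsSketch checks fenced blocks first and falls back to the
// bracket-depth scan over the whole text when no fenced block yields a call.
func parseTextToolCallsSketch(text string) []ToolCall {
	var calls []ToolCall
	for _, m := range fencedJSON.FindAllStringSubmatch(text, -1) {
		if c, ok := decodeCandidate(m[1]); ok {
			calls = append(calls, c)
		}
	}
	if len(calls) > 0 {
		return calls
	}
	for _, obj := range balancedObjects(text) {
		if c, ok := decodeCandidate(obj); ok {
			calls = append(calls, c)
		}
	}
	return calls
}

func main() {
	reply := `I will search now: {"tool": "grep", "args": {"pattern": "{x}"}}`
	for _, c := range parseTextToolCallsSketch(reply) {
		fmt.Printf("%s %s\n", c.Name, c.Arguments)
	}
}
```

Tracking string state only inside an open object keeps stray quotation marks in the model's prose from hiding a later JSON object, which is the main reason to prefer a depth scan over a single regular expression for the fallback tier.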
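
The prompt-injection and history-conversion steps can be sketched in the same hedged way. The `Message` and `Tool` types, the prompt wording, and the function names `toolPrompt` and `withToolPrompt` are assumptions made for illustration; the PR's `buildToolSystemPrompt()`, `injectToolPrompt()`, and `toTextToolMessages()` may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Message and Tool are hypothetical minimal types for this sketch.
type Message struct {
	Role    string // "system", "user", "assistant", or "tool"
	Content string
}

type Tool struct {
	Name        string
	Description string
}

// toolPrompt states the JSON envelope the model should emit and lists the tools.
func toolPrompt(tools []Tool) string {
	var b strings.Builder
	b.WriteString("To call a tool, reply with a ```json block containing:\n")
	b.WriteString(`{"tool_call": {"name": "<tool>", "arguments": {...}}}` + "\n")
	b.WriteString("Available tools:\n")
	for _, t := range tools {
		fmt.Fprintf(&b, "- %s: %s\n", t.Name, t.Description)
	}
	return b.String()
}

// withToolPrompt prepends the tool instructions as the first system message
// and rewrites "tool" role messages as plain user text, so that a model
// without native tool support still sees the earlier tool results.
func withToolPrompt(history []Message, tools []Tool) []Message {
	out := make([]Message, 0, len(history)+1)
	out = append(out, Message{Role: "system", Content: toolPrompt(tools)})
	for _, m := range history {
		if m.Role == "tool" {
			m = Message{Role: "user", Content: "Tool result:\n" + m.Content}
		}
		out = append(out, m)
	}
	return out
}

func main() {
	history := []Message{
		{Role: "system", Content: "You are a coding agent."},
		{Role: "user", Content: "Show me main.go"},
		{Role: "tool", Content: "package main"},
	}
	tools := []Tool{{Name: "read_file", Description: "Read a file from disk"}}
	for _, m := range withToolPrompt(history, tools) {
		fmt.Printf("[%s] %s\n", m.Role, m.Content)
	}
}
```

Keeping the tool instructions in their own message, ahead of the agent's system prompt, matches the "first system message" ordering described under Changes; whether that ordering actually raises priority depends on the model behind the endpoint.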