Skip to content

feat: Add warnings for invalid tool_choice and UTs #6582

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 27, 2025
Merged

Conversation

CatherineSue
Copy link
Collaborator

@CatherineSue CatherineSue commented May 25, 2025

Motivation

This PR improves error handling and user experience for invalid tool choices in SGLang's function calling system. Previously, when users specified a tool_choice with a function name that doesn't exist in the available tools, the system would silently fail with Skip invalid EBNF strings. This change adds proper validation and logging.
NOTE: the correct way is to add this kind of validation in the ChatCompletionRequest protocol and returns a 400. However, our adapter.py or protocol.py doesn't have this kind of check. And it is better to fix all similar issues in refactor. So right now we use a logging.warning to mitigate it.

The changes also introduce comprehensive unit tests for tool choice functionality across different models and scenarios, ensuring robust validation of the function calling system.

Modifications

  • Add validation in get_ebnf() method to check if the requested function exists in available tools
  • Add test_tool_choice.py to run_suite:
    • tool_choice="auto" in streaming and non-streaming modes
    • tool_choice="required" in streaming and non-streaming modes
    • Specific function selection via tool_choice={"type": "function", "function": {"name": "..."}}
    • Multi-tool scenarios
    • Error handling for invalid function names
    • Support for multiple model types (Llama, Qwen2.5, Mistral), which covers (Llama32Detector, MistralDetector, and Qwen25Detector)

Checklist

- Add checks for invalid tool_choice when tool_choice is
ToolChoice, e.g. function_name is not in tools
- Add test_tool_choice.py to run_suite
@CatherineSue
Copy link
Collaborator Author

UT result:
python3 -m unittest test_tool_choice.TestToolChoice

Screenshot 2025-05-24 at 10 24 01 PM

- Replace class-level FLAKY_TESTS with setUpClass configuration
 for better inheritance handling
- Remove warning system in favor of explicit flaky test marking
- Mark `test_multi_tool_scenario_auto` and
`test_multi_tool_scenario_required` as flaky
- Increase max_tokens from 200 to 400 for better tool calling
performance
- Update `test_tool_choice_specific_function_non_streaming` to
handle multiple weather calls for top 5 cities
@CatherineSue
Copy link
Collaborator Author

Updated UT with flaky tests instead of print warnings
Screenshot 2025-05-25 at 11 16 44 AM

@zhyncs zhyncs self-assigned this May 26, 2025
Copy link
Collaborator

@JustinTong0323 JustinTong0323 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@zhyncs zhyncs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verified locally

@zhyncs zhyncs merged commit 41ba767 into main May 27, 2025
22 of 39 checks passed
@zhyncs zhyncs deleted the chang/tool-call-tests branch May 27, 2025 23:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants