test(e2e): stabilize Mistral required streaming #1483
Signed-off-by: Chang Su <8605658+CatherineSue@users.noreply.github.com>
No actionable comments were generated in the recent review.
Walkthrough
The streaming chat completions test now explicitly sets the sampling temperature.
Estimated code review effort: 1 (Trivial), ~2 minutes
Pre-merge checks: 5 passed
Code Review
This pull request updates the test_tool_choice_required_streaming test case in e2e_test/chat_completions/test_function_calling.py by adding a temperature parameter set to 0.1 to the chat completions request. This change likely improves the determinism of the test output. I have no feedback to provide as there were no review comments.
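As a rough illustration of why a lower temperature tends to make the output more deterministic, the sketch below applies temperature scaling to a toy pair of logits. The logit values and the comparison temperature of 1.0 are assumptions chosen for illustration; the PR does not state the backend default, and real models sample over a full vocabulary under the structural-tag constraint.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities for logits divided by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits: index 0 stands for the preferred token,
# index 1 for a competing token. These values are made up.
logits = [2.0, 1.0]

p_reference = softmax_with_temperature(logits, 1.0)  # assumed comparison point
p_pinned = softmax_with_temperature(logits, 0.1)     # value pinned by this PR

print(f"temperature=1.0 -> preferred token probability {p_reference[0]:.4f}")
print(f"temperature=0.1 -> preferred token probability {p_pinned[0]:.4f}")
```

With the pinned value, nearly all probability mass sits on the highest-scoring token, so repeated runs are far less likely to diverge.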
Description
Problem
After PR #1478 merged, TestToolChoiceMistral.test_tool_choice_required_streaming started showing flakiness: required streaming sometimes produced no streamed tool-call chunks. The comparable non-streaming required test already pins temperature=0.2, and the stricter required streaming arguments test pins temperature=0.1; this smoke test was still using the backend default sampling temperature.
Solution
Pin the Mistral required streaming smoke test to temperature=0.1 so the assertion exercises streaming tool-call plumbing instead of sampling variance. I also checked the Mistral e2e setup: it uses the model default chat template and --tool-call-parser mistral; no special chat template is injected. The Mistral path still uses the structural-tag constraint from MistralParser::build_structural_tag.
Changes
temperature=0.1 for test_tool_choice_required_streaming in the shared tool-choice e2e base.
Test Plan
python3 -m py_compile e2e_test/chat_completions/test_function_calling.py
Checklist
cargo +nightly fmt passes
cargo clippy --all-targets --all-features -- -D warnings passes
Summary by CodeRabbit
Release Notes
This release contains internal test enhancements with no user-visible changes.
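
For readers who want a concrete picture of the smoke test this PR describes, here is a minimal sketch of what a required streaming check might look like. The test body itself is not shown in this PR, so the helper names build_request and collect_tool_calls are hypothetical, and the chunk layout assumes an OpenAI-compatible chat completions stream represented as plain dicts.

```python
def build_request(model, messages, tools):
    """Build request kwargs for a required, streamed tool call (hypothetical helper)."""
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "tool_choice": "required",
        "stream": True,
        "temperature": 0.1,  # pinned so the check does not hinge on sampling variance
    }


def collect_tool_calls(chunks):
    """Merge streamed tool-call deltas into complete calls, ordered by index."""
    calls = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for tool_call in delta.get("tool_calls") or []:
                entry = calls.setdefault(
                    tool_call["index"], {"name": "", "arguments": ""}
                )
                function = tool_call.get("function") or {}
                # The name usually arrives once; arguments arrive as fragments.
                entry["name"] += function.get("name") or ""
                entry["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]
```

A smoke test in this style would send build_request(...) to the server, feed the streamed chunks to collect_tool_calls, and assert that at least one tool call came back, which is the condition the flaky runs failed to meet.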