-
Notifications
You must be signed in to change notification settings - Fork 3k
Have tool call accuracy return a valid response, rather than throw exception, when response has no tool calls #40684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…s and to align with other prompt templates
…ception, when response has no tool calls
Thank you for your contribution @JoseCSantos! We will review the pull request and get back to you soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request modifies the tool call accuracy evaluator to return an empty list instead of raising an exception when no tool calls are present and to adjust the aggregation logic for cases with no tool calls.
- Returns empty input rather than throwing an exception when response has no tool calls.
- Updates the aggregation logic to return a NaN score and a clear result message when no per-turn results exist.
Files not reviewed (1)
- sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_intent_resolution/intent_resolution.prompty: Language not supported
# return empty input when there are no tool calls. From a user perspective this is preferrable to raising an exception | ||
# as the user will see explicitly the evaluator did not run, rather than seeing a null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo detected: 'preferrable' should be 'preferable'.
# return empty input when there are no tool calls. From a user perspective this is preferrable to raising an exception | |
# as the user will see explicitly the evaluator did not run, rather than seeing a null | |
# return empty input when there are no tool calls. From a user perspective this is preferable to raising an exception |
Copilot is powered by AI, so mistakes are possible. Review output carefully before use.
API change check API changes are not detected in this pull request. |
Description
Please add an informative description that covers that changes made by the pull request and link all relevant issues.
If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.
All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines