[Feature Request] Support unordered list of function calls for `expectedCall` #38

Currently, `expectedCall` appears to be designed for validating a single tool call. In more complex scenarios, such as agents that trigger multiple parallel tool calls or tasks that can be solved with different tools in any order, a strict one-to-one or ordered comparison fails the evaluation even when the model's logic is correct.

Example: given the prompt "Check the weather in London and Paris", the model might call `get_weather(city='London')` and then `get_weather(city='Paris')`, or vice versa. Both orderings should be considered a "Pass".
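To make the request concrete, here is a minimal sketch of what an order-insensitive comparison could look like. It is not the project's implementation: the call shape (`{"name": ..., "args": ...}`) and the helper name `calls_match_unordered` are assumptions made for illustration only. It treats both lists as multisets, so duplicate calls still have to match in count.

```python
import json
from collections import Counter


def _canonical(call):
    """Serialize a call to a string so it can be counted; arg key order is ignored."""
    # Assumed call shape: {"name": str, "args": dict}. Hypothetical, not the project's schema.
    return json.dumps(
        {"name": call["name"], "args": call.get("args", {})},
        sort_keys=True,
    )


def calls_match_unordered(expected, actual):
    """Return True when both lists contain the same calls, regardless of order."""
    return Counter(map(_canonical, expected)) == Counter(map(_canonical, actual))


expected = [
    {"name": "get_weather", "args": {"city": "London"}},
    {"name": "get_weather", "args": {"city": "Paris"}},
]
actual = [
    {"name": "get_weather", "args": {"city": "Paris"}},
    {"name": "get_weather", "args": {"city": "London"}},
]

print(calls_match_unordered(expected, actual))  # True
```

A multiset comparison was chosen over a plain set so that calling `get_weather` for London twice does not accidentally satisfy an expectation of one London call and one Paris call.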
