-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Description
Need to fix some example evals like : FAILED examples/triage_agent/evals.py::test_conversation_is_successful[messages0] - assert False == True or this one:
___________________________________________________________________________________________________ test_does_not_call_weather_when_not_asked[Hi!] ____________________________________________________________________________________________________
query = 'Hi!'
@pytest.mark.parametrize(
"query",
[
"Who's the president of the United States?",
"What is the time right now?",
"Hi!",
],
)
def test_does_not_call_weather_when_not_asked(query):
tool_calls = run_and_get_tool_calls(weather_agent, query)
> assert not tool_calls
E assert not [{'function': {'arguments': '{"location": "New York", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
examples/weather_agent/evals.py:44: AssertionError
==== short test summary info =====
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[Who's the president of the United States?] - assert not [{'function': {'arguments': '{"location": "United States"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[What is the time right now?] - assert not [{'function': {'arguments': '{"location": "", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
FAILED examples/weather_agent/evals.py::test_does_not_call_weather_when_not_asked[Hi!] - assert not [{'function': {'arguments': '{"location": "New York", "time": "now"}', 'name': 'get_weather'}, 'id': 'call_0', 'type': 'function'}]
====== 3 failed, 3 passed in 3.10s ======
Metadata
Metadata
Assignees
Labels
No labels
