Fix TypeError: unhashable type in ToolCorrectnessMetric for unhashable tool outputs (#2815) by HarperZ9 · Pull Request #2822 · confident-ai/deepeval

HarperZ9 · 2026-06-30T03:11:08Z

Summary

Fixes #2815. ToolCorrectnessMetric raises TypeError: unhashable type during component-level evaluation (evals_iterator()) when a tool call's output, or a value nested in its input_parameters, is an arbitrary unhashable object, e.g. a LangChain ToolMessage or a pydantic model that defines __eq__ without __hash__.

Root cause

ToolCorrectnessMetric._calculate_non_exact_match_score puts ToolCall instances into a set, which calls ToolCall.__hash__. That routes input_parameters and output through _make_hashable (deepeval/test_case/llm_test_case.py). Its final branch returned any non-collection object unchanged, assuming it was a primitive hashable type:

else:
    # For primitive hashable types (str, int, float, bool, etc.)
    return obj

When obj is unhashable, hash((self.name, input_params_hashable, output_hashable)) then raises TypeError.
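The failure mode can be reproduced without deepeval. The sketch below is only an illustration: `FakeToolCall` and `Unhashable` are hypothetical stand-ins, not deepeval classes. It assumes nothing beyond what is described above, namely that `__hash__` hashes a tuple of the fields, so one unhashable field makes the whole hash fail as soon as the object goes into a set.

```python
class Unhashable:
    """Stand-in for a tool output; defining __eq__ without __hash__ sets __hash__ to None."""

    def __eq__(self, other):
        return isinstance(other, Unhashable)


class FakeToolCall:
    """Hypothetical, simplified stand-in for ToolCall (not deepeval's implementation)."""

    def __init__(self, name, output=None):
        self.name = name
        self.output = output

    def __hash__(self):
        # Hashing a tuple hashes every element, so a single
        # unhashable element raises TypeError here.
        return hash((self.name, self.output))


def try_add_to_set(call):
    """Return "ok" if the call can be placed in a set, else the error text."""
    try:
        {call}
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"


result_ok = try_add_to_set(FakeToolCall("t", output="plain string"))
result_bad = try_add_to_set(FakeToolCall("t", output=Unhashable()))
print(result_ok)
print(result_bad)
```

The first call succeeds because every field is hashable; the second reports a TypeError raised from inside `__hash__`.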

Fix

Fall back to a stable repr() for any value that is not hashable. Primitives and already-hashable objects are returned unchanged, so existing behavior is preserved; only the previously-crashing path changes.

else:
    try:
        hash(obj)
    except TypeError:
        return repr(obj)
    return obj
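To see what the fallback yields for different kinds of values, here is a minimal sketch that isolates the branch above in a standalone function. The name `hashable_or_repr` and the two example classes are hypothetical and exist only for this illustration; the real code lives inside `_make_hashable`.

```python
def hashable_or_repr(obj):
    """Return obj unchanged if it is hashable, otherwise its repr() string."""
    try:
        hash(obj)
    except TypeError:
        return repr(obj)
    return obj


class WithRepr:
    """Unhashable object with a value-based __repr__ (hypothetical example class)."""

    __hash__ = None

    def __init__(self, content):
        self.content = content

    def __repr__(self):
        return f"WithRepr(content={self.content!r})"


class DefaultRepr:
    """Unhashable object that keeps the default object.__repr__."""

    __hash__ = None


# Hashable values pass through untouched.
same_str = hashable_or_repr("text")
same_int = hashable_or_repr(42)

# Unhashable values become strings, which are hashable.
a = hashable_or_repr(WithRepr("hi"))
b = hashable_or_repr(WithRepr("hi"))

# Keep both instances alive so they cannot share a memory address.
x, y = DefaultRepr(), DefaultRepr()
c, d = hashable_or_repr(x), hashable_or_repr(y)

print(a == b)
print(c == d)
```

One design consequence worth noting: with a value-based `__repr__`, two objects with the same content map to the same string. With the default `object.__repr__`, the string in CPython embeds the object's address, so two distinct live instances map to different strings and are treated as different values.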

Reproduction (before the fix)

from deepeval.test_case import ToolCall

class ToolMessage:          # any object that is unhashable
    __hash__ = None

hash(ToolCall(name="t", output=ToolMessage()))           # TypeError: unhashable type
hash(ToolCall(name="t", input_parameters={"x": ToolMessage()}))  # TypeError

Both return an int after the fix and the tool calls can be placed in a set.
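The nested `input_parameters` case can be sketched in the same way. Everything below is an assumption made for illustration: `make_hashable` is a hypothetical recursive helper (deepeval's `_make_hashable` may handle collections differently), and `FakeToolCall` and `Message` are stand-ins whose `__eq__` and `__hash__` are defined only for this example.

```python
def make_hashable(obj):
    """Hypothetical recursive conversion; deepeval's real _make_hashable may differ."""
    if isinstance(obj, dict):
        return frozenset((k, make_hashable(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(make_hashable(v) for v in obj)
    if isinstance(obj, set):
        return frozenset(make_hashable(v) for v in obj)
    # Final branch: the fallback described in this PR.
    try:
        hash(obj)
    except TypeError:
        return repr(obj)
    return obj


class Message:
    """Unhashable stand-in for a tool output with a value-based __repr__."""

    __hash__ = None

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"Message({self.text!r})"


class FakeToolCall:
    """Simplified stand-in for ToolCall (not deepeval's implementation)."""

    def __init__(self, name, input_parameters=None, output=None):
        self.name = name
        self.input_parameters = input_parameters
        self.output = output

    def _key(self):
        return (
            self.name,
            make_hashable(self.input_parameters),
            make_hashable(self.output),
        )

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return isinstance(other, FakeToolCall) and self._key() == other._key()


calls = {
    FakeToolCall("t", output=Message("hi")),
    FakeToolCall("t", output=Message("hi")),  # same key as the first call
    FakeToolCall("t", input_parameters={"x": Message("hi")}),
}
unique_count = len(calls)
print(unique_count)
```

In this sketch the first two calls collapse into one set entry because their keys match, while the third differs in `input_parameters`, so the set holds two entries.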

Tests

Adds test_tool_call_hashing_with_unhashable_types in tests/test_core/test_test_case/test_single_turn.py, covering an unhashable object both as output and nested in input_parameters. No API key required.

$ pytest tests/test_core/test_test_case/test_single_turn.py -q
37 passed

black --check and the existing hashing tests are clean.

Disclosure: authored with AI assistance (Claude), reviewed and verified before submitting.

…nfident-ai#2815)

ToolCorrectnessMetric places ToolCall instances into a set during component-level evaluation. ToolCall.__hash__ routes input_parameters and output through _make_hashable, whose final branch returned arbitrary objects unchanged, assuming they were primitive hashable types. When the output (or a value nested in input_parameters) is an unhashable object, such as a LangChain ToolMessage or a pydantic model that defines __eq__ without __hash__, hashing raised `TypeError: unhashable type`, breaking evals_iterator() on agent traces.

Fall back to a stable repr() for any value that is not hashable. Primitives and already-hashable objects are returned unchanged.

Adds a regression test covering an unhashable object both as the output and nested in input_parameters.

Closes confident-ai#2815

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel · 2026-06-30T03:11:11Z

Someone is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

HarperZ9 · 2026-06-30T03:21:33Z

A quick note on the red checks, so the CI status is not misleading:

test and all integration jobs (langchain, openai, crewai, llama-index, agentcore, strands, pydantic-ai, openai-agents) pass. The new regression test runs under test.
lint (black) fails on pre-existing drift in main, not on this PR. The job reports "47 files would be reformatted"; none of them are the two files this PR touches (deepeval/test_case/llm_test_case.py and tests/test_core/test_test_case/test_single_turn.py), both of which black leaves unchanged. The black.yml workflow has been failing on main since around June 10, independent of this change. I left the 47 unrelated files alone to keep this PR scoped to the fix.
Vercel is the fork-deploy auth check, not a code check.

Happy to rebase or adjust if you handle the black drift separately.

HarperZ9 wants to merge 1 commit into confident-ai:main from HarperZ9:fix/toolcall-unhashable-2815
