fix(agents): early dedup for repeated mutation tool calls#1557
Conversation
Review: early dedup for repeated mutation tool callsSummarySolid, well-scoped fix that brings mutation tools to parity with the existing query-dedup signal. The core insight — key mutations on The single most important thing: this touches the base agent loop, which CLAUDE.md classifies as LLM-affecting. The PR's own test plan lists Issues Found🟡 Eval run is a hard merge gate, not optional ( 🟡 (Confirmed there is no 🟢 Verify provider tolerance for the user-message-between-tool-results ordering ( 🟢 Domain tool names hardcoded in the base class ( Strengths
VerdictApprove with suggestions — the code is sound and the tests are strong. No code-level blocker, but treat the |
itomek
left a comment
There was a problem hiding this comment.
Additive advisory signal keyed on (tool, args); mutation still always executes. LGTM.
Generated by Claude Code
The base agent caught repeated query tool calls at the first repeat but had no equivalent for mutation tools. Before: a small model that re-issued an identical mutation (e.g.
mark_readon an already-read id) burned ~4 planning steps before the reactive loop-detector halted it, and could drop the last item in a mutation sequence. After: an identical mutation re-issue is caught at the first repeat and the agent gets a corrective re-plan signal, at parity with query dedup. The cache is keyed on(tool, normalized args)so mutations on different ids are never suppressed, and the signal is injected after execution — no call is silently dropped.Closes #1317.
Test plan
pytest tests/unit/agents/test_mutation_dedup.py— 6 tests (first-repeat detection, distinct-ids not suppressed, arg-order normalization, batch variants, non-mutation no-op, unhashable-args fallback)python util/lint.py --allgaia eval agenton the relevant category and--compareagainst the committed baseline; confirm no regression (acceptance criterion in fix(agents): early dedup for repeated mutation tool calls #1317; not optional for base-agent changes)