tool-call proactive hints: detect throwaway heredoc waste #69

Merged: dpetrou-continua merged 1 commit into main from dpetrou/tool-call-hints on Feb 22, 2026
@dpetrou-continua
Contributor

Summary

Adds a "proactive hint" channel that detects when the agent writes a throwaway heredoc script for an operation that an existing repo tool already covers.

The problem (mined from 300 real Pi sessions)

  • 9,012 inline Python heredoc scripts across 300 sessions
  • ~220K throwaway lines of code, 2.3M wasted tokens ($35 at codex rates)
  • 55% of wasted tokens come from Linear API + GCloud heredocs: the same 2-3 operations repeated 4000+ times
  • Example: agent writes a 492-line urllib script to create a Linear document and comment, when linear_consolidation.py exists

The fix

Detect reinvention patterns on tool_call and inject "better alternative" hints on tool_result:

  • Linear API query → pants run scripts:linear_consolidation -- dump
  • Linear API mutation → save script or extend linear_consolidation.py
  • GCloud logging → pants run sophon/scripts:triage_prod_logs
  • JSON extraction → use jq
  • Subprocess wrappers → run shell command directly
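
A minimal sketch of how the pattern-to-hint mapping above could be expressed. This is not the merged implementation: the `HintRule` shape, the rule IDs, the regexes, and the hint wording are all assumptions made for illustration. Only the tool names and the `pants run` targets come from the list above.

```typescript
// Hypothetical sketch: names, regexes, and hint text are assumptions,
// not the contents of the merged commit.
interface HintRule {
  id: string;
  matches: (command: string) => boolean;
  hint: string;
}

// An inline Python script fed through a heredoc, e.g. `python3 - <<'PY'`.
const PYTHON_HEREDOC = /\bpython3?\b[^\n]*<<-?\s*['"]?\w+['"]?/;
const isHeredoc = (cmd: string): boolean => PYTHON_HEREDOC.test(cmd);

// Rules are checked in order and the first match wins, so the more
// specific rules (mutation before query, gcloud before subprocess) come first.
const RULES: HintRule[] = [
  {
    id: "linear-mutation",
    matches: (cmd) =>
      isHeredoc(cmd) && /api\.linear\.app/.test(cmd) && /\bmutation\b/.test(cmd),
    hint: "Save the script or extend linear_consolidation.py.",
  },
  {
    id: "linear-query",
    matches: (cmd) => isHeredoc(cmd) && /api\.linear\.app/.test(cmd),
    hint: "Try: pants run scripts:linear_consolidation -- dump",
  },
  {
    id: "gcloud-logging",
    matches: (cmd) => isHeredoc(cmd) && /gcloud\s+logging/.test(cmd),
    hint: "Try: pants run sophon/scripts:triage_prod_logs",
  },
  {
    id: "json-extraction",
    matches: (cmd) =>
      /\bpython3?\s+-c\b/.test(cmd) && /json\.loads?\(/.test(cmd),
    hint: "Use jq for JSON extraction.",
  },
  {
    id: "subprocess-wrapper",
    matches: (cmd) =>
      isHeredoc(cmd) && /subprocess\.(run|check_output|Popen)\(/.test(cmd),
    hint: "Run the shell command directly.",
  },
];

// Returns the first rule whose pattern matches the tool_call command, if any.
function detectHint(command: string): HintRule | undefined {
  return RULES.find((rule) => rule.matches(command));
}
```

In this sketch `detectHint` would run on each tool_call command, and a non-empty result would be appended to the corresponding tool_result. Regex matching is cheap but approximate, so a real detector would likely need tuning against false positives.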

Waste taxonomy (from session mining)

| Category | Count | Tokens | % of tokens |
|---|---|---|---|
| GCloud heredocs | 2,264 | 697K | 30% |
| Linear API heredocs | 1,632 | 579K | 25% |
| Other heredocs | 3,787 | 751K | 32% |
| JSON one-liners | 775 | 72K | 3% |
| Subprocess wrappers | 529 | 182K | 8% |
| Verbose comments | 445 | 52K | 2% |
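
As a quick check on how the percentage column relates to the token column, the sketch below recomputes the shares from the token counts in the table. The variable and function names are illustrative only; the numbers are copied from the table.

```typescript
// Token counts in thousands, copied from the waste taxonomy table.
const tokensK: Record<string, number> = {
  "GCloud heredocs": 697,
  "Linear API heredocs": 579,
  "Other heredocs": 751,
  "JSON one-liners": 72,
  "Subprocess wrappers": 182,
  "Verbose comments": 52,
};

// Total wasted tokens across all categories, in thousands.
const totalK = Object.values(tokensK).reduce((sum, n) => sum + n, 0);

// Combined share of the named categories, rounded to a whole percent.
function sharePercent(...categories: string[]): number {
  const part = categories.reduce((sum, name) => sum + (tokensK[name] ?? 0), 0);
  return Math.round((part / totalK) * 100);
}

console.log(totalK); // 2333, i.e. about 2.3M tokens
console.log(sharePercent("GCloud heredocs", "Linear API heredocs")); // 55
```

The percentages are therefore shares of wasted tokens, not of script counts, which is where the 55% figure for Linear API + GCloud heredocs comes from.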

Testing

28 test files, 164 tests pass. New test file: tests/tool-call-hints.test.ts.

Session mining found 9K+ throwaway inline scripts (~2.3M wasted tokens)
across 300 real sessions. Top waste: Linear API heredocs (1,632x),
GCloud heredocs (2,264x), JSON processing (775x).

New intervention: detect when agent writes a heredoc for an operation
that has an existing repo tool, and append a "better alternative" hint
to the tool_result. Fires once per hint ID per session (deduped).

Patterns: linear query/mutation, gcloud logging/deploy, json->jq,
subprocess->direct shell.

Part of the waste taxonomy:
- 55% of wasted tokens are Linear + GCloud heredocs
- Same 2-3 operations repeated 4000+ times across sessions
- A single reusable tool/extension for each would eliminate most waste
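
The commit message says a hint fires once per hint ID per session. A minimal sketch of that dedup rule follows, assuming a hypothetical `HintDeduper` class and `appendHint` helper; neither name, nor the `[hint]` prefix, is taken from the commit.

```typescript
// Hypothetical sketch: class, method, and helper names are assumptions.
class HintDeduper {
  // session ID -> set of hint IDs already shown in that session
  private readonly seen = new Map<string, Set<string>>();

  /** Returns true the first time a hint ID is offered in a session, false afterwards. */
  shouldFire(sessionId: string, hintId: string): boolean {
    let fired = this.seen.get(sessionId);
    if (!fired) {
      fired = new Set<string>();
      this.seen.set(sessionId, fired);
    }
    if (fired.has(hintId)) return false;
    fired.add(hintId);
    return true;
  }
}

// Appends the hint to the tool result text unless it was already shown.
function appendHint(
  deduper: HintDeduper,
  sessionId: string,
  hintId: string,
  toolResult: string,
  hint: string,
): string {
  if (!deduper.shouldFire(sessionId, hintId)) return toolResult;
  return `${toolResult}\n\n[hint] ${hint}`;
}
```

Keying the set by session keeps a hint from repeating within one session while still letting it appear again in the next one.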
dpetrou-continua merged commit 6f88ca9 into main on Feb 22, 2026
2 checks passed