
Add ai-investigate shark-cli command for AI-driven leak investigation #2796

Open
pyricau wants to merge 4 commits into main from py/skill

Conversation

@pyricau
Member

@pyricau pyricau commented Feb 23, 2026

Summary

  • Adds a new ai-investigate command to shark-cli designed to be driven by an AI agent exploring Android heap dumps
  • Introduces LeakInvestigationSession to split the expensive BFS path-finding (runs once) from the cheap inspection+bisecting step (re-runs on each status override)
  • Exposes an interactive keyword DSL with the following commands: trace, node N, fields, instances, string, referrers, mark-leaking, mark-not-leaking, mark-unknown, select-group, select-trace, summary, help, exit
  • Output is always JSON — designed exclusively for agent consumption, not human use
  • Full investigation instructions are printed at session startup (agents read stdout, not --help)

Key design

Split pipeline: BFS path-finding runs exactly once per session via RealLeakTracerFactory.findLeaksWithCachedPaths(). Re-inspection calls reinspectPath() which skips BFS entirely and only re-runs the cheap ObjectInspector + bisecting stages. This makes override round-trips instantaneous.

Override precedence: Status overrides are injected as an ObjectInspector appended last in the inspector list. It clears the opposing reason set before adding its own reason, so overrides always win over standard inspectors (e.g. AndroidObjectInspectors) without any special-casing.

WIN detection: the JSON reports leakFound: true once no UNKNOWN nodes remain. culpritReferenceIndex then points to the bad reference, i.e. the crossing from the last NOT_LEAKING node to the first LEAKING node.

Agent workflow (printed at startup):

  1. Run trace — scan nodes[] for "leakingStatus": "UNKNOWN"
  2. Bisect the UNKNOWN window — pick the midpoint node
  3. Read class source (local repo or cs.android.com) + run fields N for runtime evidence
  4. Mark the node (mark-leaking / mark-not-leaking) with a concrete reason citing field values
  5. Repeat until leakFound: true
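
Steps 1, 2 and 5 of the workflow above can be sketched as a small shell helper. This is an illustration only, not code from this PR: `unknown_midpoint` is a hypothetical name, and it assumes the caller has already extracted the `leakingStatus` values from `nodes[]` in trace order.

```shell
# Hypothetical helper (not part of this PR).
# Arguments: one leakingStatus value per trace node, in trace order.
# Prints the index of the node in the middle of the UNKNOWN window.
# Returns 1 when no UNKNOWN node remains (the leakFound case).
unknown_midpoint() {
  local first last i status
  first=-1
  last=-1
  i=0
  for status in "$@"; do
    if [ "$status" = UNKNOWN ]; then
      if [ "$first" -lt 0 ]; then first=$i; fi
      last=$i
    fi
    i=$((i + 1))
  done
  if [ "$first" -lt 0 ]; then
    return 1
  fi
  echo $(( (first + last) / 2 ))
}

# Example: UNKNOWN window spans indexes 1..3, so the midpoint is 2.
# unknown_midpoint NOT_LEAKING UNKNOWN UNKNOWN UNKNOWN LEAKING   # prints 2
```

Marking the midpoint node halves the remaining window whichever status it receives, which is why the workflow asks for the midpoint rather than the first UNKNOWN node.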

JSON schema (trace commands)

{
  "group": 0,
  "traceInstance": 0,
  "overrides": 2,
  "suspectWindowStart": 3,
  "leakFound": false,
  "culpritReferenceIndex": -1,
  "nodes": [
    {
      "index": 0,
      "objectId": 12345,
      "className": "com.example.MyActivity",
      "leakingStatus": "NOT_LEAKING",
      "leakingStatusReason": "Activity#mDestroyed is false",
      "isSuspect": false
    }
  ]
}

Test plan

  • ./gradlew :shark:shark:compileKotlin :shark:shark-cli:compileKotlin passes
  • ./gradlew :shark:shark:test --tests shark.LeakInvestigationSessionTest — 11 tests pass
  • Detekt: 0 code smells
  • Manual smoke test with a real .hprof file
  • Override round-trip: mark-not-leaking 0 "test" → verify trace updates; mark-unknown 0 → verify revert
  • Verify leakFound flips to true and culpritReferenceIndex is set after all UNKNOWNs are resolved

🤖 Generated with Claude Code

@pyricau pyricau force-pushed the py/skill branch 2 times, most recently from 0f9e8a1 to 26391f4 Compare February 26, 2026 14:07
@pyricau pyricau marked this pull request as ready for review February 26, 2026 14:19
shark-cli gains a new `ai-investigate` command backed by a FIFO daemon
architecture. Run `shark-ai-investigate --hprof <file>` to load a heap
dump and receive a session shortcode; the wrapper then starts the daemon
in the background. Commands are sent via `ai-investigate-cmd <shortcode>
<command>` over named pipes. The daemon embeds a full algorithm guide
in its `--help` output so an AI agent can drive the investigation
autonomously.
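
To make the named-pipe round trip concrete, here is a toy sketch of one request and one reply over two FIFOs. It is not the daemon or the `ai-investigate-cmd` wrapper from this PR: `toy_daemon`, `send_command`, the `in`/`out` pipe names and the reply format are all assumptions made for illustration.

```shell
# Toy stand-in for the daemon (hypothetical): answers exactly one command.
# $1 is a directory that already contains the FIFOs "in" and "out".
toy_daemon() {
  local dir cmd
  dir=$1
  read -r cmd < "$dir/in"                    # blocks until a client opens "in" for writing
  printf '{"echo":"%s"}\n' "$cmd" > "$dir/out"  # blocks until the client opens "out" for reading
}

# Hypothetical client: send one command line, then print the reply.
send_command() {
  local dir
  dir=$1
  printf '%s\n' "$2" > "$dir/in"   # blocks until the daemon is reading
  cat "$dir/out"                   # blocks until the daemon has replied
}

# Assumed usage:
#   dir=$(mktemp -d) && mkfifo "$dir/in" "$dir/out"
#   toy_daemon "$dir" &
#   send_command "$dir" ping
```

Opening a FIFO blocks until the other end is opened too, so both sides must agree on the order of opens: request pipe first, reply pipe second.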

Daemon commands:
- `trace`: leak traces as structured JSON (with `key`, `firstLeakingObjectId`)
- `fields` / `string` / `array`: inspect heap objects
- `mark-leaking` / `mark-not-leaking`: supply leaking-status context
- `retained-size @<id>`: memory retained by an object
- `human-readable-trace`: plain-text leak summary
- `ping` / `close-session`

New in shark:
- `SingleObjectRetainedSizeCalculator`: retained size via exclude-and-reach
  two-BFS algorithm (provably correct for any reference graph)
- `HeapClass.readInstanceFieldCount()`: reads instance field count as a
  single unsigned short without allocating field records
- `ShallowSizeCalculator` fixes: array objects now include the ART object
  header omitted from HPROF records; class size uses static field values
  plus ArtField metadata instead of the raw HPROF record size

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add timestamped log entries for startup, every command, and notes
- Startup line calls out the daemon starting with the heap dump path
- Add `note TEXT` command to append free-form observations/hypotheses
  to the session log; update instructions to require notes continuously
- Add required `reason` parameter to every command without one; daemon
  rejects commands missing a reason
- Extract heap dump timestamp and metadata (AndroidMetadataExtractor)
  during analysis and include them in `trace` and `human-readable-trace`
  responses, matching the HeapAnalysisSuccess format
- Move PrettyPrintJson to its own file with unit tests; streaming
  pretty-printer with no parse/re-encode round trip
- Update AiInvestigateDaemonTest to pass reasons in all commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@tcmulcahy tcmulcahy left a comment


Some important feedback inline, but it's all optional, so I'm approving.


## Unreleased

`shark-cli` gains a new **AI-driven leak investigation skill**: run `shark-ai-investigate --hprof <file>` to load a heap dump and start a session. A shell wrapper prints a session shortcode, then starts the `ai-investigate` daemon in the background. An AI agent (or a human) can then send commands to the daemon via `ai-investigate-cmd <shortcode> <command>` — all over named pipes. Commands include `trace` (leak traces as structured JSON), `fields` / `string` / `array` (inspect objects), `mark-leaking` / `mark-not-leaking` (supply context), `retained-size` (memory retained by an object), and `human-readable-trace` (plain-text summary). The skill embeds a full algorithm guide in its `--help` output so an AI agent can drive the investigation autonomously.

I like this design, but I think the name ai-investigate is misleading, even if it is how this tool is likely to be used. I'd suggest shark-daemon start --hprof <file> and shark-daemon stop --shortcode ...

done

# --no-help is consumed by this wrapper and must not be forwarded to shark-cli.
# Rebuild $@ without it using a POSIX-safe eval loop.

Do you really need POSIX compat? Ancient bash (like the version that ships with Mac) supports arrays, so you can write much simpler code:

new_args=()
for arg in "$@"; do
  [[ $arg == --no-help ]] || new_args+=("$arg")
done
set -- "${new_args[@]}"
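
If POSIX compatibility does turn out to matter, the same filtering can be done without arrays and without eval. A minimal sketch, wrapped in a hypothetical `strip_no_help` function so it can be tried in isolation; in the wrapper the loop body would run at top level:

```shell
# Hypothetical demo function: drops --no-help from its arguments and
# prints the remaining arguments, one per line.
strip_no_help() {
  # "for arg do" iterates over a snapshot of "$@", so the positional
  # parameters can be rebuilt safely inside the loop.
  for arg do
    shift                                  # drop the argument being examined
    [ "$arg" = --no-help ] && continue     # skip it: do not re-append
    set -- "$@" "$arg"                     # re-append everything else, in order
  done
  [ $# -eq 0 ] || printf '%s\n' "$@"
}
```

Each argument is shifted off the front and appended at the back unless it is `--no-help`, so after one full pass `"$@"` holds the surviving arguments in their original order, with spaces inside arguments preserved.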

# (e.g. a task runner) would block until the daemon exits, because the daemon
# holds the write end of the pipe open. The daemon communicates via named
# pipes, not stdout, so nothing useful is lost.
"$SHARK_CLI" "$@" ai-investigate --session "$shortcode" >/dev/null 2>&1 &

Note that the daemon will still be tied to the session - if you close the window within which you ran this, the daemon will die too. You can fix this with nohup if desired.
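
A minimal sketch of the nohup variant. `launch_detached` is a hypothetical helper name, and the usage line assumes the same `$SHARK_CLI` and `$shortcode` variables as the wrapper line quoted above:

```shell
# Hypothetical helper: run a command in the background, ignoring SIGHUP,
# with stdin, stdout and stderr detached from the terminal.
launch_detached() {
  nohup "$@" </dev/null >/dev/null 2>&1 &
}

# Assumed usage in the wrapper:
# launch_detached "$SHARK_CLI" "$@" ai-investigate --session "$shortcode"
```

Because stdout is already redirected, nohup does not create a `nohup.out` file.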

# pipes, not stdout, so nothing useful is lost.
"$SHARK_CLI" "$@" ai-investigate --session "$shortcode" >/dev/null 2>&1 &

# Wait until the daemon creates its named pipes.

I don't see any code to capture the pid of the daemon - how to shut it down after?
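
One possible shape for this, as a sketch: record `$!` in a pid file at launch and kill from that file later. The helper names and the pid-file location are assumptions, and this would only be a fallback for a daemon that no longer responds to the `close-session` command listed in the commit message.

```shell
# Hypothetical helper: start a command in the background and record its pid.
# $1 is the pid-file path; the remaining arguments are the command to run.
start_with_pidfile() {
  local pidfile
  pidfile=$1
  shift
  "$@" >/dev/null 2>&1 &
  echo $! > "$pidfile"
}

# Hypothetical helper: terminate the process recorded in the pid file.
# Returns non-zero if the file is missing or the process is already gone.
stop_from_pidfile() {
  local pidfile pid
  pidfile=$1
  [ -f "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  rm -f "$pidfile"
  kill "$pid" 2>/dev/null
}

# Assumed usage (pid-file path is illustrative):
# start_with_pidfile "/tmp/shark-$shortcode.pid" "$SHARK_CLI" "$@" ai-investigate --session "$shortcode"
# stop_from_pidfile "/tmp/shark-$shortcode.pid"
```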


// Create named pipes after analysis so the wrapper's pipe-existence check
// doubles as a signal that the heap is loaded and the daemon is ready.
ProcessBuilder("mkfifo", inPath, outPath).start().waitFor()

This approach has some drawbacks. The main one is that if more than one client attempts to connect to the same daemon, things will break in a way that's really hard to debug. Also, FIFOs are just annoying (e.g. you have to be careful to avoid deadlocking).

For this reason, I think a Unix domain socket approach would be better. This allows traditional networking patterns, but without the security implications of an actual UDP/TCP server.
