| 1 | +# llmfix |
| 2 | + |
| 3 | +Self-healing for integration test failures. Runs Claude agents in parallel against failed Go packages, commits fixes in isolated git worktrees, and cherry-picks them back onto the caller's branch. |
| 4 | + |
| 5 | +## Components |
| 6 | + |
| 7 | +| File | Purpose | |
| 8 | +|---|---| |
| 9 | +| `manager.go` | Dispatches and supervises concurrent fix pipelines. Serializes cherry-picks. Persists agent status to `agents-status.json`. Recovers leftover worktrees. | |
| 10 | +| `operator.go` | Runs the triage → fix pipeline for one package. | |
| 11 | +| `claude.go` | Embeds prompts (`triage.md`, `fix.md`) and types the `claude -p` JSON envelope. | |
| 12 | +| `git.go` | Worktree creation/cleanup and cherry-pick helpers. | |
| 13 | +| `triage.md` | Prompt for the triage agent (model: `sonnet`). Classifies failures and syncs Jira CON-381 subtasks. | |
| 14 | +| `fix.md` | Prompt for the fix agent (model: `opus`). Edits code, validates with `go test` + `golangci-lint`, and commits one issue per commit. | |
| 15 | + |
| 16 | +## Flow |
| 17 | + |
| 18 | +```mermaid |
| 19 | +sequenceDiagram |
| 20 | + participant Caller |
| 21 | + participant Mgr as Manager |
| 22 | + participant Op as Operator |
| 23 | + participant Claude |
| 24 | + participant Jira |
| 25 | + |
| 26 | + Caller->>Mgr: Dispatch(failed package) |
| 27 | + Mgr->>Op: run in worktree (pinned to baseSHA) |
| 28 | + |
| 29 | + Op->>Claude: triage (sonnet) |
| 30 | + Claude->>Jira: search / create CON-381 subtasks |
| 31 | + Jira-->>Claude: issue keys |
| 32 | + Claude-->>Op: classified issues |
| 33 | + |
| 34 | + Op->>Claude: fix (opus, in worktree) |
| 35 | + Claude-->>Op: one commit per issue |
| 36 | + |
| 37 | + Op-->>Mgr: done |
| 38 | + Mgr->>Caller: cherry-pick commits onto HEAD<br/>(mutex-serialized) |
| 39 | +``` |
| 40 | + |
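The control flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code in `manager.go`: `runOperator`, `dispatchAll`, `result`, and the `pick` callback are hypothetical names, and the operator is a stub that fabricates commit IDs instead of invoking any agent.

```go
package main

import (
	"fmt"
	"sync"
)

// result is what one hypothetical operator run hands back to the manager.
type result struct {
	pkg     string
	commits []string // one commit per issue, in the order they were made
}

// runOperator stands in for the triage -> fix pipeline of a single package.
// ASSUMPTION: a real operator would run the agents inside a worktree; this
// stub only fabricates commit IDs so the control flow can be exercised.
func runOperator(pkg string) result {
	return result{pkg: pkg, commits: []string{pkg + "#1", pkg + "#2"}}
}

// dispatchAll runs one operator per failed package concurrently, then applies
// each operator's commits while holding a shared mutex. Only one goroutine
// calls pick at a time, and the commits of one package land contiguously.
func dispatchAll(pkgs []string, pick func(sha string)) {
	var (
		wg     sync.WaitGroup
		pickMu sync.Mutex
	)
	for _, pkg := range pkgs {
		wg.Add(1)
		go func(pkg string) {
			defer wg.Done()
			res := runOperator(pkg) // runs in parallel with other packages

			pickMu.Lock() // serialized section: stands in for cherry-picking
			defer pickMu.Unlock()
			for _, sha := range res.commits {
				pick(sha)
			}
		}(pkg)
	}
	wg.Wait()
}

func main() {
	var applied []string
	dispatchAll([]string{"pkg/a", "pkg/b", "pkg/c"}, func(sha string) {
		applied = append(applied, sha) // safe: only called while pickMu is held
	})
	fmt.Println(len(applied), "commits applied") // prints: 6 commits applied
}
```

The order in which packages are applied is not deterministic in this sketch; only the ordering within a package is preserved.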
| 41 | +## Key invariants |
| 42 | + |
| 43 | +- **Worktrees are pinned to `baseSHA`**, so a worktree created after other agents' cherry-picks have advanced `HEAD` still starts from the same commit as every other worktree. |
| 44 | +- **Cherry-picks are mutex-serialized** across all goroutines — only one touches the caller's `HEAD` at a time. |
| 45 | +- **One commit per issue.** The fix agent is instructed never to combine fixes. |
| 46 | +- **Status is persisted** to `agents-status.json` so a re-run can retry agents that didn't reach `completed`. |
| 47 | +- **Worktrees live under `.integration/worktree/<tag>`** and are recovered on startup if left over from a crashed run. |
| 48 | + |
| 49 | +## Outputs |
| 50 | + |
| 51 | +Under `<outputDir>/fix/`: |
| 52 | + |
| 53 | +- `agents.log` — interleaved log of all agents, each line prefixed by slug. |
| 54 | +- `agents-status.json` — persisted per-slug status (`dispatched` / `completed` / `failed`). |
| 55 | +- `<tag>-triage.json` — triage agent's classified issues. |
| 56 | +- `<tag>.jsonl` — fix agent's stream-json transcript. |
| 57 | +- `<tag>.stderr` — fix agent's stderr. |
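
A re-run could decide which agents to retry by reading `agents-status.json` and keeping every slug that did not reach `completed`. The sketch below assumes the file is a flat JSON object mapping slug to status string; the real schema is not shown here and may differ. `pendingSlugs` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// pendingSlugs parses a status document and returns, in sorted order, every
// slug whose status is anything other than "completed".
// ASSUMPTION: the document is a flat JSON object of slug -> status string.
func pendingSlugs(raw []byte) ([]string, error) {
	statuses := map[string]string{}
	if err := json.Unmarshal(raw, &statuses); err != nil {
		return nil, fmt.Errorf("parse agent status: %w", err)
	}
	var pending []string
	for slug, status := range statuses {
		if status != "completed" {
			pending = append(pending, slug)
		}
	}
	sort.Strings(pending) // map iteration order is random, so sort for stable output
	return pending, nil
}

func main() {
	// Hypothetical file contents, for illustration only.
	raw := []byte(`{"pkg-a":"completed","pkg-b":"failed","pkg-c":"dispatched"}`)
	pending, err := pendingSlugs(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(pending) // prints: [pkg-b pkg-c]
}
```

Treating `dispatched` the same as `failed` covers agents that were interrupted mid-run, for example by a crash.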