You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: src/mship/skills/systematic-debugging/SKILL.md
+32Lines changed: 32 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -212,6 +212,38 @@ You MUST complete each phase before proceeding to the next.
212
212
213
213
This is NOT a failed hypothesis - this is a wrong architecture.
214
214
215
+
## mship integration (REQUIRED when mship is present)
216
+
217
+
If `mship` is available in PATH and the current working directory is inside an mship workspace, you MUST invoke the tool at each methodology checkpoint. This generates the durable audit trace the supervisor relies on and enables tree-compilation tools to reconstruct your debugging path.
`<ref>` is a free-form pointer like `test-runs/5`, `HEAD`, or `<path>:<start>-<end>`. `--id` is optional (mship auto-generates an 8-char hex handle); use it when you want human-readable references (e.g. `--id h1`).
`--parent` points at the hypothesis id you are refuting. `--category` is optional; adopt the AgentRx failure taxonomy (e.g. `intent-plan-misalignment`, `tool-output-misread`, `invention-of-new-information`) if doing cross-session analysis.
230
+
231
+
-**When closing the investigation:**
232
+
```
233
+
mship debug resolved "<root cause and fix summary>"
234
+
```
235
+
Only an explicit `debug-resolved` closes the thread. A passing test run does NOT implicitly close it — write the explicit entry so the supervisor can audit what you concluded.
236
+
237
+
-**While running tests:**`mship test` during an open debug thread automatically enriches its journal entry with `parent=<latest-hypothesis-id>`. Nothing extra for you to do.
238
+
239
+
### If mship is NOT available
240
+
241
+
Non-mship projects, mship not on PATH, or running outside a workspace: fall back to the methodology as described in the rest of this skill. Log hypotheses and rule-outs as inline notes, commit messages, or PR comments — whatever durable medium is available. The structure (hypothesis → evidence → rule-out → resolution) stays the same; the storage differs.
242
+
243
+
### Why tight coupling here
244
+
245
+
Research from 2026 agentic-coding literature (Debug-gym, AgentRx, SWE-agent) is unambiguous: for debugging specifically, loosely coupled tools get skipped under context pressure, and the diagnostic trail vanishes. Mandating invocation at each step produces a verifiable trace the supervisor can reconstruct — which is the whole point of doing systematic debugging in the first place.
0 commit comments