184 changes: 184 additions & 0 deletions .agents/skills/opencode-session-debugging/SKILL.md
@@ -0,0 +1,184 @@
---
name: opencode-session-debugging
description: "Debug broken OpenCode sessions by locating sessions from an id, title, or directory, recursively inspecting child sessions, correlating transcripts, logs, background tasks, and current OpenCode source, then driving a TDD fix in an isolated PR worktree when code changes are needed. Use whenever the user mentions OpenCode session corruption, background task completion, child sessions, session ids, or a broken session transcript."
---

# OpenCode Session Debugging

Use this skill to investigate a broken OpenCode session from evidence first, then fix the responsible code only after the failure mechanism is pinned by a test.

## Non-Negotiables

1. Inspect the latest OpenCode source before relying on OpenCode behavior.
2. Find every child session recursively. A parent transcript alone is incomplete evidence.
3. Build a timestamped timeline from transcripts, OpenCode logs, plugin logs, and background-task state.
4. Use git history and blame for the failing subsystem before changing code.
5. If a fix is needed, switch to `work-with-pr`: create a new worktree, write the failing test first, open a PR, and do not merge until all required gates pass.
6. Treat raw `session.prompt` and `session.promptAsync` paths as suspect until the shared prompt gate has been audited.

## Phase 0: Capture Inputs

Accept any of these as the starting point:

- Session id, for example `ses_...`
- Session title
- Project directory where the session happened
- A symptom such as "background task completed and the session got corrupted"

If multiple inputs are present, use all of them. Do not ask for a narrower input when a search can disambiguate it.

## Phase 1: Update OpenCode Source

Keep a current OpenCode checkout outside the target repository. Prefer an existing sibling checkout; otherwise clone into `/tmp`.

```bash
# Prefer an existing sibling checkout; otherwise use a temporary clone in /tmp
if [ -d ../opencode/.git ]; then
  cd ../opencode
else
  [ -d /tmp/opencode-source/.git ] || git clone https://github.com/sst/opencode /tmp/opencode-source
  cd /tmp/opencode-source
fi

# Bring the checkout up to date and record the commit for the report
git fetch --all --prune
git pull --ff-only
git rev-parse HEAD
```

Use this source to verify current session, prompt, logging, storage, and background-task behavior. Do not assume older OpenCode behavior still applies.

## Phase 2: Locate the Session

Search by the strongest identifier first.

Prefer first-class session tools when they are available:

- `session_info` for one known session id
- `session_read` for transcript content
- `session_search` for title, directory, symptom, and message text
- `session_list` to find nearby sessions when only the directory or title is known

Fall back to direct file search when tool access is unavailable or incomplete.

By session id:

```bash
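# List every session id mentioned in the evidence, to find candidates
rg -n "ses_[A-Za-z0-9]+" ~/.claude/transcripts ~/.local/share/opencode 2>/dev/null
# Search for one known id (replace TARGET_SESSION_ID with the real id)
rg -n "TARGET_SESSION_ID" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null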
```

By title:

```bash
rg -n "TITLE_FRAGMENT" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null
```

By directory:

```bash
rg -n "ABSOLUTE_PROJECT_DIRECTORY" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null
```

Common evidence locations:

- `~/.claude/transcripts/*.jsonl`
- `~/.local/share/opencode/log/*.log`
- `~/.local/share/opencode`
- Project `.opencode/background-tasks`
- Project `.omo`
- OS temp logs such as `oh-my-opencode.log`

Record exact paths and timestamps for every artifact used.

## Phase 3: Recursively Find Child Sessions

For each discovered transcript, extract references to:

- `sessionID`, `session_id`, and `ses_...`
- `call_omo_agent`, `task`, delegate, and background-agent tool calls
- Background task ids and parent wake messages
- `session.prompt`, `promptAsync`, idle, error, compacting, and autocontinue events

Search each child session id again through transcripts and logs. Repeat until no new child ids appear. Return a parent-to-child tree with evidence paths.

Use this tree shape:

```text
ses_parent <title or first prompt> [transcript path]
├─ ses_child_a via <task/call/background id> at <timestamp> [transcript path]
│ └─ ses_grandchild via <task/call/background id> at <timestamp> [transcript path]
└─ ses_child_b via <task/call/background id> at <timestamp> [transcript path]
```
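
The repeat-until-no-new-ids search can be sketched as a small worklist loop. This is a minimal illustration, not part of the skill: the function name `collect_related_sessions` is hypothetical, it assumes ids match `ses_[A-Za-z0-9]+` and appear literally in the evidence files, and it uses plain `grep` so it runs where `rg` is not installed.

```bash
# Sketch: collect every session id transitively connected to a root id.
# ASSUMPTION: ids look like ses_[A-Za-z0-9]+ and appear literally in the files.
# Usage: collect_related_sessions ROOT_ID DIR [DIR...]
collect_related_sessions() (
  root=$1; shift
  seen=" $root "   # space-delimited set of ids already discovered
  queue=$root      # space-delimited worklist of ids still to search for
  while [ -n "$queue" ]; do
    # Pop the first id from the worklist.
    current=${queue%% *}
    case $queue in
      *" "*) queue=${queue#* } ;;
      *) queue= ;;
    esac
    # Every id found in any file that mentions the current id as a whole word.
    ids=$(grep -rlwF -- "$current" "$@" 2>/dev/null |
      while IFS= read -r file; do
        grep -oE 'ses_[A-Za-z0-9]+' "$file"
      done | sort -u)
    for id in $ids; do
      case $seen in
        *" $id "*) ;;  # already known
        *) seen="$seen$id "; queue="${queue:+$queue }$id" ;;
      esac
    done
  done
  # Print the full set, one id per line.
  printf '%s\n' $seen | sort
)
```

The output is only the set of connected ids. Co-occurrence in a file does not show which session spawned which, so the parent-to-child direction still has to come from the tool-call and wake-message evidence.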

## Phase 4: Build the Failure Timeline

Create a timeline with absolute timestamps. Include:

- User prompts and assistant turns
- Background task creation, completion, output delivery, and cancellation
- Parent wake enqueue, dispatch, reservation, requeue, and skip events
- OpenCode session lifecycle events such as idle, error, compacting, and process abort
- Any duplicate internal prompt or repeated assistant stream

For live or recently finished background tasks, prefer `background_output` to inspect task state before reading disk files. Use `background_cancel` only for an actually running stray task that is part of cleanup, and record why cancellation is safe.

Correlate this timeline with the latest OpenCode source. Label facts from runtime logs separately from inferences from source code.
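
One way to assemble such a timeline is to tag each line with its source and sort on the timestamp. The sketch below is illustrative only: `merge_timeline` is a hypothetical helper, and it assumes every relevant line begins with an ISO 8601 UTC timestamp of uniform precision followed by a space. Real OpenCode and plugin logs may need to be normalized into that shape first.

```bash
# Sketch: merge several evidence files into one chronological timeline.
# ASSUMPTION: relevant lines start with a timestamp like 2025-01-01T00:00:00Z
# followed by a space; timestamps share one precision so text order is time order.
# Usage: merge_timeline LABEL=FILE [LABEL=FILE...]
merge_timeline() (
  for spec in "$@"; do
    label=${spec%%=*}   # text before the first "="
    file=${spec#*=}     # text after the first "="
    # Keep timestamped lines only, then insert the source label after the timestamp.
    grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:.]+Z ' "$file" |
      awk -v label="$label" '{
        i = index($0, " ")
        print substr($0, 1, i - 1) " [" label "] " substr($0, i + 1)
      }'
  done | LC_ALL=C sort -s -k1,1
)
```

Keeping the source label on every line makes it easier to separate facts taken from runtime logs from inferences added later.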

## Phase 5: Prompt Gate Audit

Session corruption often comes from internal prompt injection. Audit this invariant before changing code:

- Production code may call `session.prompt` or `session.promptAsync` only through the shared gate path.
- New internal message routes must use `dispatchInternalPrompt` or an equivalent gate.
- Suspect patterns: duplicate idle/error/completion edges, `postDispatchHoldMs: 0`, raw prompt fallback when no session state is found, and routes that restore no state when dispatch is skipped.
- Check `src/shared/prompt-async-gate.ts`, `src/shared/prompt-async-route-audit.test.ts`, and the route-specific tests for the failing subsystem.
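
A first pass at this audit can be a plain text search for prompt calls outside the gate file. The helper below is a rough sketch under stated assumptions: the name `find_raw_prompt_calls` is hypothetical, the call sites are assumed to contain the literal text `session.prompt(` or `session.promptAsync(`, and test files are excluded by filename. It will miss aliased or destructured calls, so treat an empty result as a hint and not as proof.

```bash
# Sketch: list production call sites that invoke session.prompt or
# session.promptAsync outside the shared gate file.
# ASSUMPTION: calls are written literally as session.prompt( or session.promptAsync(.
# Usage: find_raw_prompt_calls [SRC_DIR]   (defaults to ./src)
find_raw_prompt_calls() (
  src_dir=${1:-src}
  grep -rnE --include='*.ts' -e 'session\.(prompt|promptAsync)\(' -- "$src_dir" |
    # Drop the gate itself and test files; whatever remains needs a manual look.
    grep -v -e 'prompt-async-gate\.ts:' -e '\.test\.ts:'
)
```

The function exits with status 0 when it prints at least one candidate and non-zero when it finds none, so it can be used directly in an `if` condition.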

## Phase 6: History and Blame

Before changing code, run history searches for the failing mechanism:

```bash
git log --oneline -- path/to/suspect-file.ts
git log -S "suspect symbol or string" -- path/to/suspect-file.ts
git log -G "suspect regex" -- path/to/suspect-file.ts
git blame -L START,END path/to/suspect-file.ts
```

Use blame to identify the intent of the current behavior. Use `git show` on relevant commits and compare their tests to the current failure.

## Phase 7: TDD Fix Workflow

If the evidence points to a code defect:

1. Invoke `work-with-pr` and create a fresh task worktree.
2. Write a regression test that fails without the fix.
3. Make the smallest code change that turns the test green.
4. Run focused tests, adjacent tests, typecheck, build, and the full suite when risk warrants it.
5. Open a PR to `dev`.
6. Do not merge until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass. If a `.omo/plans/*.md` plan is part of the workflow, Momus may review that plan and must return `[OKAY]`; do not count Momus as a PR diff reviewer.
7. Remove the worktree only after the merge commit succeeds.

## Report Template

Use this shape for the investigation report:

```markdown
Root cause:
<one mechanism, tied to runtime evidence>

Evidence:
- Parent session: <path>
- Child sessions: <ids and paths>
- OpenCode source: <checkout path and commit>
- Timeline: <key timestamped events>
- History: <relevant commits and blame lines>

Fix:
- PR: <number or link>
- Test proving the bug: <test file and test name>
- Validation: <commands and results>

Blocked gates:
- <only include if a required gate could not run or did not pass>
```
56 changes: 48 additions & 8 deletions .agents/skills/work-with-pr/SKILL.md
@@ -1,11 +1,19 @@
---
name: work-with-pr
description: "Full PR lifecycle: git worktree → implement → atomic commits → PR creation → verification loop (CI + review-work + Cubic approval) → merge. Keeps iterating until ALL gates pass and PR is merged. Worktree auto-cleanup after merge. Use whenever implementation work needs to land as a PR. Triggers: 'create a PR', 'implement and PR', 'work on this and make a PR', 'implement issue', 'land this as a PR', 'work-with-pr', 'PR workflow', 'implement end to end', even when user just says 'implement X' if the context implies PR delivery."
description: "Full PR lifecycle: always create a new git worktree, implement there, make atomic commits, open a PR to dev, and keep iterating until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass. Never merge early. After a merge commit, delete the worktree. Use whenever implementation work needs PR delivery: create a PR, implement and PR, work-with-pr, land this, or implement end to end."
---

# Work With PR — Full PR Lifecycle

You are executing a complete PR lifecycle: from isolated worktree setup through implementation, PR creation, and an unbounded verification loop until the PR is merged. The loop has three gates — CI, review-work, and Cubic — and you keep fixing and pushing until all three pass simultaneously.
You are executing a complete PR lifecycle: isolated worktree setup, implementation, PR creation, verification, merge, and cleanup.

Hard invariants:

1. Always create a fresh git worktree for the task. Never implement in the user's main worktree.
2. Target `dev` unless the user explicitly names another protected base and the repository policy allows it.
3. Do not merge until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass on the current head.
4. If any required gate is unavailable, rate-limited, skipped, stale, or ambiguous, do not merge. Report the blocker and leave the worktree for resumption.
5. After a successful merge commit, remove the task worktree and prune stale worktree metadata.
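
Invariants 1 and 5 map onto standard `git worktree` subcommands. The helpers below are a minimal sketch, assuming they run from inside the repository; the function names, branch name, and paths are hypothetical and are not taken from the skill's own setup section.

```bash
# Sketch of invariants 1 and 5. All names and paths are illustrative.

# Create a fresh worktree on a new branch.
# $1 = new branch name, $2 = worktree path, $3 = base ref (for example origin/dev)
create_task_worktree() {
  git worktree add -b "$1" "$2" "$3"
}

# Remove the worktree and prune stale metadata.
# $1 = worktree path. Call this only after the merge commit has succeeded.
remove_task_worktree() {
  git worktree remove "$1" && git worktree prune
}
```

A call might look like `create_task_worktree fix/session-gate ../task-wt origin/dev`. `git worktree remove` refuses to delete a worktree with uncommitted changes unless forced, which suits invariant 4: a blocked task keeps its directory for resumption.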

<architecture>

@@ -16,7 +24,8 @@ Phase 2: PR Creation → Push, create PR targeting dev
Phase 3: Verify Loop → Unbounded iteration until ALL gates pass:
├─ Gate A: CI → gh pr checks (bun test, typecheck, build)
├─ Gate B: review-work → 5-agent parallel review
└─ Gate C: Cubic → cubic-dev-ai[bot] "No issues found"
├─ Gate C: Cubic → cubic-dev-ai[bot] "No issues found"
└─ Gate D: AI review → GPT-5.2 xhigh PR review PASS
Phase 4: Merge → Merge commit, worktree cleanup
```

@@ -26,7 +35,7 @@ Phase 4: Merge → Merge commit, worktree cleanup

## Phase 0: Setup

Create an isolated worktree so the user's main working directory stays clean. This matters because the user may have uncommitted work, and checking out a branch would destroy it.
Create an isolated worktree so the user's main working directory stays clean. This is mandatory even for tiny changes because the user may have uncommitted work, other agents may be active, and PR cleanup depends on having a disposable task directory.

<setup>

@@ -160,7 +169,7 @@ PR_NUMBER=$(gh pr view --json number -q .number)

## Phase 3: Verification Loop

This is the core of the skill. Three gates must ALL pass for the PR to be ready. The loop has no iteration cap — keep going until done. Gate ordering is intentional: CI is cheapest/fastest, review-work is most thorough, Cubic is external and asynchronous.
This is the core of the skill. Four gates must ALL pass for the PR to be ready. The loop has no iteration cap — keep going until done. Gate ordering is intentional: CI is cheapest/fastest, review-work is broadest, Cubic is external and asynchronous, and the final AI review catches reasoning gaps.

<verify_loop>

@@ -172,7 +181,9 @@ while true:
4. If review fails → fix blocking issues, commit, push, continue
5. Check Cubic → Gate C
6. If Cubic has issues → fix issues, commit, push, continue
7. All three pass → break
7. Run GPT-5.2 xhigh PR review → Gate D
8. If AI review fails → fix blockers, commit, push, continue
9. All four pass → break
```

### Gate A: CI Checks
@@ -223,6 +234,8 @@ Cubic (`cubic-dev-ai[bot]`) is an automated review bot that comments on PRs. It

**Issue signal**: The comment lists issues with file-level detail.

**Not a pass**: neutral/skipped check runs, "started review" comments, billing or line-limit messages, stale comments from an older head SHA, or a generated PR summary without a review verdict.

```bash
# Get the latest Cubic review
CUBIC_REVIEW=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}/reviews" \
@@ -255,6 +268,31 @@ while true; do
done
```

### Gate D: GPT-5.2 xhigh PR Review

Run one independent final reviewer after CI, review-work, and Cubic pass. The reviewer must inspect the current PR head, not a stale local diff.

Acceptable pass signal:

- A GPT-5.2 reviewer running with xhigh reasoning returns `VERDICT: PASS`.

Momus is not a substitute for this PR-diff gate. Momus reviews `.omo/plans/*.md` execution plans and uses `[OKAY]` or `[REJECT]`; count it only as an additional plan gate when a plan review is explicitly in scope.

Review prompt template:

```text
Read-only review gate for PR #{PR_NUMBER}. Do not edit, commit, or push.
Worktree: {WORKTREE_PATH}
Base: origin/{BASE_BRANCH}
Head: {BRANCH_NAME}
Goal: {ORIGINAL_GOAL}

Inspect the diff and directly relevant files/tests. The first line of your reply must be exactly
VERDICT: PASS or VERDICT: FAIL. On FAIL, list only blocking issues, each with file and line.
```

On failure, fix only the blocking issues, commit atomically, push, and restart from Gate A.
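
Because the verdict must be the exact first line, the pass check can be mechanical. A minimal sketch, assuming the reviewer's reply has been saved to a file; the function name `review_passed` is hypothetical.

```bash
# Sketch: succeed only when the first line of the saved review is exactly
# "VERDICT: PASS". A verdict that appears later in the reply does not count.
# Usage: review_passed REVIEW_FILE
review_passed() {
  # Strip a trailing carriage return in case the reply used CRLF line endings.
  first_line=$(head -n 1 "$1" | tr -d '\r')
  [ "$first_line" = "VERDICT: PASS" ]
}
```

Anything other than an exact match, including an empty or truncated reply, counts as not passed, which matches the rule that an ambiguous gate blocks the merge.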

### Iteration discipline

Each iteration through the loop:
@@ -271,7 +309,7 @@ Avoid the temptation to "improve" unrelated code during fix iterations. Scope cr

## Phase 4: Merge & Cleanup

Once all three gates pass:
Once all four gates pass:

<merge_cleanup>

@@ -315,7 +353,7 @@ Summarize what happened:
- **PR**: #{PR_NUMBER} — {PR_TITLE}
- **Branch**: {BRANCH_NAME} → {BASE_BRANCH}
- **Iterations**: {N} verification loops
- **Gates passed**: CI | review-work | Cubic
- **Gates passed**: CI passed | review-work passed | Cubic passed | AI review passed
- **Worktree**: cleaned up
```

@@ -353,6 +391,8 @@ git rebase "origin/$BASE_BRANCH"
| Working in main worktree instead of isolated worktree | Pollutes user's working directory, may destroy uncommitted work | CRITICAL |
| Pushing directly to dev/master | Bypasses review entirely | CRITICAL |
| Skipping CI gate after code changes | review-work and Cubic may pass on stale code | CRITICAL |
| Merging while Cubic is skipped, rate-limited, or ambiguous | Required external review did not happen | CRITICAL |
| Counting Momus as the PR-diff reviewer | Momus reviews plans, not PR diffs | CRITICAL |
| Fixing unrelated code during verification loop | Scope creep causes new failures | HIGH |
| Deleting worktree on failure | User loses ability to inspect/resume | HIGH |
| Ignoring Cubic false positives without justification | Cubic issues should be evaluated, not blindly dismissed | MEDIUM |