code-yeongyu · herjarsa · May 23, 2026
diff --git a/.agents/skills/opencode-session-debugging/SKILL.md b/.agents/skills/opencode-session-debugging/SKILL.md
@@ -0,0 +1,184 @@
+---
+name: opencode-session-debugging
+description: "Debug broken OpenCode sessions by locating sessions from an id, title, or directory, recursively inspecting child sessions, correlating transcripts, logs, background tasks, and current OpenCode source, then driving a TDD fix in an isolated PR worktree when code changes are needed. Use whenever the user mentions OpenCode session corruption, background task completion, child sessions, session ids, or a broken session transcript."
+---
+
+# OpenCode Session Debugging
+
+Use this skill to investigate a broken OpenCode session from evidence first, then fix the responsible code only after the failure mechanism is pinned by a test.
+
+## Non-Negotiables
+
+1. Inspect the latest OpenCode source before relying on OpenCode behavior.
+2. Find every child session recursively. A parent transcript alone is incomplete evidence.
+3. Build a timestamped timeline from transcripts, OpenCode logs, plugin logs, and background-task state.
+4. Use git history and blame for the failing subsystem before changing code.
+5. If a fix is needed, switch to `work-with-pr`: create a new worktree, write the failing test first, open a PR, and do not merge until all required gates pass.
+6. Treat raw `session.prompt` and `session.promptAsync` paths as suspect until the shared prompt gate has been audited.
+
+## Phase 0: Capture Inputs
+
+Accept any of these as the starting point:
+
+- Session id, for example `ses_...`
+- Session title
+- Project directory where the session happened
+- A symptom such as "background task completed and the session got corrupted"
+
+If multiple inputs are present, use all of them. Do not ask for a narrower input when a search can disambiguate it.
+
+## Phase 1: Update OpenCode Source
+
+Keep a current OpenCode checkout outside the target repository. Prefer an existing sibling checkout; otherwise clone into `/tmp`.
+
+```bash
+# Existing sibling checkout, when present
+cd ../opencode
+git fetch --all --prune
+git pull --ff-only
+git rev-parse HEAD
+
+# Temporary checkout, when needed (clone once, then reuse and update)
+[ -d /tmp/opencode-source/.git ] || git clone https://github.com/sst/opencode /tmp/opencode-source
+cd /tmp/opencode-source
+git fetch --all --prune
+git pull --ff-only
+git rev-parse HEAD
+```
+
+Use this source to verify current session, prompt, logging, storage, and background-task behavior. Do not assume older OpenCode behavior still applies.
+
+## Phase 2: Locate the Session
+
+Search by the strongest identifier first.
+
+Prefer first-class session tools when they are available:
+
+- `session_info` for one known session id
+- `session_read` for transcript content
+- `session_search` for title, directory, symptom, and message text
+- `session_list` to find nearby sessions when only the directory or title is known
+
+Fall back to direct file search when tool access is unavailable or incomplete.
+
+By session id:
+
+```bash
+rg -n "ses_[A-Za-z0-9]+" ~/.claude/transcripts ~/.local/share/opencode 2>/dev/null
+rg -n "TARGET_SESSION_ID" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null
+```
+
+By title:
+
+```bash
+rg -n "TITLE_FRAGMENT" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null
+```
+
+By directory:
+
+```bash
+rg -n "ABSOLUTE_PROJECT_DIRECTORY" ~/.claude/transcripts ~/.local/share/opencode .opencode .omo 2>/dev/null
+```
+
+Common evidence locations:
+
+- `~/.claude/transcripts/*.jsonl`
+- `~/.local/share/opencode/log/*.log`
+- `~/.local/share/opencode`
+- Project `.opencode/background-tasks`
+- Project `.omo`
+- OS temp logs such as `oh-my-opencode.log`
+
+Record exact paths and timestamps for every artifact used.
+
+## Phase 3: Recursively Find Child Sessions
+
+For each discovered transcript, extract references to:
+
+- `sessionID`, `session_id`, and `ses_...`
+- `call_omo_agent`, `task`, delegate, and background-agent tool calls
+- Background task ids and parent wake messages
+- `session.prompt`, `session.promptAsync`, idle, error, compacting, and autocontinue events
+
+Search each child session id again through transcripts and logs. Repeat until no new child ids appear. Return a parent-to-child tree with evidence paths.
+
+Use this tree shape:
+
+```text
+ses_parent <title or first prompt> [transcript path]
+├─ ses_child_a via <task/call/background id> at <timestamp> [transcript path]
+│  └─ ses_grandchild via <task/call/background id> at <timestamp> [transcript path]
+└─ ses_child_b via <task/call/background id> at <timestamp> [transcript path]
+```
+
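The repeat-until-no-new-ids search in Phase 3 could be scripted roughly as follows. This is a sketch and not part of the skill file: `find_session_closure` is a hypothetical helper, and it assumes session ids appear as plain `ses_...` text in the searched files. It uses `grep` so it runs where `rg` is not installed.

```bash
# Hypothetical helper: print every session id reachable from a root id.
# Assumption: any file that mentions a session id also mentions its children
# as plain "ses_..." text. This over-approximates the real parent/child
# relation (siblings and parents are included), so confirm each edge against
# the actual task/call/background tool calls before drawing the tree.
find_session_closure() (
  root="$1"; shift
  seen=" $root "
  queue="$root"
  while [ -n "$queue" ]; do
    current="${queue%% *}"                  # pop the first queued id
    case "$queue" in
      *" "*) queue="${queue#* }" ;;
      *)     queue="" ;;
    esac
    printf '%s\n' "$current"
    # Files mentioning the current id, then every ses_ id inside those files.
    for id in $(grep -rlwF -- "$current" "$@" 2>/dev/null \
                  | while IFS= read -r f; do
                      grep -hoE 'ses_[A-Za-z0-9]+' "$f"
                    done \
                  | sort -u); do
      case "$seen" in
        *" $id "*) ;;                       # already visited
        *) seen="$seen$id "; queue="${queue:+$queue }$id" ;;
      esac
    done
  done
)
```

Example call: `find_session_closure TARGET_SESSION_ID ~/.claude/transcripts ~/.local/share/opencode`. The output is a flat candidate list; the parent-to-child edges and evidence paths still have to be filled in by hand.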
+## Phase 4: Build the Failure Timeline
+
+Create a timeline with absolute timestamps. Include:
+
+- User prompts and assistant turns
+- Background task creation, completion, output delivery, and cancellation
+- Parent wake enqueue, dispatch, reservation, requeue, and skip events
+- OpenCode session lifecycle events such as idle, error, compacting, and process abort
+- Any duplicate internal prompt or repeated assistant stream
+
+For live or recently finished background tasks, prefer `background_output` to inspect task state before reading disk files. Use `background_cancel` only for an actually running stray task that is part of cleanup, and record why cancellation is safe.
+
+Correlate this timeline with the latest OpenCode source. Label facts from runtime logs separately from inferences from source code.
+
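One way to assemble the Phase 4 timeline from several artifacts is sketched below. It is not part of the skill file. `merge_timeline` is a hypothetical helper, and it assumes every relevant line contains an ISO-8601 timestamp of the form `YYYY-MM-DDTHH:MM:SS` in a single time zone; the actual OpenCode and plugin log formats should be checked against the current source before relying on it.

```bash
# Hypothetical helper: merge timestamped lines from several files into one
# chronologically sorted stream of "timestamp<TAB>source file<TAB>raw line".
# Assumption: timestamps are ISO-8601 and share one time zone, so a plain
# byte-wise sort orders them correctly. Lines without a timestamp are dropped.
merge_timeline() (
  for f in "$@"; do
    awk -v src="$f" '
      match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
        printf "%s\t%s\t%s\n", substr($0, RSTART, RLENGTH), src, $0
      }' "$f"
  done | LC_ALL=C sort -s -t "$(printf '\t')" -k1,1
)
```

Keeping the source path in the second column preserves the distinction the phase asks for: each row stays attributable to a runtime artifact, so inferences from source code can be added separately.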
+## Phase 5: Prompt Gate Audit
+
+Session corruption often comes from internal prompt injection. Audit this invariant before changing code:
+
+- Production code may call `session.prompt` or `session.promptAsync` only through the shared gate path.
+- New internal message routes must use `dispatchInternalPrompt` or an equivalent gate.
+- Suspect patterns: duplicate idle/error/completion edges, `postDispatchHoldMs: 0`, raw prompt fallback when no session state is found, and routes that do not restore state when dispatch is skipped.
+- Check `src/shared/prompt-async-gate.ts`, `src/shared/prompt-async-route-audit.test.ts`, and the route-specific tests for the failing subsystem.
+
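A quick textual pass for the first invariant might look like the sketch below. It is not part of the skill file and does not replace `src/shared/prompt-async-route-audit.test.ts`. `list_raw_prompt_calls` is a hypothetical helper, and the regex is a heuristic: it only sees literal `session.prompt(` and `session.promptAsync(` calls, so aliased or destructured calls will be missed.

```bash
# Hypothetical helper: list literal session.prompt( / session.promptAsync(
# call sites in non-test TypeScript files, skipping the shared gate itself.
# Assumption: the gate lives in a file named prompt-async-gate.ts and tests
# end in .test.ts, as the paths named in Phase 5 suggest.
list_raw_prompt_calls() {
  # /dev/null is passed as an extra file so grep always prints file names.
  find "${1:-src}" -type f -name '*.ts' \
    ! -name '*.test.ts' ! -name 'prompt-async-gate.ts' \
    -exec grep -nE 'session\.prompt(Async)?\(' /dev/null {} + 2>/dev/null
  return 0   # an empty listing is a valid result, not an error
}
```

Each hit is only a candidate: read the call site to decide whether it already runs behind `dispatchInternalPrompt` or an equivalent gate.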
+## Phase 6: History and Blame
+
+Before changing code, run history searches for the failing mechanism:
+
+```bash
+git log --oneline -- path/to/suspect-file.ts
+git log -S "suspect symbol or string" -- path/to/suspect-file.ts
+git log -G "suspect regex" -- path/to/suspect-file.ts
+git blame -L START,END path/to/suspect-file.ts
+```
+
+Use blame to identify the intent of the current behavior. Use `git show` on relevant commits and compare their tests to the current failure.
+
+## Phase 7: TDD Fix Workflow
+
+If the evidence points to a code defect:
+
+1. Invoke `work-with-pr` and create a fresh task worktree.
+2. Write a regression test that fails without the fix.
+3. Make the smallest code change that turns the test green.
+4. Run focused tests, adjacent tests, typecheck, build, and the full suite when risk warrants it.
+5. Open a PR to `dev`.
+6. Do not merge until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass. If a `.omo/plans/*.md` plan is part of the workflow, Momus may review that plan and must return `[OKAY]`; do not count Momus as a PR diff reviewer.
+7. Remove the worktree only after the merge commit succeeds.
+
+## Report Template
+
+Use this shape for the investigation report:
+
+```markdown
+Root cause:
+<one mechanism, tied to runtime evidence>
+
+Evidence:
+- Parent session: <path>
+- Child sessions: <ids and paths>
+- OpenCode source: <checkout path and commit>
+- Timeline: <key timestamped events>
+- History: <relevant commits and blame lines>
+
+Fix:
+- PR: <number or link>
+- Test proving the bug: <test file and test name>
+- Validation: <commands and results>
+
+Blocked gates:
+- <only include if a required gate could not run or did not pass>
+```
diff --git a/.agents/skills/work-with-pr/SKILL.md b/.agents/skills/work-with-pr/SKILL.md
@@ -1,11 +1,19 @@
 ---
 name: work-with-pr
-description: "Full PR lifecycle: git worktree → implement → atomic commits → PR creation → verification loop (CI + review-work + Cubic approval) → merge. Keeps iterating until ALL gates pass and PR is merged. Worktree auto-cleanup after merge. Use whenever implementation work needs to land as a PR. Triggers: 'create a PR', 'implement and PR', 'work on this and make a PR', 'implement issue', 'land this as a PR', 'work-with-pr', 'PR workflow', 'implement end to end', even when user just says 'implement X' if the context implies PR delivery."
+description: "Full PR lifecycle: always create a new git worktree, implement there, make atomic commits, open a PR to dev, and keep iterating until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass. Never merge early. After a merge commit, delete the worktree. Use whenever implementation work needs PR delivery: create a PR, implement and PR, work-with-pr, land this, or implement end to end."
 ---
 
 # Work With PR — Full PR Lifecycle
 
-You are executing a complete PR lifecycle: from isolated worktree setup through implementation, PR creation, and an unbounded verification loop until the PR is merged. The loop has three gates — CI, review-work, and Cubic — and you keep fixing and pushing until all three pass simultaneously.
+You are executing a complete PR lifecycle: isolated worktree setup, implementation, PR creation, verification, merge, and cleanup.
+
+Hard invariants:
+
+1. Always create a fresh git worktree for the task. Never implement in the user's main worktree.
+2. Target `dev` unless the user explicitly names another protected base and the repository policy allows it.
+3. Do not merge until CI, review-work, Cubic, and GPT-5.2 xhigh PR review all pass on the current head.
+4. If any required gate is unavailable, rate-limited, skipped, stale, or ambiguous, do not merge. Report the blocker and leave the worktree for resumption.
+5. After a successful merge commit, remove the task worktree and prune stale worktree metadata.
 
 <architecture>
 
@@ -16,7 +24,8 @@ Phase 2: PR Creation   → Push, create PR targeting dev
 Phase 3: Verify Loop   → Unbounded iteration until ALL gates pass:
   ├─ Gate A: CI         → gh pr checks (bun test, typecheck, build)
   ├─ Gate B: review-work → 5-agent parallel review
-  └─ Gate C: Cubic      → cubic-dev-ai[bot] "No issues found"
+  ├─ Gate C: Cubic      → cubic-dev-ai[bot] "No issues found"
+  └─ Gate D: AI review  → GPT-5.2 xhigh PR review PASS
 Phase 4: Merge         → Merge commit, worktree cleanup
 ```
 
@@ -26,7 +35,7 @@ Phase 4: Merge         → Merge commit, worktree cleanup
 
 ## Phase 0: Setup
 
-Create an isolated worktree so the user's main working directory stays clean. This matters because the user may have uncommitted work, and checking out a branch would destroy it.
+Create an isolated worktree so the user's main working directory stays clean. This is mandatory even for tiny changes because the user may have uncommitted work, other agents may be active, and PR cleanup depends on having a disposable task directory.
 
 <setup>
 
@@ -160,7 +169,7 @@ PR_NUMBER=$(gh pr view --json number -q .number)
 
 ## Phase 3: Verification Loop
 
-This is the core of the skill. Three gates must ALL pass for the PR to be ready. The loop has no iteration cap — keep going until done. Gate ordering is intentional: CI is cheapest/fastest, review-work is most thorough, Cubic is external and asynchronous.
+This is the core of the skill. Four gates must ALL pass for the PR to be ready. The loop has no iteration cap — keep going until done. Gate ordering is intentional: CI is cheapest/fastest, review-work is broadest, Cubic is external and asynchronous, and the final AI review catches reasoning gaps.
 
 <verify_loop>
 
@@ -172,7 +181,9 @@ while true:
   4. If review fails      → fix blocking issues, commit, push, continue
   5. Check Cubic          → Gate C
   6. If Cubic has issues   → fix issues, commit, push, continue
-  7. All three pass       → break
+  7. Run GPT-5.2 xhigh PR review → Gate D
+  8. If AI review fails    → fix blockers, commit, push, continue
+  9. All four pass         → break
 ```
 
 ### Gate A: CI Checks
@@ -223,6 +234,8 @@ Cubic (`cubic-dev-ai[bot]`) is an automated review bot that comments on PRs. It
 
 **Issue signal**: The comment lists issues with file-level detail.
 
+**Not a pass**: neutral/skipped check runs, "started review" comments, billing or line-limit messages, stale comments from an older head SHA, or a generated PR summary without a review verdict.
+
 ```bash
 # Get the latest Cubic review
 CUBIC_REVIEW=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}/reviews" \
@@ -255,6 +268,31 @@ while true; do
 done
 ```
 
+### Gate D: GPT-5.2 xhigh PR Review
+
+Run one independent final reviewer after CI, review-work, and Cubic pass. The reviewer must inspect the current PR head, not a stale local diff.
+
+Acceptable pass signal:
+
+- A GPT-5.2 reviewer running with xhigh reasoning returns `VERDICT: PASS`.
+
+Momus is not a substitute for this PR-diff gate. Momus reviews `.omo/plans/*.md` execution plans and uses `[OKAY]` or `[REJECT]`; count it only as an additional plan gate when a plan review is explicitly in scope.
+
+Review prompt template:
+
+```text
+Read-only review gate for PR #{PR_NUMBER}. Do not edit, commit, or push.
+Worktree: {WORKTREE_PATH}
+Base: origin/{BASE_BRANCH}
+Head: {BRANCH_NAME}
+Goal: {ORIGINAL_GOAL}
+
+Inspect the diff and directly relevant files/tests. The first line of your reply must be exactly
+VERDICT: PASS or VERDICT: FAIL. On FAIL, list only blocking issues, each with file and line.
+```
+
+On failure, fix only the blocking issues, commit atomically, push, and restart from Gate A.
+
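The Gate D pass signal can be checked mechanically. The sketch below is not part of the skill file; `review_passed` is a hypothetical helper that assumes the reviewer's reply has been saved to a text file. It treats anything other than an exact first-line `VERDICT: PASS` as a non-pass, which matches the rule that ambiguous gates must not allow a merge.

```bash
# Hypothetical helper: succeed only when the first line of the saved review
# is exactly "VERDICT: PASS". A trailing carriage return is tolerated; a
# missing file, a FAIL verdict, or a verdict on a later line all count as
# "not passed".
review_passed() {
  [ "$(head -n 1 "$1" 2>/dev/null | tr -d '\r')" = "VERDICT: PASS" ]
}
```

Example: `review_passed review.txt || echo "Gate D not passed; do not merge"`, where `review.txt` is an illustrative file name.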
 ### Iteration discipline
 
 Each iteration through the loop:
@@ -271,7 +309,7 @@ Avoid the temptation to "improve" unrelated code during fix iterations. Scope cr
 
 ## Phase 4: Merge & Cleanup
 
-Once all three gates pass:
+Once all four gates pass:
 
 <merge_cleanup>
 
@@ -315,7 +353,7 @@ Summarize what happened:
 - **PR**: #{PR_NUMBER} — {PR_TITLE}
 - **Branch**: {BRANCH_NAME} → {BASE_BRANCH}
 - **Iterations**: {N} verification loops
-- **Gates passed**: CI ✅ | review-work ✅ | Cubic ✅
+- **Gates passed**: CI passed | review-work passed | Cubic passed | AI review passed
 - **Worktree**: cleaned up
 ```
 
@@ -353,6 +391,8 @@ git rebase "origin/$BASE_BRANCH"
 | Working in main worktree instead of isolated worktree | Pollutes user's working directory, may destroy uncommitted work | CRITICAL |
 | Pushing directly to dev/master | Bypasses review entirely | CRITICAL |
 | Skipping CI gate after code changes | review-work and Cubic may pass on stale code | CRITICAL |
+| Merging while Cubic is skipped, rate-limited, or ambiguous | Required external review did not happen | CRITICAL |
+| Counting Momus as the PR-diff reviewer | Momus reviews plans, not PR diffs | CRITICAL |
 | Fixing unrelated code during verification loop | Scope creep causes new failures | HIGH |
 | Deleting worktree on failure | User loses ability to inspect/resume | HIGH |
 | Ignoring Cubic false positives without justification | Cubic issues should be evaluated, not blindly dismissed | MEDIUM |