Skip to content

3ebb resilience landing (direct-to-main, manual bypass-merge) — fixes live e1bf + lands DD1-3/9d2c/284b#688

Merged
JoeOakhartNava merged 18 commits into
mainfrom
story/588e-abec-221f-42c0/3ebb-resilience-dd123-7
Jun 6, 2026
Merged

3ebb resilience landing (direct-to-main, manual bypass-merge) — fixes live e1bf + lands DD1-3/9d2c/284b#688
JoeOakhartNava merged 18 commits into
mainfrom
story/588e-abec-221f-42c0/3ebb-resilience-dd123-7

Conversation

@joeoakhart

Copy link
Copy Markdown
Collaborator

3ebb resilience landing — direct PR to main for manual bypass-merge

Why this PR exists (not the normal two-tier flow): the staged two-tier merge-to-main flow livelockedreview-sub-pr FP-amplified (escalating, re-litigated findings) so each landing attempt was slow enough that main advanced and re-stale-d the staged branch (3× determinate version-only conflicts: PRs #682/#683/#687). Per maintainer instruction, opening a direct PR to main for a manual bypass-merge.

The content is twice fp-recovery-cleared (0 findings):

What lands here (15 files; rebased onto current main):

  • 3ebb DD1–3 (tristate review-gate lattice, durable --resume INC-008/INC-015, escalation)
  • staged-force-push fix _branch_is_staged_promotionfixes the live e1bf GH013 regression on main
  • fix A (9d2c) schema-error diagnosability + O5 (284b) fallback_hops allowlist
  • 2 hardening fixes (str()-coerce schema errors; structural-error fail-closed on mergeability poll)

Bypass note: review-sub-pr/llm-review here is the FP-amplified check; all other required checks should be green. Manual web-UI bypass-merge by the named bypass actor. Closes the dd123-5/-6/-7 re-stage loop.

Test and others added 18 commits June 5, 2026 20:11
3ebb DD1 (resilience-harden, part 1): codify the review-gate decidability
lattice PASS/FAIL/INDETERMINATE and retrofit the first named Goal-1 gate.

- review-tristate-lib.sh: lattice helper (generalizes precondition-gate.sh
  exit-78). tristate_classify_verdict (unreviewed>0 -> FAIL/1 safe bottom;
  errors-only -> INDETERMINATE/75; clean -> PASS/0), tristate_is_transient_error
  (5xx/429/rate-limit/timeout/reset/gateway only), tristate_code_to_name.
- review-tristate-lattice.md: review-specific contract + security guardrail.
- review-coverage-invariant.sh: separates could-not-confirm (INDETERMINATE/75)
  from a genuine unreviewed SHA (FAIL/1); both still block under enforce.

TDD: test-review-tristate-lib.sh 20/20 (new); coverage-invariant T4 -> 75,
T4b violation-dominates -> 1; harness hermetic.

Refs 3ebb-fa93-f980-40de DD1.
DSO-Story: 3ebb-fa93-f980-40de
…arker-absent

Completes 3ebb DD1: retrofit the second named Goal-1 gate to the tristate lattice.

The marker-absent path (provenance verifier crashed / never ran) previously
exited 1 — conflated with the normal 'dispatch runner' exit 1. It is genuinely
'could not COMPUTE the dispatch decision' -> INDETERMINATE (exit 75): still
blocks (never the 8a77 silent-skip exit 0), but a distinct signal the
orchestrator may retry on a transient cause / route to escalation (DD3). Sources
review-tristate-lib.sh script-relative (hermetic under test).

TDD: test-llm-review-dispatch-or-skip.sh Test 9 updated (marker-absent -> 75);
43/43 green.

Refs 3ebb-fa93-f980-40de DD1.
DSO-Story: 3ebb-fa93-f980-40de
… never authoritative-empty

3ebb DD2 (durable --resume): treat a GitHub PR-state read as authoritative only
when the call succeeded AND returned well-formed JSON. A non-zero rc, an empty
payload, or truncated/malformed JSON (a page cut short) is INDETERMINATE.

- resume_pr_query_trustworthy (pure, INC-008 guard): rc 0 + valid JSON (incl. a
  legitimately-empty '[]') -> trustworthy; otherwise -> 75 (INDETERMINATE). The
  empty-but-valid '[]' case is deliberately distinguished from a truncated read.
- advance-to-PR2 wiring: the PR1-state read now routes through the guard. On an
  indeterminate read it re-fetches once, then HALTS as INDETERMINATE (exit 75)
  rather than GUESS a direction — collapsing partial state into '0 PRs' would
  advance onto unverified state (skip review) and into '>0' would mint a
  duplicate staged-*+PR1. Neither is safe to guess (INC-008 / INC-015 stall,
  hit twice this session).

TDD: test-merge-to-main-resume-inc008.sh 8/8 (new); resume regressions green
(staged-advance 12/12, existing-pr-discovery 6/6). No new shellcheck findings.

Refs 3ebb-fa93-f980-40de DD2.
DSO-Story: 3ebb-fa93-f980-40de
…dicts

3ebb DD3: an INDETERMINATE verdict from any gate routes to /dso:fp-recovery via
a single standardized, greppable (telemetry-ready) marker — never manual git
surgery. Last-resort by design: fires only after transient retries are spent; a
genuine FAIL never escalates here.

- tristate_indeterminate_escalation (review-tristate-lib.sh): emits the
  INDETERMINATE_ESCALATION JSON marker + an actionable banner; JSON-safe reason.
- wired at all three INDETERMINATE sites: review-coverage-invariant.sh
  (INDETERMINATE-enforce exit), llm-review-dispatch-or-skip.sh (marker-absent),
  merge-to-main-pr.sh (INC-008 resume halt, after one re-fetch).
- merge-to-main-pr.sh now sources review-tristate-lib.sh (script-relative).
- contract doc documents the escalation surface + the anti-friction stance
  (escalation rate is a defect signal, not normal operation).

TDD: test-review-tristate-lib.sh 23/23 (T8/T8b/T8c added); coverage-invariant
14/14, dispatcher 43/43, inc008 8/8, staged-advance 12/12 — all green.

Refs 3ebb-fa93-f980-40de DD3.
DSO-Story: 3ebb-fa93-f980-40de
…/tmp is a cache)

Completes 3ebb DD2: treat /tmp/merge-to-main-state-*.json as a CACHE, not the
source of truth (GitHub + git are), and kill the INC-015 stale-checkout stall.

- resume_staged_ref_is_spent (pure): a staged-* ref with 0 commits ahead of
  origin/main is SPENT (already merged / empty). Fail-safe: unknown count is NOT
  spent (never skip/GC on uncertainty).
- resume_gc_stale_staged_state: GC /tmp cache entries for staged-* branches that
  are GONE from origin or SPENT, so a later --resume never reconstructs off a
  completed promotion's stale state.
- current-branch assertion (INC-015 (a)): before the PR1/PR2 phase, if the
  worktree sits on a SPENT staged-* ref, emit a concrete recovery (checkout your
  feature branch) and exit 1 + GC the stale cache — instead of the confusing
  'No commits between main and staged-*' stall. Deliberately NOT routed to
  /dso:fp-recovery: a wrong-branch checkout is a DETERMINATE operational error,
  not an INDETERMINATE verdict (anti-over-reliance, per the friction review).

TDD: test-merge-to-main-resume-inc008.sh 17/17 (predicate T8-T11 + GC T12-T14);
resume regressions green (staged-advance 12, resume-and-stale 7, cwd 7,
pr2-statefile 2, existing-pr-discovery 6, freshness 6). No new shellcheck.

Refs 3ebb-fa93-f980-40de DD2.
DSO-Story: 3ebb-fa93-f980-40de
CI ShellCheck (warning threshold) flagged SC2120 — resume_gc_stale_staged_state
references $1 but production callers use the /tmp default; only the test passes
an explicit dir. Add a scoped shellcheck disable documenting the intentional
optional arg. Comment-only; no behavior change.

DSO-Story: 3ebb-fa93-f980-40de
…branch fix)

CI Script Tests failed T14_gc_keeps_live: the fixture used 'git branch
staged-spent main' / 'checkout -b staged-live main', but CI's
init.defaultBranch=master means no local 'main' existed -> 'src refspec does
not match any' -> branches never created -> GC removed their (gone) cache.
Push HEAD directly to named remote refs instead; no local-branch-name
dependency. Verified 17/17 both normally and with init.defaultBranch=master.
Production GC code unchanged (it was correct).

DSO-Story: 3ebb-fa93-f980-40de
…nate conflict recovery

cca8-DD1 follow-on (vetted via the epic 3-lens panel + convergence reviewer).
When origin/main advanced mid-PR2, _phase_merge rebased the staged-* ref and
force-pushed it; the staged-* ruleset requires review-sub-pr on pushes (a check
that only runs on pull_request events, never on a push), so the force-push was
structurally un-satisfiable -> GH013 rejection -> pipeline died with a misleading
CONFLICT_DATA.

- _branch_is_staged_promotion: single-source-of-truth predicate; folds the 4
  scattered '[[ $BRANCH == staged-* ]]' literals (N1) so PR2-phase semantics
  cannot drift.
- R-A: in _phase_merge, SKIP the entire publish block (sync + source-branch
  version-bump + push) when on a staged-* promotion. staged was already
  created/pushed/reviewed by PR1; GitHub PR2 rebase-merge replays it onto current
  main at merge time. Prevents the GH013 from relocating to the plain-push path.
- R-C: a true PR2 CONFLICTING is a DETERMINATE operational error -> concrete
  re-stage RECOVERY + exit 1, NOT routed to /dso:fp-recovery (which cannot rebase
  staged); poll until mergeable is computed first (N2).
- BEHIND-base refresh is intentionally NOT duplicated here — the existing
  auto-merge poll loop already calls gh pr update-branch (jira-dig-2529).

TDD: test-merge-to-main-resume-inc008.sh +6 predicate cases (both polarities,
23/23); linear-sync 6/6, staged-advance 12/12, resume-and-stale 7/7 green.
shellcheck -S warning clean.

Refs 3ebb-fa93-f980-40de; origin cca8-f1ed-ddd3-48f1 DD1.
DSO-Story: 3ebb-fa93-f980-40de
…imitations

Document the two race classes the two-tier-over-merge-queue choice does NOT
serialize away (consequences of declining a merge queue, mitigated by quiescing):
- L1: concurrent PR2s can be individually-green but semantically conflict on main
  (the merge-queue serialization this ADR traded away). Mitigation: one main-bound
  promotion at a time.
- L2: the 3ee4 enforce-flip races in-flight PR2s during the live<->file skew
  window. Mitigation: runbook quiesce (no PR2 in flight during the swap).
Both are correctness/operational limits, not integrity regressions (no unreviewed
content can reach main). Surfaced per a workflow-resilience review so they are not
rediscovered as surprises; the 3ebb DD3 drift-reconciler stretch goal is the place
to revisit if the concurrency cost becomes material.

DSO-Story: 3ebb-fa93-f980-40de
CI ShellCheck (warning) flagged SC2120 — _branch_is_staged_promotion references
$1 but production callers use the $BRANCH default; only the test passes an
explicit arg. Add the scoped disable (same pattern as resume_gc_stale_staged_state).
Comment-only; verified SC2120=0 at default severity (CI-equivalent); remaining 8
findings are pre-existing SC2004/SC2006 style (below CI's failure threshold).

DSO-Story: 3ebb-fa93-f980-40de
- F3 (real): add a cleanup trap for the inc008 GC temp dir so /tmp is not left
  stale on early exit/interrupt.
- F4 (real): resolve the tristate test's lib path via $BASH_SOURCE (script-
  relative) instead of 'git rev-parse --show-toplevel', robust under worktrees.
- F1 (relaxed): T4/T4b assert the behavioral exit code (75=INDETERMINATE,
  1=FAIL-dominates) + the offending SHA's presence, dropping the prose verdict
  word to reduce output-string coupling.
F2 (dispatcher marker assertion) defended: it asserts exit 75 + the stable
'provenance-complete.marker' filename (a contract path, not prose) — kept.

Tests green: tristate 23/23, inc008 23/23, coverage-invariant 14/14.
DSO-Story: 3ebb-fa93-f980-40de
…er, GC fail-safe, poll sleep)

Three real findings from review-sub-pr on PR #676 (production code):
- tristate_is_transient_error: the 5xx/429 globs (*50[0-9]*, *429*) matched a
  status code EMBEDDED in a larger number (e.g. '504800 bytes') -> could flip a
  non-transient INDETERMINATE into a retry, violating lattice rule 4. Anchor the
  numeric match on non-digit boundaries (grep -E '(^|[^0-9])(5[0-9]{2}|429)([^0-9]|$)').
  +4 edge-case tests (byte counts / offsets / SHA fragments stay non-transient); 27/27.
- resume_gc_stale_staged_state: an empty ls-remote was treated unconditionally as
  'branch gone -> GC cache', so a TRANSIENT ls-remote failure could delete a live
  staged ref's cache. Now distinguishes rc!=0 (transient -> keep, fail-safe) from
  rc==0+empty (truly gone -> GC).
- mergeability poll: skip the trailing sleep after the final (6th) attempt.

DSO-Story: 3ebb-fa93-f980-40de
- test-llm-review-dispatch-or-skip (T9 no-marker path): assert stderr emits the
  INDETERMINATE_ESCALATION: machine-routing marker (DD3), not just exit 75 — a
  partial impl that exits 75 without escalating would otherwise pass silently.
- test-review-coverage-invariant (T4b): assert BOTH range SHAs are walked
  (UNREVIEWED sha1 + INDETERMINATE sha2 + summary unreviewed=1 errors=1) so the
  violation-dominates-error scenario is genuinely exercised, not just sha2.
- test-merge-to-main-resume-inc008: isolate git config (GIT_CONFIG_GLOBAL/SYSTEM
  =/dev/null) so ambient ~/.gitconfig or /etc/gitconfig hooks/signing/aliases
  cannot leak into the init/clone/commit/push test setup.

All green: dispatch 44, coverage-invariant 14, inc008 23. shellcheck SC2120=0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The schema-correction-exhausted fail-closed path in the dso_ci_review runner
emitted only a generic "exhausted all N attempt(s)" message, omitting
_schema_result.errors. The violating field/value lived solely in the synthetic
schema_error finding description (written to the findings JSON, which CI does
not upload), so the actual violation was unrecoverable from the CI job log.

- runner.py: append the (truncated, max 10 + overflow) schema errors to the
  fail-closed stderr message. Enforcement is unchanged — still returns 1.
- runner.py: reconcile get_schema_correction_max_attempts docstring
  (default 1 -> 2) to match the code (read_config_int default 2).
- add test_runner_schema_exhaustion_stderr.py (RED->GREEN): asserts the schema
  error detail appears in stderr AND exit is non-zero (fail-closed preserved).

Diagnosability-only: does not change which findings block. Follow-up
284b-2f0a-3b12-41d6 tracks the functional convergence fix (frozen-field
re-rejection), blocked until a captured payload exists — which this enables.

Antipattern-Scan: fail-closed-swallows-schema-errors root=plugins/dso/scripts/dso_ci_review matches=0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (9d2c)

Address review finding on PR #680: the single-error test did not exercise the
truncation logic. Add test_schema_exhaustion_stderr_truncates_and_counts_overflow
(12 distinct errors) asserting the REAL formatting in runner.py — first 10
present, 11th/12th truncated, "(+2 more)" overflow suffix — so a broken slice or
overflow count now fails the test. Refactor shared setup into _run_exhaustion_path.

The pre-existing single-error test already pins `_err_preview` interpolation (it
fails if the errors are not interpolated into the message). Mocking
_validate_findings_schema (external subprocess) and dispatch_schema_correction
(live LLM dispatch) is appropriate unit isolation; the assertions verify the
runner's own formatting output, not mock passthrough.

Antipattern-Scan: fail-closed-swallows-schema-errors root=plugins/dso/scripts/dso_ci_review matches=0

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DSO-Story: 3ebb-fa93-f980-40de
…284b)

The dso_ci_review dispatcher appends a `fallback_hops` provenance key to the
result dict when a provider-fallback hop occurs (dispatch.py:817/852) — AFTER the
LLM output, so the schema-correction loop cannot strip it. The validator's
top-level allowlist omitted it, so any review that hopped providers failed schema
validation, exhausted the 2-attempt corrector, and blocked merge fail-closed with
0 content findings (the recurring "schema correction exhausted" failure; payload
captured on PR #684 via the 9d2c diagnosability logging).

Fix: add fallback_hops to optional_top in validate-review-output.sh. Per the
epic-588e panel, ALLOWLIST (not strip) — stripping unknown top-level keys would
mask a malformed/empty payload into a false pass.

RED->GREEN: test_code_review_dispatch_with_fallback_hops_passes; validator suite 96/96.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- runner.py: str()-coerce schema-error entries before join so a non-string entry
  (None/dict/int) cannot raise TypeError and crash the schema-exhaustion
  fail-closed branch. +regression test (non-string + truncation cases).
- merge-to-main-pr.sh: the PR mergeability poll now fails CLOSED when `gh pr view`
  exits non-zero (404 deleted PR / 401 revoked-or-expired token / API outage) and
  leaves mergeability unresolved, instead of silently proceeding as if mergeable.
  A legitimately-UNKNOWN-then-timeout with gh rc 0 keeps the prior behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@joeoakhart, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 47 minutes and 27 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4819fbdd-4db7-44ca-960f-f8dd130b9d52

📥 Commits

Reviewing files that changed from the base of the PR and between efe667c and beee290.

📒 Files selected for processing (15)
  • docs/adr/0020-two-tier-hardening-over-merge-queue.md
  • plugins/dso/.claude-plugin/plugin.json
  • plugins/dso/docs/contracts/review-tristate-lattice.md
  • plugins/dso/scripts/ci/review-coverage-invariant.sh
  • plugins/dso/scripts/dso_ci_review/runner.py
  • plugins/dso/scripts/lib/review-tristate-lib.sh
  • plugins/dso/scripts/llm-review-dispatch-or-skip.sh
  • plugins/dso/scripts/merge-to-main-pr.sh
  • plugins/dso/scripts/validate-review-output.sh
  • tests/hooks/test-validate-review-output.sh
  • tests/scripts/test-llm-review-dispatch-or-skip.sh
  • tests/scripts/test-merge-to-main-resume-inc008.sh
  • tests/scripts/test-review-coverage-invariant.sh
  • tests/scripts/test-review-tristate-lib.sh
  • tests/skills/dso_ci_review/test_runner_schema_exhaustion_stderr.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch story/588e-abec-221f-42c0/3ebb-resilience-dd123-7

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gitar-bot

gitar-bot Bot commented Jun 6, 2026

Copy link
Copy Markdown
CI failed: The CI build failed because the PR is targeting the 'main' branch directly, which violates the required 'staged-*' branch promotion policy.

Overview

One failure identified in the check-staged-head job. The build is correctly enforcing a branch naming convention to ensure the two-tier promotion workflow is followed.

Failures

Branch Naming Policy Violation (confidence: high)

  • Type: configuration
  • Affected jobs: 79846423777
  • Related to change: yes
  • Root cause: The source branch story/588e-abec-221f-42c0/3ebb-resilience-dd123-7 does not match the staged-* pattern required for PRs targeting main.
  • Suggested fix: Either retarget this PR to a staged-* branch, or merge this PR into a staged-* branch first and then open a follow-up PR into main.

Summary

  • Change-related failures: 1 (Branch naming policy enforcement).
  • Infrastructure/flaky failures: 0
  • Recommended action: Update the base branch of the PR to conform to the required promotion workflow as defined in the repository's CI configuration.
Code Review ✅ Approved

Deploys 3ebb resilience updates, including a fix for the live e1bf regression and improved schema error diagnostics. No issues found.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@JoeOakhartNava JoeOakhartNava merged commit 5452cad into main Jun 6, 2026
26 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants