Skip to content

feat: harden small PR timeout proof and truthful partial reviews#87

Open
keithah wants to merge 1 commit intoxbmc:mainfrom
keithah:m050-timeout-hardening-pr
Open

feat: harden small PR timeout proof and truthful partial reviews#87
keithah wants to merge 1 commit intoxbmc:mainfrom
keithah:m050-timeout-hardening-pr

Conversation

@keithah
Copy link
Copy Markdown
Member

@keithah keithah commented Apr 16, 2026

Summary

  • make timeout partial-review and Review Details output derive analyzed progress, findings, retry state, and phase timing from checkpoint plus visible evidence
  • extend verify:m048:s01, verify:m048:s02, and verify:m048:s03 plus the review-requested runbook with publication-aware timeout proof
  • harden deploy.sh / .dockerignore so ACR remote builds use a minimal staged context instead of large local workspace artifacts

Test Plan

  • bun test ./src/handlers/review.test.ts ./src/lib/review-utils.test.ts ./scripts/verify-m048-s01.test.ts ./scripts/verify-m048-s02.test.ts ./scripts/verify-m048-s03.test.ts && bun run tsc --noEmit

Copilot AI review requested due to automatic review settings April 16, 2026 23:03
@kodiai kodiai bot requested a review from a team April 16, 2026 23:03
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the “small PR timeout” behavior by making partial-review and Review Details output derive analyzed progress, findings, retry state, and phase timing from checkpoint + GitHub-visible evidence, and extends the M048 verifier/runbook surfaces to prove those guarantees. It also reduces ACR remote-build context size by staging a minimal build context in deploy.sh.

Changes:

  • Add timeout-aware “Review Details” rendering and publish it on timeout using checkpoint + extracted inline findings evidence.
  • Extend verify:m048:s01/s02/s03 and the runbook with outcome/timeout-class proof and deterministic timeout-surface fixtures.
  • Harden ACR remote builds by using a minimal staged context; expand .dockerignore to exclude additional local worktree artifacts.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/lib/review-utils.ts Adds timeoutProgress support to render timeout-truthful Review Details lines.
src/lib/review-utils.test.ts Tests that timeout progress lines render and generic totals are suppressed.
src/handlers/review.ts On timeout, derives analyzed progress from checkpoint + inline evidence, surfaces retry state, and publishes timeout Review Details accordingly.
src/handlers/review.test.ts Adds an integration-style timeout publication test covering analyzed progress + retry scheduling.
scripts/verify-m048-s01.ts Introduces outcome classification derived from phase evidence.
scripts/verify-m048-s01.test.ts Updates tests to assert new outcome fields and timeout-partial publication interpretation.
scripts/verify-m048-s02.ts Adds timeout-class comparison state and new failure status codes when timeouts persist/regress.
scripts/verify-m048-s02.test.ts Expands compare tests to cover timeout-class retirement/persistence/regression.
scripts/verify-m048-s03.ts Adds deterministic “timeout truth” fixture proofs and includes outcome in rendered live evidence output.
scripts/verify-m048-s03.test.ts Asserts timeout-surface fixture pass/fail behavior and updated report structure.
docs/runbooks/review-requested-debug.md Documents how to run M048 verifiers for timeout-truth proof and interpretation.
deploy.sh Stages a minimal ACR build context (avoids sending large local workspace artifacts).
.dockerignore Ignores additional local-workspace directories.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/handlers/review.ts
Comment on lines +3803 to +3808
const buildReviewDetailsBody = (timeoutProgress?: {
analyzedFiles: number;
totalFiles: number;
findingCount: number;
retryState: string;
}): string => {
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

buildReviewDetailsBody redefines an inline timeoutProgress shape that duplicates the exported TimeoutReviewDetailsProgress type from src/lib/review-utils.ts. This risks drift between the handler and formatter if the fields change. Import and use TimeoutReviewDetailsProgress (or re-export a handler-local type) so the compiler keeps them in sync.

Copilot uses AI. Check for mistakes.
Comment on lines +112 to +116
## M050 Timeout-Truth Verifier Surfaces

Use the M048 verifier family directly when you need a machine-checkable answer for the repaired small-PR timeout class.

### Local deterministic timeout-truth proof
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new runbook section header says “M050 Timeout-Truth Verifier Surfaces”, but the commands and narrative immediately below reference the “M048 verifier family” (verify:m048:s01/s02/s03). Rename the header to match the actual verifier suite to avoid operator confusion.

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +163
export function deriveM048S01Outcome(evidence: PhaseTimingEvidence | null | undefined): M048S01Outcome {
const conclusion = evidence?.conclusion ?? null;
const published = evidence?.published ?? null;

if (!evidence) {
return {
class: "unknown",
conclusion,
published,
summary: "no correlated phase evidence available",
};
}

if (conclusion === "timeout_partial" || (conclusion === "timeout" && published === true)) {
return {
class: "timeout_partial",
conclusion,
published,
summary: "timeout_partial (visible partial output published)",
};
}

if (conclusion === "timeout") {
return {
class: "timeout",
conclusion,
published,
summary: "timeout (no visible output published)",
};
}

if (conclusion === "success") {
return {
class: "success",
conclusion,
published,
summary: published === true ? "success (published output)" : "success (no published output)",
};
}

return {
class: conclusion ? "failure" : "unknown",
conclusion,
published,
summary: conclusion
? `${conclusion} (${published === true ? "published output" : published === false ? "no published output" : "publication unknown"})`
: "no correlated phase evidence available",
};
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deriveM048S01Outcome() returns summary text “no correlated phase evidence available” when evidence exists but evidence.conclusion is null. Since PhaseTimingEvidence.conclusion is nullable, this case can happen without evidence being missing, and the message becomes misleading. Consider a dedicated branch for evidence && conclusion === null (and possibly published === null) with a summary like “phase evidence missing conclusion/publication fields”, and/or add an issue in buildPhaseTimingEvidence when those fields are absent.

Copilot uses AI. Check for mistakes.
@kodiai
Copy link
Copy Markdown

kodiai bot commented Apr 16, 2026

Review Details
  • Files reviewed: 13

  • Findings: 0 critical, 0 major, 0 medium, 0 minor

  • Lines changed: +1001 -126

  • Profile: strict (manual config)

  • Contributor experience: profile-backed (using linked contributor profile guidance)

  • Review completed: 2026-04-16T23:09:28.095Z

  • Total wall-clock: 5m 36s

  • Phase timings:

    • queue wait: 0ms
    • workspace preparation: 4.5s
    • retrieval/context assembly: 2.7s
    • executor handoff: 11s
    • remote runtime: 5m 16s
    • publication: 226ms (degraded: captured before publication completed)
  • Tokens: 162 in / 12,390 out | 0.7682

  • Keyword parsing:

    • conventional type: feat

Copy link
Copy Markdown

@kodiai kodiai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kodiai response

Decision: APPROVE
Issues: none

Evidence:

  • Review prompt covered 13 changed files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants