Skip to content

Post-script should guard CHANGES_REQUESTED dismissal against unchanged diffs #2999

Description

@fullsend-ai-retro

What happened

On PR #644, the review agent submitted CHANGES_REQUESTED verdicts on 3 separate runs (Jun 19, Jun 23, Jun 29) identifying build-breaking issues (missing go.sum entries for a v5→v6 Go module path change). On the 4th run (workflow run 28583483847, Jul 2), the agent approved the same semantic diff on a new force-push. The post-script then: (1) dismissed all 3 prior CHANGES_REQUESTED reviews, (2) minimized 3 stale review threads, (3) submitted an APPROVE review with an empty body, and (4) applied the ready-for-merge label. The PR's go.mod change was still broken — no go.sum updates were ever added. PRIOR_REVIEW_SHA was empty on all 4 runs, so the agent had no memory of its own prior rejections.

What could go better

The post-script trusts the agent's verdict unconditionally. When the agent approves, the post-script dismisses all prior rejections and labels the PR as merge-ready — even if the code diff hasn't meaningfully changed since those rejections. This creates a safety gap: LLM non-determinism can produce an approval that contradicts prior findings, and the post-script amplifies that error by clearing the rejection history and signaling the PR is safe to merge.

This is distinct from existing issues: #2746 covers the non-deterministic verdict (the agent's behavior), #1552/#2816 cover passing prior findings forward (the agent's context), but no existing issue addresses the post-script's dismissal behavior as a separate safety layer. The post-script is the last line of defense before ready-for-merge is applied.

Confidence: High that this is a real safety gap. The PR was autoclosed before any damage occurred, but if a human had been watching and trusted the ready-for-merge label, broken code could have merged. Moderate confidence that a diff-check in the post-script is the right fix — an alternative is to fix the upstream issues (#1552, #1031) so the agent never flip-flops in the first place, but defense-in-depth argues for both.

Proposed change

In the review post-script (the script that processes the agent's verdict and posts the GitHub review), add a guard before dismissing prior CHANGES_REQUESTED reviews:

  1. When the new verdict is APPROVE and there are prior CHANGES_REQUESTED reviews on the same PR:
    a. Compare the current PR diff against the diff at the time of the most recent CHANGES_REQUESTED review (using the commit SHAs from the review records).
    b. If the diff is semantically unchanged (identical file changes), do NOT dismiss the prior rejections. Instead, post the APPROVE as a COMMENT-type review with a note that prior findings may still apply, and do NOT apply ready-for-merge.
    c. If the diff has meaningfully changed, proceed with dismissal as normal.

  2. As a simpler first step: require the agent's APPROVE review body to be non-empty and to explicitly address or acknowledge prior findings when prior CHANGES_REQUESTED reviews exist. If the body is empty (as it was in this case), downgrade the verdict to COMMENT. This aligns with existing issue Review agent APPROVED GitHub review should include a brief body, not be empty #1046 (APPROVE should include a brief body) but adds the safety constraint.

Validation criteria

  1. Reproduce the scenario: a PR with a prior CHANGES_REQUESTED review from the review agent, followed by a force-push with an identical semantic diff, where the agent submits APPROVE. The post-script should NOT dismiss the prior rejection and should NOT apply ready-for-merge.
  2. Verify normal flow still works: a PR where the agent requests changes, the author fixes the code (different diff), and the agent approves — the post-script should dismiss the prior rejection and apply ready-for-merge as before.
  3. Over the next 10 review runs that follow a prior CHANGES_REQUESTED, verify that no ready-for-merge label is applied when the diff is unchanged.

Generated by retro agent from konflux-ci/build-service#644

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    Status
    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions