Review agent: detect inconsistent config between sibling CI workflow jobs #2646

Description

@fullsend-ai-retro

What happened

In PR #2063, a human author added a new test-sandbox-darwin job to .github/workflows/lint.yml. The existing test job in the same file blanked GH_TOKEN and GITHUB_TOKEN for security isolation and used actions/checkout@v7 + actions/setup-go@v6. The new job omitted the token blanking and used older action versions (checkout@v6, setup-go@v5).

The review bot (fullsend-ai-review[bot]) ran at least 5 times between Jun 9 and Jun 25 and flagged protected-path, over-broad test scope, and minor code issues, but it never flagged the missing token isolation or the stale action versions. Both were caught on Jun 22, 13 days into that period, by a human-run "Review Squad" (5 agents), which flagged them as HIGH (stale versions, consensus 4/5 agents) and MEDIUM (missing token isolation, consensus 3/5 agents). The author then fixed both issues before merge.

What could go better

The review bot should detect when a new CI job is added to a workflow file that already contains similar jobs, and compare the new job's configuration against its siblings for missing patterns. This is a structural consistency check that requires no external knowledge — the review agent already reads the full diff and surrounding file context.

Specifically:

  1. Security env blanking: If existing jobs in the same workflow explicitly blank sensitive tokens (GH_TOKEN: '', GITHUB_TOKEN: ''), a new job that omits this is likely a security oversight.
  2. Action version consistency: If existing jobs use actions/checkout@v7 (SHA-pinned), a new job using actions/checkout@v6 is inconsistent and likely a copy-paste from an outdated source.
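
As a rough illustration of the two checks above, here is a minimal Python sketch. It assumes the workflow has already been parsed into plain dicts (for example by a YAML parser), that the token blanking lives in a job-level `env` block, and that actions are referenced by tag (`name@vN`). The function names and dict shapes are hypothetical, not the review agent's actual code, and SHA-pinned references are skipped because there is no tag to compare.

```python
import re

# Hypothetical parsed form of the two jobs described above. Placing the
# blanked tokens in a job-level `env` block is an assumption of this sketch.
existing_job = {
    "env": {"GH_TOKEN": "", "GITHUB_TOKEN": ""},
    "steps": [
        {"uses": "actions/checkout@v7"},
        {"uses": "actions/setup-go@v6"},
    ],
}
new_job = {
    "steps": [
        {"uses": "actions/checkout@v6"},
        {"uses": "actions/setup-go@v5"},
    ],
}


def blanked_env_keys(job):
    """Return the env keys a job explicitly sets to an empty value."""
    return {k for k, v in (job.get("env") or {}).items() if v in ("", None)}


def action_majors(job):
    """Map action name -> major version for steps tagged like `name@v7`."""
    majors = {}
    for step in job.get("steps") or []:
        m = re.fullmatch(r"([^@\s]+)@v(\d+)(?:\.\d+)*", step.get("uses", ""))
        if m:  # refs that are not version tags (e.g. SHAs) are skipped
            majors[m.group(1)] = int(m.group(2))
    return majors


def compare_to_sibling(new, sibling):
    """List env blanking and action versions where `new` lags `sibling`."""
    findings = []
    for key in sorted(blanked_env_keys(sibling) - blanked_env_keys(new)):
        findings.append(f"env {key} is blanked in sibling job but not in new job")
    sibling_versions = action_majors(sibling)
    for name, major in sorted(action_majors(new).items()):
        if name in sibling_versions and major < sibling_versions[name]:
            findings.append(
                f"{name}@v{major} is older than sibling's v{sibling_versions[name]}"
            )
    return findings


findings = compare_to_sibling(new_job, existing_job)
for line in findings:
    print(line)
```

For the two jobs modeled here, the sketch reports four findings: two missing token overrides and two older action versions.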

I'm fairly confident this is a real gap (not a fluke) because both findings are straightforward pattern-matching against sibling jobs in the same file. The bot had all the information it needed — the diff included the new job, and the surrounding file context included the existing test job with the correct patterns. The human Review Squad caught both with high consensus (3–4 out of 5 agents), suggesting these are not subjective calls.

Uncertainty: I don't know the review agent's current prompt or whether it already has CI-specific heuristics. If it does, the issue may be prompt prioritization rather than missing capability.

Proposed change

Add a review heuristic (in the review agent's CI/workflow analysis path) that, when a PR adds a new job to an existing GitHub Actions workflow file, compares the new job's configuration against sibling jobs for:

  1. Missing environment variable overrides — if sibling jobs set env keys to blank/restricted values, flag when the new job omits them.
  2. Divergent action versions — if sibling jobs pin a GitHub Action to version X (by tag or SHA), flag when the new job uses an older version of the same action.
  3. Missing permissions constraints — if sibling jobs specify permissions, flag when the new job omits the block.

This could be implemented as guidance in the review agent's system prompt or as a dedicated CI-review sub-check. The finding severity should be MEDIUM for version divergence and HIGH for missing security patterns (token blanking, permissions).
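
If the dedicated sub-check route is taken, the permissions check and the proposed severity mapping might look something like the following Python sketch. The `Finding` type, the kind labels, the job names, and the sample `permissions` value are all hypothetical and chosen only for illustration. Nothing here reflects the review agent's real architecture, and the source does not say whether the real jobs declare a `permissions` block.

```python
from dataclasses import dataclass

# Severity per finding kind, following the proposal above. The kind names
# are hypothetical labels invented for this sketch.
SEVERITY = {
    "missing_env_override": "HIGH",
    "missing_permissions": "HIGH",
    "older_action_version": "MEDIUM",
}


@dataclass(frozen=True)
class Finding:
    kind: str
    job: str
    detail: str

    @property
    def severity(self) -> str:
        return SEVERITY[self.kind]


def check_permissions(jobs: dict, new_job_id: str) -> list[Finding]:
    """Flag the new job if a sibling declares `permissions` and it does not."""
    if "permissions" in jobs[new_job_id]:
        return []
    siblings = sorted(
        job_id
        for job_id, job in jobs.items()
        if job_id != new_job_id and "permissions" in job
    )
    if not siblings:
        return []
    detail = "sibling jobs with a permissions block: " + ", ".join(siblings)
    return [Finding("missing_permissions", new_job_id, detail)]


# Illustrative workflow only: job names and permission values are made up.
jobs = {
    "existing-job": {"permissions": {"contents": "read"}},
    "new-job": {},
}
results = check_permissions(jobs, "new-job")
for finding in results:
    print(finding.severity, finding.job, finding.detail)
```

Keeping severity in a single lookup table means the MEDIUM/HIGH split can be adjusted without touching the individual checks.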

The specific file to modify depends on the review agent architecture — likely the review agent definition in agents/review.md or a CI-specific review sub-agent if one exists.

Validation criteria

On the next 5 PRs to fullsend-ai/fullsend that add new jobs to existing workflow files, the review agent should:

  1. Flag any missing env variable overrides that are present in sibling jobs (especially token blanking) — expect 0 false negatives on this class of finding.
  2. Flag action version inconsistencies between the new job and sibling jobs — expect at most 1 false negative out of 5.
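
The thresholds above could be tallied with a small helper like the one below. This is only a sketch: it assumes each of the 5 PRs is recorded as a pair of booleans saying whether the agent missed a finding of each class, and it reads "at most 1 false negative out of 5" as at most one PR with a missed version finding. The field names and sample records are invented for illustration.

```python
def meets_criteria(prs):
    """Return True if the recorded PRs satisfy both validation thresholds.

    Each record is a dict with boolean 'missed_env' and 'missed_version'
    fields (hypothetical names), one record per PR.
    """
    env_misses = sum(pr["missed_env"] for pr in prs)
    version_misses = sum(pr["missed_version"] for pr in prs)
    # Zero tolerance for missed env overrides, one allowed version miss.
    return env_misses == 0 and version_misses <= 1


# Made-up sample: five PRs with a single missed version finding.
sample = [
    {"missed_env": False, "missed_version": False},
    {"missed_env": False, "missed_version": True},
    {"missed_env": False, "missed_version": False},
    {"missed_env": False, "missed_version": False},
    {"missed_env": False, "missed_version": False},
]
passed = meets_criteria(sample)
print(passed)
```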

A good test case would be to submit a PR that adds a new workflow job copying the pattern from PR #2063's original version (missing token blanking, older action versions) and verify the review agent catches both issues without human intervention.


Generated by retro agent from #2063
