Merge pull request #106 from NickBorgersProbably/fix/claude-credentia… #65

Workflow file for this run

.github/workflows/review-coverage-evaluator.yml at c5eae6e

	name: Review Coverage Evaluator

	# Post-merge analysis: Evaluates whether the review pipeline had adequate coverage
	# for the PR that was just merged. If a gap is found, creates a GitHub issue proposing
	# a new reviewer or changes to an existing one.
	#
	# DESIGN NOTES:
	# - Runs post-merge on main as a background task; never slows PR reviews
	# - Strong bias toward NO ACTION - adding review steps is expensive
	# - Analyzes both code changes AND agent review comments to find gaps
	# - Uses Opus for strong reasoning to avoid false positives
	# - Creates issues with "review-pipeline" label for trackability

	on:
	push:
	branches: [main]

	# Prevent duplicate evaluations for rapid merges
	concurrency:
	group: review-coverage-evaluator-${{ github.sha }}
	cancel-in-progress: false

	env:
	DEVCONTAINER_IMAGE: ghcr.io/nickborgersprobably/hide-my-list-devcontainer

	jobs:
	# Find the PR that was just merged from the push commit
	get-pr-context:
	runs-on: ubuntu-latest
	outputs:
	pr_number: ${{ steps.find-pr.outputs.pr_number }}
	steps:
	- name: Find merged PR from commit
	id: find-pr
	env:
	GH_TOKEN: ${{ github.token }}
	run: \|
	# Find the PR associated with this merge commit
	PR_NUMBER=$(gh api repos/${{ github.repository }}/commits/${{ github.sha }}/pulls \
	--jq '.[0].number // empty' 2>/dev/null \|\| echo "")

	if [ -z "$PR_NUMBER" ]; then
	echo "No PR found for commit ${{ github.sha }} - this may be a direct push"
	echo "pr_number=" >> $GITHUB_OUTPUT
	else
	echo "Found merged PR #$PR_NUMBER"
	echo "pr_number=$PR_NUMBER" >> $GITHUB_OUTPUT
	fi

	# Build and cache devcontainer image (same pattern as other workflows)
	build-devcontainer:
	runs-on: [self-hosted, homelab]
	needs: get-pr-context
	if: needs.get-pr-context.outputs.pr_number != ''
	permissions:
	contents: read
	packages: write

	steps:
	- name: Checkout repository
	uses: actions/checkout@v4

	- name: Log in to GHCR
	uses: docker/login-action@v3
	with:
	registry: ghcr.io
	username: ${{ github.actor }}
	password: ${{ secrets.GITHUB_TOKEN }}

	- name: Build and push devcontainer
	uses: devcontainers/ci@v0.3
	with:
	imageName: ${{ env.DEVCONTAINER_IMAGE }}
	cacheFrom: ${{ env.DEVCONTAINER_IMAGE }}
	push: always

	# Run Claude to evaluate review coverage
	evaluate-coverage:
	needs: [get-pr-context, build-devcontainer]
	if: needs.get-pr-context.outputs.pr_number != ''
	runs-on: [self-hosted, homelab]
	permissions:
	contents: read
	issues: write
	packages: read

	steps:
	- name: Checkout repository
	uses: actions/checkout@v4
	with:
	fetch-depth: 0

	- name: Log in to GHCR
	uses: docker/login-action@v3
	with:
	registry: ghcr.io
	username: ${{ github.actor }}
	password: ${{ secrets.GITHUB_TOKEN }}

	- name: Evaluate review coverage with Claude
	uses: devcontainers/ci@v0.3
	with:
	imageName: ${{ env.DEVCONTAINER_IMAGE }}
	cacheFrom: ${{ env.DEVCONTAINER_IMAGE }}
	push: never
	env: \|
	CLAUDE_CODE_OAUTH_TOKEN=${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
	GH_TOKEN=${{ secrets.WORKFLOW_PAT }}
	PR_NUMBER=${{ needs.get-pr-context.outputs.pr_number }}
	REPO=${{ github.repository }}
	runCmd: \|
	claude --print \
	--verbose \
	--output-format stream-json \
	--model opus \
	--dangerously-skip-permissions \
	--max-turns 150 \
	"You are a REVIEW PIPELINE COVERAGE EVALUATOR for ${REPO}.

	PR #${PR_NUMBER} has been merged to main. Your job is to analyze whether the
	existing review pipeline adequately covered this PR, or whether there is a
	meaningful gap that warrants proposing a new review step or changes to an
	existing one.

	THE EXISTING REVIEW PIPELINE (3 reviewers):
	1. Design Review - validates PR implements issue intent, reviews design quality, checks doc consistency
	2. Security & Infrastructure Review - script safety, credential handling, workflow permissions
	3. Psych Research Review - evaluates user-facing changes against ADHD research literature

	YOUR TASK:
	1. Read the full PR diff: \`gh pr diff ${PR_NUMBER}\`
	2. Read ALL comments on the PR (including agent review comments):
	\`gh pr view ${PR_NUMBER} --comments\`
	Also fetch review comments on specific lines:
	\`gh api repos/${REPO}/pulls/${PR_NUMBER}/comments --jq '.[] \| \"\(.user.login) on \(.path):\(.line):\n\(.body)\n---\"'\`
	3. Read the PR description: \`gh pr view ${PR_NUMBER}\`
	4. Analyze whether the 3 existing reviewers adequately covered the changes

	WHAT TO LOOK FOR:
	- Categories of issues that none of the 3 reviewers are equipped to catch
	- Patterns in agent comments suggesting a reviewer was out of its depth
	(e.g., a code reviewer trying to comment on security concerns it can't deeply analyze)
	- Recurring blind spots across multiple PRs (check recent closed PRs if helpful:
	\`gh pr list --state merged --limit 5 --json number,title\`)
	- Types of code changes that fall between reviewer specializations

	CRITICAL: STRONG BIAS TOWARD NO ACTION.
	Adding a new review step is EXPENSIVE:
	- It costs real money (LLM API calls) on every single PR
	- It adds latency to the review pipeline
	- It increases complexity and maintenance burden
	- It risks creating noise that causes developers to ignore reviews

	You should only propose a new reviewer or change if:
	- There is a CLEAR, REPEATED gap (not a one-off edge case)
	- The gap represents a category of bugs that could reach production
	- None of the existing 3 reviewers can reasonably be extended to cover it
	- The cost/benefit ratio clearly favors adding the step

	MOST OF THE TIME, the correct answer is: no gap found, no action needed.

	IF NO GAP IS FOUND (expected most of the time):
	Simply output: 'Review coverage evaluation complete for PR #${PR_NUMBER}. No gaps identified. The existing 3-reviewer pipeline adequately covered this PR.'
	Then exit. Do NOT create an issue.

	IF A GENUINE GAP IS FOUND:
	1. First, ensure the 'review-pipeline' label exists:
	gh label create \"review-pipeline\" --color \"d93f0b\" --description \"Review pipeline improvement proposals\" 2>/dev/null \|\| true
	2. Create a GitHub issue:
	gh issue create \\
	--title \"Review Pipeline Gap: <brief description of the gap>\" \\
	--assignee NickBorgers \\
	--label \"review-pipeline\" \\
	--body \"\$(cat <<'ISSUE_BODY'
	## Review Pipeline Coverage Gap

	Identified from: PR #${PR_NUMBER}

	### Gap Description
	[What category of issues is not being caught by the current 3-reviewer pipeline?]

	### Evidence
	[Specific examples from the PR diff and/or agent review comments that demonstrate the gap]

	### Proposal
	[Either: a new reviewer specification, OR changes to an existing reviewer's prompt]

	### Cost/Benefit Analysis
	- Cost: [Estimated additional time/money per PR]
	- Benefit: [What category of bugs this would catch]
	- Alternative considered: [Why extending an existing reviewer won't work]

	---
	Generated by Review Coverage Evaluator
	ISSUE_BODY
	)\"

	Remember: When in doubt, do NOT create an issue. False positives erode trust
	in the pipeline evaluation system." < /dev/null 2>&1 \| tee /tmp/coverage-evaluator-output.jsonl

	- name: Upload evaluator output
	uses: actions/upload-artifact@v4
	if: always()
	continue-on-error: true
	with:
	name: coverage-evaluator-output
	path: /tmp/coverage-evaluator-output.jsonl
	retention-days: 7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge pull request #106 from NickBorgersProbably/fix/claude-credentia… #65

Workflow file

Merge pull request #106 from NickBorgersProbably/fix/claude-credentia… #65

Uh oh!

Workflow file for this run