Improve agent skills: CI-first verification design, PR gates, and early-warning signals #7094

@jstirnaman

Problem

When agents (Claude, Copilot, etc.) are asked to add tests or validation for a change, they default to patterns that already exist in the repo:

  • A Cypress spec gets added to cypress/e2e/content/ because that's where other specs live
  • A Vale rule gets added because Vale is the existing lint layer
  • A pre-commit hook gets added because Lefthook is already wired up

This produces local-only validation that runs on developer machines and blocks commits, but doesn't run as an independent PR gate. When the test is wrong, flaky, or skipped (--no-verify), regressions ship.

Recent example

The PR for #7089 (Enterprise feedback button): the original fix targeted the wrong variable. A visual check of the rendered HTML caught it, but only because a human spot-checked the PR preview. No automated CI gate existed that would have caught this class of bug.

Goals

Agents should design verification for:

  1. PR gates, not just local hooks — treat CI as the primary gate, local hooks as the early-warning layer. Local checks can be skipped (--no-verify); CI cannot.
  2. Early-warning signals in the dev loop — local hooks should give fast feedback, but never be the sole line of defense.
  3. Continuous improvement — when a bug class is discovered, the agent should propose a test that prevents the bug class, not just the specific bug. Structural refactors (e.g., data-driven config replacing conditional logic) should be preferred when they make the bug impossible.
  4. Minimum-viable verification — pick the cheapest reliable test for the assertion. Static grep for static HTML. Cypress for interactive JS. Unit tests for pure functions. Don't reach for the heavy tool out of habit.
  5. Awareness of CI infrastructure — agents should know which CI system runs (.github/workflows vs .circleci/config.yml), which jobs already exist, and which workflow is the right home for a new check. They should prefer extending an existing workflow over creating a new one.
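As a rough illustration of goal 4 ("static grep for static HTML"), the sketch below scans rendered pages for a required pattern. It is written in Python only for illustration; the issue itself suggests a Node script with rg/grep. The `feedback-link` class name and the `pages_missing_pattern` helper are hypothetical, and `public/` is assumed to be the build output directory.

```python
import re
from pathlib import Path

# Hypothetical pattern: an anchor carrying a "feedback-link" class.
# The real selector depends on the site's templates.
REQUIRED = re.compile(r'<a[^>]+class="[^"]*\bfeedback-link\b[^"]*"')


def pages_missing_pattern(public_dir, pattern=REQUIRED):
    """Return rendered HTML files under public_dir that lack the pattern."""
    missing = []
    for page in sorted(Path(public_dir).rglob("*.html")):
        html = page.read_text(encoding="utf-8", errors="replace")
        if not pattern.search(html):
            missing.append(page)
    return missing
```

A CI step could call `pages_missing_pattern("public")` after the build and exit non-zero if the returned list is not empty. No browser is needed, so the check stays cheap.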

Proposed skill additions

Create .claude/skills/ci-verification-design/SKILL.md covering:

Inventory step

Before adding any test, the agent must list existing CI workflows in .github/workflows/ and identify which ones run on PRs. Reference existing jobs as templates.
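The inventory step could be approximated with a small script like the one below. This is a heuristic sketch, not a YAML parser: it treats any non-comment line that mentions `pull_request` or `pull_request_target` as evidence of a PR trigger, so an `if:` condition that references the event name would also match. The `pr_workflows` function name is hypothetical.

```python
import re
from pathlib import Path

PR_TRIGGER = re.compile(r"\bpull_request(_target)?\b")


def pr_workflows(repo_root="."):
    """Return names of workflow files that appear to run on pull requests."""
    workflows_dir = Path(repo_root) / ".github" / "workflows"
    found = []
    for wf in sorted(workflows_dir.glob("*.y*ml")):
        lines = wf.read_text(encoding="utf-8").splitlines()
        # Skip full-line comments so a commented-out trigger is not counted.
        active = [ln for ln in lines if not ln.lstrip().startswith("#")]
        if any(PR_TRIGGER.search(ln) for ln in active):
            found.append(wf.name)
    return found
```

The result is a starting list for the agent to read, not a definitive answer. Each listed workflow should still be opened to confirm its triggers and path filters.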

Test pyramid decision tree

  • Does the behavior involve JS execution or user interaction? → Cypress/Playwright
  • Does the behavior involve rendered HTML from static inputs? → Node script + rg/grep on public/
  • Does the behavior involve pure data transformations or validation? → Unit test or JSON Schema validation
  • Does the behavior involve the build process itself? → Shell script invoked by the build job
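The decision tree above can be sketched as a small function. The `Behavior` fields and the `cheapest_check` name are hypothetical. The order of the questions follows the list above, but the issue does not say which rule wins when several apply, so the precedence here is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """Traits of the behavior under test (hypothetical field names)."""
    needs_js_or_interaction: bool = False
    is_static_rendered_html: bool = False
    is_pure_data: bool = False
    is_build_process: bool = False


def cheapest_check(b):
    """Map a behavior to the verification tool suggested by the decision tree."""
    if b.needs_js_or_interaction:
        return "Cypress/Playwright"
    if b.is_static_rendered_html:
        return "Node script + rg/grep on public/"
    if b.is_pure_data:
        return "Unit test or JSON Schema validation"
    if b.is_build_process:
        return "Shell script invoked by the build job"
    return "No rule matched: describe the behavior before choosing a tool"
```

For example, `cheapest_check(Behavior(is_static_rendered_html=True))` returns the static-HTML option rather than a browser-based test.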

CI gating checklist

  • Does this run on every PR that touches the relevant files?
  • Does it fail fast with a clear error message?
  • Does it post inline PR annotations, using actions/github-script or an equivalent?
  • Does it have a local equivalent (Lefthook hook) for dev-loop speed?
  • Is the CI job named clearly so reviewers understand what it protects?
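One possible "equivalent" for inline annotations is the GitHub Actions `::error` workflow command, which a check script can print to stdout. The sketch below formats such a line. The `error_annotation` helper and the default title are hypothetical, and the escaping rules are assumed to follow the ones used by the official actions toolkit.

```python
def _escape_data(value):
    """Escape characters that would break a workflow command message."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(value):
    """Escape a property value (file, title), which also reserves ':' and ','."""
    return _escape_data(value).replace(":", "%3A").replace(",", "%2C")


def error_annotation(path, line, message, title="Verification failed"):
    """Build a '::error' workflow command that annotates a file and line."""
    props = (
        f"file={_escape_property(path)},"
        f"line={line},"
        f"title={_escape_property(title)}"
    )
    return f"::error {props}::{_escape_data(message)}"
```

Printing the returned string from a CI step, and then exiting non-zero, gives reviewers the failure on the relevant line of the diff as well as in the job log.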

Refactor-before-test principle

If the bug was caused by a conditional or string-munge pattern, propose a data-driven refactor first. Tests should protect against regression of the refactored architecture, not the broken original.
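A minimal sketch of what a data-driven refactor might look like, assuming the conditional in question chose a feedback URL per product. The product keys, the URLs, and both function names are hypothetical and do not reflect the repo's actual config.

```python
# Hypothetical config: one entry per product replaces an if/elif chain.
FEEDBACK_URLS = {
    "product-a": "https://example.com/issues/new",
    "product-b": "https://example.com/support",
}


def feedback_url(product):
    """Look up the feedback URL; an unknown product is an explicit error."""
    try:
        return FEEDBACK_URLS[product]
    except KeyError:
        raise ValueError(
            f"No feedback URL configured for product {product!r}"
        ) from None


def missing_feedback_config(products):
    """Return products that have no entry in the config, for use as a CI check."""
    return sorted(set(products) - set(FEEDBACK_URLS))
```

With this shape, the regression test protects the config (every product has an entry) rather than the branches of the original conditional.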

Proposed workflow additions

  • Update .github/copilot-instructions.md / CLAUDE.md: when a task adds verification logic, agents must document both local and CI enforcement paths in the PR description.
  • New PR checklist item in .github/pull_request_template.md: "Verification added (CI gate / local hook / manual check) — which?"
  • Audit existing checks: identify which Lefthook-only checks could/should be promoted to PR gates (e.g., Vale runs in CI already; does prettier? does shellcheck? does the product config schema validation?)
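A first pass at the audit could be a simple text search, sketched below. It assumes a tool counts as "in CI" if its name appears anywhere in a workflow file. That is a rough heuristic: it can produce false positives and will miss checks invoked through wrapper scripts. The `local_only_tools` name is hypothetical.

```python
from pathlib import Path


def local_only_tools(tools, repo_root="."):
    """Return tools whose names do not appear in any workflow file."""
    workflows_dir = Path(repo_root) / ".github" / "workflows"
    ci_text = "\n".join(
        wf.read_text(encoding="utf-8").lower()
        for wf in sorted(workflows_dir.glob("*.y*ml"))
    )
    # Case-insensitive substring match; order of the input list is preserved.
    return [tool for tool in tools if tool.lower() not in ci_text]
```

Calling `local_only_tools(["vale", "prettier", "shellcheck"])` from the repo root would give candidate entries for the audit report, each of which still needs manual confirmation.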

Acceptance criteria

  • .claude/skills/ci-verification-design/SKILL.md created with decision tree and inventory guidance
  • CLAUDE.md / AGENTS.md references the new skill
  • PR template prompts contributors (human + agent) to declare verification layers
  • Audit report: list of local-only checks with promotion recommendations
  • Example: the feedback-links check from the #7089 follow-up ("Submit a docs issue button for 3 Enterprise goes to Support site") is implemented as a CI PR gate, not a Lefthook-only hook

Labels: area:ci (continuous integration pipeline: verify, test, validate, publish)