| 1 | +# Agent Experience Evaluation |
| 2 | + |
| 3 | +A GitHub Action that evaluates a repository's AI agent experience quality by scoring its CLAUDE.md, AGENTS.md, and other AI instruction files against a standardized rubric. |
| 4 | + |
| 5 | +## How it works |
| 6 | + |
| 7 | +1. **Discovers** AI instruction files (CLAUDE.md, AGENTS.md, .cursorrules, .cursor/**, etc.) |
| 8 | +2. **Evaluates** them using the Claude Code CLI against a 6-criteria, 100-point rubric |
| 9 | +3. **Validates** the structured JSON report |
| 10 | +4. **Uploads** the report as a GitHub Actions artifact for central collection |
| 11 | + |
| 12 | +## Usage |
| 13 | + |
| 14 | +```yaml |
| 15 | +# .github/workflows/agent-experience-eval.yml |
| 16 | +name: Agent Experience Evaluation |
| 17 | + |
| 18 | +on: |
| 19 | +  schedule: |
| 20 | +    - cron: '0 6 * * 1' # Weekly, Monday 6am UTC |
| 21 | +  push: |
| 22 | +    paths: |
| 23 | +      - 'CLAUDE.md' |
| 24 | +      - '**/CLAUDE.md' |
| 25 | +      - 'AGENTS.md' |
| 26 | +      - '**/AGENTS.md' |
| 27 | +      - '.claude/**' |
| 28 | +      - '**/.claude/**' |
| 29 | +      - '.cursorrules' |
| 30 | +      - '**/.cursorrules' |
| 31 | +      - '.cursor/**' |
| 32 | +      - '**/.cursor/**' |
| 33 | +  workflow_dispatch: {} |
| 34 | + |
| 35 | +jobs: |
| 36 | +  evaluate: |
| 37 | +    runs-on: ubuntu-latest |
| 38 | +    steps: |
| 39 | +      - uses: actions/checkout@v4 |
| 40 | + |
| 41 | +      - name: Install Claude Code |
| 42 | +        run: npm install -g @anthropic-ai/claude-code |
| 43 | + |
| 44 | +      - name: Evaluate Agent Experience |
| 45 | +        uses: Automattic/action-agent-experience-eval@v1 |
| 46 | +        with: |
| 47 | +          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} |
| 48 | +``` |
| 49 | + |
| 50 | +## Inputs |
| 51 | + |
| 52 | +| Input | Required | Default | Description | |
| 53 | +|-------|----------|---------|-------------| |
| 54 | +| `anthropic_api_key` | Yes | — | Anthropic API key for Claude Code CLI | |
| 55 | +| `max_turns` | No | `15` | Maximum turns for the CLI evaluation | |
| 56 | +| `output_path` | No | `agent-experience-eval.json` | Path for the evaluation JSON file | |
| 57 | +| `upload_artifact` | No | `true` | Upload the JSON as a GitHub Actions artifact | |
| 58 | +| `artifact_name` | No | `agent-experience-eval` | Name of the uploaded artifact | |
| 59 | +| `artifact_retention_days` | No | `30` | Days to retain the artifact | |
| 60 | + |
| 61 | +## Outputs |
| 62 | + |
| 63 | +| Output | Description | |
| 64 | +|--------|-------------| |
| 65 | +| `score` | Overall evaluation score (0-100) | |
| 66 | +| `grade` | Letter grade (A, B, C, D, F) | |
| 67 | +| `json_path` | Path to the evaluation JSON file | |
| 68 | +| `has_ai_files` | Whether AI instruction files were found | |
| 69 | + |
| 70 | +## Scoring Rubric |
| 71 | + |
| 72 | +| Criterion | Points | Description | |
| 73 | +|-----------|--------|-------------| |
| 74 | +| Commands/Workflows | 20 | Build, test, lint, deploy commands documented | |
| 75 | +| Architecture Clarity | 20 | Codebase map with directories, modules, data flow | |
| 76 | +| Non-obvious Patterns | 15 | Gotchas, quirks, workarounds, edge cases | |
| 77 | +| Conciseness | 15 | Dense, valuable content with no filler | |
| 78 | +| Currency | 15 | Commands work, file refs accurate, stack current | |
| 79 | +| Actionability | 15 | Copy-paste ready commands, concrete steps | |
| 80 | + |
| 81 | +**Grade scale:** A (90-100), B (70-89), C (50-69), D (30-49), F (0-29) |
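
The grade bands above amount to a threshold lookup. The following is a minimal sketch for illustration only: `grade_for` is a hypothetical helper, not part of the action, and it assumes integer scores in the range 0-100.

```python
def grade_for(score: int) -> str:
    """Map a 0-100 score to a letter grade using the bands in the grade scale."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # Lower bound of each band, checked from highest to lowest.
    for floor, grade in ((90, "A"), (70, "B"), (50, "C"), (30, "D")):
        if score >= floor:
            return grade
    return "F"
```

Under these bands, a score of 72 falls in the B range and a score of 29 in the F range.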
| 82 | + |
| 83 | +## Artifact Schema |
| 84 | + |
| 85 | +The action produces a JSON artifact with this structure: |
| 86 | + |
| 87 | +```json |
| 88 | +{ |
| 89 | +  "version": 1, |
| 90 | +  "score": 72, |
| 91 | +  "grade": "B", |
| 92 | +  "files_found": ["CLAUDE.md", ".cursor/rules/testing.mdc"], |
| 93 | +  "criteria": { |
| 94 | +    "commands_workflows": { "score": 18, "max": 20, "notes": "..." }, |
| 95 | +    "architecture_clarity": { "score": 15, "max": 20, "notes": "..." }, |
| 96 | +    "non_obvious_patterns": { "score": 12, "max": 15, "notes": "..." }, |
| 97 | +    "conciseness": { "score": 10, "max": 15, "notes": "..." }, |
| 98 | +    "currency": { "score": 8, "max": 15, "notes": "..." }, |
| 99 | +    "actionability": { "score": 9, "max": 15, "notes": "..." } |
| 100 | +  }, |
| 101 | +  "issues": ["..."], |
| 102 | +  "recommendations": ["..."] |
| 103 | +} |
| 104 | +``` |
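
A consumer collecting these artifacts may want to sanity-check a report before trusting it. The sketch below is not the action's own validation step (which is not shown here); `check_report` and `RUBRIC_MAX` are hypothetical names. It also assumes that the overall `score` equals the sum of the per-criterion scores, which holds for the example above (18 + 15 + 12 + 10 + 8 + 9 = 72) but is not stated explicitly.

```python
# Per-criterion maxima taken from the Scoring Rubric table.
RUBRIC_MAX = {
    "commands_workflows": 20,
    "architecture_clarity": 20,
    "non_obvious_patterns": 15,
    "conciseness": 15,
    "currency": 15,
    "actionability": 15,
}


def check_report(report: dict) -> list[str]:
    """Return a list of consistency problems; an empty list means none were found."""
    problems = []
    criteria = report.get("criteria", {})
    for name, max_points in RUBRIC_MAX.items():
        entry = criteria.get(name)
        if entry is None:
            problems.append(f"missing criterion: {name}")
            continue
        if entry.get("max") != max_points:
            problems.append(f"{name}: expected max {max_points}, got {entry.get('max')}")
        if not 0 <= entry.get("score", -1) <= max_points:
            problems.append(f"{name}: score {entry.get('score')} outside 0-{max_points}")
    # Assumption: the overall score is the plain sum of the criterion scores.
    total = sum(entry.get("score", 0) for entry in criteria.values())
    if total != report.get("score"):
        problems.append(
            f"criterion scores sum to {total}, but overall score is {report.get('score')}"
        )
    return problems
```

To use it, load the artifact with `json.load` and pass the resulting dictionary to `check_report`; any returned strings describe what looked inconsistent.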
| 105 | + |
| 106 | +## AI Files Detected |
| 107 | + |
| 108 | +The action searches for: |
| 109 | + |
| 110 | +- `CLAUDE.md`, `AGENTS.md`, `AGENTS.override.md` (root and subdirectories) |
| 111 | +- `.cursorrules`, `.windsurfrules`, `.aider.conf.yml`, `.codeiumrc` |
| 112 | +- `.github/copilot-instructions.md` |
| 113 | +- `.claude/**`, `.cursor/**`, `.codex/**` directories |
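
The search list above can be approximated with a short directory walk. This is a sketch under stated assumptions, not the action's actual discovery logic: `find_ai_files` is a hypothetical helper, and it assumes that only the first group of names is matched in subdirectories while the remaining files and directories are checked at the repository root.

```python
from pathlib import Path

# Matched at the root and in any subdirectory (per the first bullet).
ANYWHERE = ("CLAUDE.md", "AGENTS.md", "AGENTS.override.md")
# Assumption: these are checked at the repository root only.
ROOT_FILES = (
    ".cursorrules",
    ".windsurfrules",
    ".aider.conf.yml",
    ".codeiumrc",
    ".github/copilot-instructions.md",
)
# Assumption: these directories are looked up at the root; all files inside count.
ROOT_DIRS = (".claude", ".cursor", ".codex")


def find_ai_files(repo_root: str) -> list[str]:
    """Return sorted, repo-relative POSIX paths of AI instruction files."""
    root = Path(repo_root)
    found: set[Path] = set()
    for name in ANYWHERE:
        found.update(p for p in root.rglob(name) if p.is_file())
    for rel in ROOT_FILES:
        if (root / rel).is_file():
            found.add(root / rel)
    for directory in ROOT_DIRS:
        base = root / directory
        if base.is_dir():
            found.update(p for p in base.rglob("*") if p.is_file())
    return sorted(p.relative_to(root).as_posix() for p in found)
```

A set is used so that a file matched by more than one rule (for example a `CLAUDE.md` inside `.claude/`) is reported once.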
| 114 | + |
| 115 | +## Prerequisites |
| 116 | + |
| 117 | +The Claude Code CLI (`@anthropic-ai/claude-code`) must be installed on the runner before this action runs. See the usage example above. |