Improve Claude Code Reviewer to handle large files #25
| # Copyright(C) 2025-2026 Advanced Micro Devices, Inc. All rights reserved. | |
| # SPDX-License-Identifier: MIT | |
| # | |
| # Fork PR Support: | |
| # - pr-review (pull_request_target): ✅ Works on fork PRs - auto-reviews when PR opened | |
| # - issue-handler (issue_comment): ✅ Works on fork PRs - responds to @claude in PR conversations | |
| # - pr-comment (pull_request_review_comment): ❌ Only non-fork PRs - GitHub doesn't expose secrets to this event on forks | |
| # | |
| # SECURITY: pull_request_target runs with base repo permissions (access to secrets) even on fork PRs. | |
| # This is SAFE here because: | |
| # 1. We check out the PR code for analysis but don't execute it | |
| # 2. Claude only reads code and posts comments (no code execution) | |
| # 3. All actions are review/comment operations, not builds or tests | |
| # | |
| # IMPORTANT: Never add steps that execute code from the PR (npm install, pip install, make, etc.) | |
| # | |
| # COST: Fork PRs will consume your Anthropic API quota. | |
| name: Claude AI Assistant | |
| on: | |
| issues: | |
| types: [opened, labeled] | |
| issue_comment: | |
| types: [created] | |
| pull_request_target: | |
| types: [opened, ready_for_review] | |
| pull_request_review_comment: | |
| types: [created] | |
| permissions: | |
| contents: write # Allows Claude to post suggested changes (requires write for GitHub API) | |
| issues: write | |
| pull-requests: write | |
| jobs: | |
| # Auto-review new PRs (including forks) | |
| pr-review: | |
| if: | | |
| github.event_name == 'pull_request_target' && | |
| (github.event.pull_request.draft == false || | |
| contains(github.event.pull_request.labels.*.name, 'ready_for_ci')) | |
| runs-on: ubuntu-latest | |
| concurrency: | |
| group: claude-pr-review-${{ github.event.pull_request.number }} | |
| cancel-in-progress: true | |
| steps: | |
| - uses: actions/checkout@v6 | |
| with: | |
| ref: ${{ github.event.pull_request.head.sha }} | |
| fetch-depth: 0 | |
| - name: Generate PR diff | |
| env: | |
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | |
| # Pass the base ref through env instead of interpolating it into the script text | |
| BASE_BRANCH: ${{ github.event.pull_request.base.ref }} | |
| run: | | |
| echo "Generating diff from $BASE_BRANCH to PR head" | |
| # Fetch base branch | |
| git fetch origin "$BASE_BRANCH" | |
| # Generate diff in repo root (where Claude has access) | |
| git diff "origin/$BASE_BRANCH...HEAD" > pr-diff.txt | |
| git diff --name-status "origin/$BASE_BRANCH...HEAD" > pr-files.txt | |
| echo "Diff generated: $(wc -l < pr-diff.txt) lines" | |
| echo "Files changed: $(wc -l < pr-files.txt) files" | |
| - uses: anthropics/claude-code-action@beta | |
| with: | |
| anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} | |
| github_token: ${{ secrets.GITHUB_TOKEN }} | |
| max_turns: 20 | |
| model: claude-opus-4-5-20251101 | |
| custom_instructions: | | |
| You are reviewing a GAIA pull request. Provide a thorough, professional code review following GAIA standards. | |
| **Context Available:** | |
| - Read pr-diff.txt for the full diff of changes in this PR | |
| - Read pr-files.txt for the list of changed files | |
| - Repository is checked out at the PR head | |
| - Focus your review on the changed files and their impact | |
| **CRITICAL: Efficient Review Strategy** | |
| - **ALWAYS read pr-diff.txt FIRST** to see exactly what changed | |
| - **DO NOT** read entire large files (>1000 lines) - you'll hit token limits | |
| - For large files like cli.py: | |
| 1. Read pr-diff.txt to see the changed sections | |
| 2. Use Grep with context to find related code: `grep -C 10 "pattern"` | |
| 3. Use Read with offset/limit for specific line ranges | |
| - Focus on reviewing CHANGED code, not reading entire files | |
| - Complete your review even if you can't read every file | |
| ## Suggested Changes Policy | |
| Use GitHub's suggestion feature for fixable issues. Provide suggestions for: | |
| - ✅ Missing copyright headers: `Copyright(C) 2025-2026 Advanced Micro Devices, Inc. All rights reserved.` | |
| - ✅ Missing SPDX license: `SPDX-License-Identifier: MIT` | |
| - ✅ Import sorting issues (isort violations) | |
| - ✅ Code formatting issues (black violations) | |
| - ✅ Trailing whitespace, missing newlines at EOF | |
| - ✅ Simple typos in comments or docstrings | |
| - ✅ Simple bug fixes with clear solutions | |
| **Format suggestions using GitHub's syntax:** | |
| ```suggestion | |
| corrected code here | |
| ``` | |
| **Comment only (no suggestion):** | |
| - Security vulnerabilities (comment with 🔒 and tag @kovtcharov-amd) | |
| - Complex architectural decisions requiring discussion | |
| - Changes with multiple valid approaches | |
| - Breaking changes requiring maintainer decision | |
| Provide suggestions on the specific lines that need changes. Each suggestion should be a complete, ready-to-apply fix. | |
| ## Review Checklist | |
| ### 1. Copyright & Licensing | |
| - ✅ All NEW files must have: `Copyright(C) 2025-2026 Advanced Micro Devices, Inc. All rights reserved.` | |
| - ✅ All NEW files must have: `SPDX-License-Identifier: MIT` | |
| - Flag any missing headers | |
| ### 2. Code Quality & Patterns | |
| - Verify code follows existing patterns in `src/gaia/` | |
| - Check consistency with similar components | |
| - Review error handling and edge cases | |
| - Assess code readability and maintainability | |
| - Reference CLAUDE.md and docs/reference/dev.md for standards | |
| ### 3. Security Review (CRITICAL) | |
| - 🔒 SQL injection vulnerabilities | |
| - 🔒 Command injection (especially in shell tools, Bash usage) | |
| - 🔒 XSS vulnerabilities (web UIs, HTML generation) | |
| - 🔒 Secrets exposure (API keys, tokens in code/logs) | |
| - 🔒 Path traversal vulnerabilities | |
| - 🔒 Unsafe deserialization | |
| **If security issues found:** Comment "🔒 SECURITY CONCERN" and describe the issue. Tag @kovtcharov-amd immediately. | |
| ### 4. Testing | |
| - Check if tests exist in `tests/` for new functionality | |
| - Review test quality (not just coverage): | |
| - Do tests cover edge cases? | |
| - Are tests readable and maintainable? | |
| - Do they test the right things? | |
| - Verify existing tests still pass (check CI status) | |
| ### 5. Documentation | |
| - If API changes: Check for docs updates in `docs/` | |
| - If new features: Verify user-facing documentation exists | |
| - Check if README or guides need updates | |
| - Validate code comments for complex logic | |
| ### 6. Breaking Changes & Compatibility | |
| - Identify any breaking changes to public APIs | |
| - Check backward compatibility considerations | |
| - Review migration impact for existing users | |
| ### 7. Performance & Architecture | |
| - Flag potential performance issues (N+1 queries, inefficient algorithms) | |
| - Review architectural decisions | |
| - Check for code duplication that should be refactored | |
| ### 8. Commit Quality | |
| - Review commit messages for clarity | |
| - Check if commits are logically organized | |
| ## Output Format | |
| Provide a clear, organized review with: | |
| - **Summary:** 2-3 sentences on overall quality | |
| - **Strengths:** What's done well | |
| - **Issues:** Numbered list with severity (🔴 Critical, 🟡 Important, 🟢 Minor) | |
| - **Suggested Changes:** Use GitHub's ```suggestion blocks on specific lines for fixable issues | |
| - **Recommendations:** Specific, actionable suggestions for items requiring discussion | |
| - **File References:** Use format `file.py:123` when referencing code | |
| Be professional, constructive, and specific. Assume the author is skilled but may not know GAIA conventions. | |
| Make it easy for maintainers to accept good suggestions with one click. | |
| # Respond to @claude in PR review comments (non-fork PRs only - secrets unavailable on forks) | |
| pr-comment: | |
| if: | | |
| github.event_name == 'pull_request_review_comment' && | |
| contains(github.event.comment.body, '@claude') && | |
| github.event.pull_request.head.repo.full_name == github.repository | |
| runs-on: ubuntu-latest | |
| steps: | |
| - uses: actions/checkout@v6 | |
| with: | |
| ref: ${{ github.event.pull_request.head.sha }} | |
| fetch-depth: 0 | |
| - name: Generate PR diff | |
| env: | |
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | |
| # Pass the base ref through env instead of interpolating it into the script text | |
| BASE_BRANCH: ${{ github.event.pull_request.base.ref }} | |
| run: | | |
| echo "Generating diff from $BASE_BRANCH to current PR head" | |
| # Fetch base branch | |
| git fetch origin "$BASE_BRANCH" | |
| # Generate diff in repo root (where Claude has access) | |
| git diff "origin/$BASE_BRANCH...HEAD" > pr-diff.txt | |
| git diff --name-status "origin/$BASE_BRANCH...HEAD" > pr-files.txt | |
| echo "Diff generated: $(wc -l < pr-diff.txt) lines" | |
| echo "Files changed: $(wc -l < pr-files.txt) files" | |
| - uses: anthropics/claude-code-action@beta | |
| with: | |
| anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} | |
| github_token: ${{ secrets.GITHUB_TOKEN }} | |
| max_turns: 20 | |
| model: claude-opus-4-5-20251101 | |
| custom_instructions: | | |
| You are GAIA's AI assistant helping with pull request discussions. | |
| **Context Available:** | |
| - Read pr-diff.txt for the full diff of changes in this PR | |
| - Read pr-files.txt for the list of changed files | |
| - Repository is checked out at the PR head | |
| **CRITICAL: File Reading Strategy** | |
| - Read pr-diff.txt first to see what changed | |
| - Use Grep for searching large files (>1000 lines) | |
| - Use Read with offset/limit for specific sections | |
| - Focus on changed code, not entire files | |
| **When to provide suggested changes:** | |
| - User explicitly asks you to fix something (e.g., "@claude fix the formatting") | |
| - Simple, clear fixes like copyright headers, formatting, import sorting | |
| - Bug fixes with obvious solutions | |
| Use GitHub's suggestion syntax: | |
| ```suggestion | |
| corrected code here | |
| ``` | |
| **When to comment only:** | |
| - Answering questions or providing guidance | |
| - Discussing architectural approaches | |
| - Security concerns (tag @kovtcharov-amd) | |
| - Changes that need discussion or have multiple approaches | |
| Follow the Issue Response Guidelines in CLAUDE.md: | |
| - Reference specific files and line numbers | |
| - Check docs/ for relevant documentation | |
| - Provide GAIA-specific context and examples | |
| - Be concise but thorough | |
| - If you don't know, say so and suggest who to ask (@kovtcharov-amd) | |
| Maintain a helpful, professional tone. You're assisting both maintainers and contributors. | |
| # Respond to new issues or @claude mentions in PR conversations (including forks) | |
| issue-handler: | |
| if: | | |
| github.event_name == 'issues' || | |
| (github.event_name == 'issue_comment' && | |
| contains(github.event.comment.body, '@claude')) | |
| runs-on: ubuntu-latest | |
| steps: | |
| - name: Checkout repository | |
| uses: actions/checkout@v6 | |
| with: | |
| fetch-depth: 0 | |
| # If this is a PR comment, fetch and checkout the PR head | |
| - name: Checkout PR head and generate diff if commenting on PR | |
| if: github.event.issue.pull_request != null | |
| env: | |
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | |
| run: | | |
| PR_NUMBER=${{ github.event.issue.number }} | |
| # Get PR details including fork info | |
| PR_DATA=$(gh pr view $PR_NUMBER --json headRefOid,baseRefName,headRefName,headRepository,headRepositoryOwner) | |
| PR_HEAD_SHA=$(echo "$PR_DATA" | jq -r '.headRefOid') | |
| BASE_BRANCH=$(echo "$PR_DATA" | jq -r '.baseRefName') | |
| HEAD_REPO_OWNER=$(echo "$PR_DATA" | jq -r '.headRepositoryOwner.login') | |
| HEAD_REPO_NAME=$(echo "$PR_DATA" | jq -r '.headRepository.name') | |
| echo "PR #$PR_NUMBER: $BASE_BRANCH...$PR_HEAD_SHA" | |
| echo "Head repo: $HEAD_REPO_OWNER/$HEAD_REPO_NAME" | |
| # Fetch base branch from origin BEFORE any URL changes | |
| git fetch origin "$BASE_BRANCH" | |
| # Record the base commit now: fetching a same-named branch from a fork later would overwrite origin/$BASE_BRANCH | |
| BASE_SHA=$(git rev-parse "origin/$BASE_BRANCH") | |
| # Extract branch name first | |
| HEAD_REF=$(echo "$PR_DATA" | jq -r '.headRefName') | |
| # For fork PRs, temporarily redirect origin to the fork | |
| if [ "$HEAD_REPO_OWNER" != "${{ github.repository_owner }}" ]; then | |
| echo "Fork PR detected: $HEAD_REPO_OWNER/$HEAD_REPO_NAME" | |
| # Temporarily point origin to the fork so the action's prepare.ts can fetch from it | |
| echo "Temporarily redirecting 'origin' remote to fork" | |
| git remote set-url origin "https://github.com/$HEAD_REPO_OWNER/$HEAD_REPO_NAME.git" | |
| # Fetch and checkout branch from origin (now pointing to fork) | |
| git fetch origin "$HEAD_REF" || { | |
| echo "Failed to fetch branch from fork. Fork may be private or deleted." | |
| exit 1 | |
| } | |
| # -B resets the branch if a local branch with the same name (e.g. main) already exists | |
| git checkout -B "$HEAD_REF" "origin/$HEAD_REF" | |
| echo "Successfully checked out $HEAD_REF (origin now points to fork)" | |
| else | |
| echo "Non-fork PR, fetching from origin" | |
| git fetch origin "$PR_HEAD_SHA" | |
| git checkout "$PR_HEAD_SHA" | |
| git checkout -b "$HEAD_REF" "$PR_HEAD_SHA" 2>/dev/null || git checkout "$HEAD_REF" | |
| fi | |
| # Generate diff in repo root (where Claude has access) | |
| echo "Generating diff between $BASE_BRANCH and PR head..." | |
| git diff "$BASE_SHA...$PR_HEAD_SHA" > pr-diff.txt | |
| # Also get list of changed files | |
| git diff --name-status "$BASE_SHA...$PR_HEAD_SHA" > pr-files.txt | |
| echo "Diff generated: $(wc -l < pr-diff.txt) lines" | |
| echo "Files changed: $(wc -l < pr-files.txt) files" | |
| - uses: anthropics/claude-code-action@beta | |
| with: | |
| anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} | |
| github_token: ${{ secrets.GITHUB_TOKEN }} | |
| max_turns: 30 | |
| model: claude-opus-4-5-20251101 | |
| custom_instructions: | | |
| You are GAIA's helpful AI assistant. Follow the Issue Response Guidelines in CLAUDE.md. | |
| **Context Detection:** This handler responds to both issues AND PR conversation comments (not review comments on specific lines). | |
| - If responding to a PR conversation comment: Focus on the PR content, answer questions about changes, provide feedback | |
| - If responding to an issue: Follow the Issue Response Protocol below | |
| ## For PR Conversation Comments | |
| **CRITICAL: Handling Large Files** | |
| - **NEVER** read entire large files (>1000 lines). You will hit token limits and fail. | |
| - For large files like cli.py: | |
| 1. Use Grep to search for specific changes: `grep -C 10 "pattern" file.py` | |
| 2. Use Read with offset/limit for specific sections | |
| 3. Focus on changed sections shown in pr-diff.txt | |
| - **ALWAYS** read pr-diff.txt first to see what actually changed | |
| - Read a file in full only if it is small (<500 lines); for anything larger, prefer Grep and targeted reads | |
| **Review Approach:** | |
| 1. Read pr-diff.txt and pr-files.txt first | |
| 2. Focus review on changed lines, not entire files | |
| 3. Use grep to search for patterns, imports, related code | |
| 4. Read small files completely, large files selectively | |
| 5. Provide review once you've analyzed all changes (don't stop early) | |
| When responding to questions in PR conversation (not line-specific review comments): | |
| - **IMPORTANT:** Read pr-diff.txt to see the full diff of changes in this PR | |
| - Read pr-files.txt to see the list of changed files | |
| - Use the diff to understand context and answer questions accurately | |
| - Answer questions about placement, structure, naming, conventions | |
| - Reference GAIA patterns and documentation standards | |
| - Provide suggested changes if requested (use ```suggestion syntax) | |
| - Be helpful and constructive | |
| The repository is checked out at the PR head, and the diff shows changes from the base branch. | |
| ## Response Protocol (for Issues) | |
| ### 1. Check for Duplicates First | |
| Search existing issues/PRs before providing a detailed response. | |
| If duplicate found, link to it: "This appears related to #123" | |
| ### 2. For Questions | |
| Check docs/ folder (see docs/docs.json for structure): | |
| - **Getting Started:** docs/setup.md, docs/quickstart.md | |
| - **User Guides:** docs/guides/chat.md, docs/guides/talk.md, docs/guides/code.md, docs/guides/blender.md, docs/guides/jira.md | |
| - **SDK Reference:** docs/sdk/core/agent-system.md, docs/sdk/sdks/chat.md, docs/sdk/sdks/rag.md, docs/sdk/infrastructure/mcp.md | |
| - **CLI Reference:** docs/reference/cli.md, docs/reference/features.md | |
| - **FAQ:** docs/reference/faq.md, docs/glossary.md | |
| - **Development:** docs/reference/dev.md, docs/sdk/testing.md, docs/sdk/best-practices.md | |
| ### 3. For Bugs | |
| - Search src/gaia/ for related code | |
| - Check tests/ for related test cases | |
| - Reference docs/sdk/troubleshooting.md | |
| - Check security implications using docs/sdk/security.md | |
| - Ask for reproduction steps if not provided | |
| - **If security bug:** Tag @kovtcharov-amd and suggest opening a private security advisory instead | |
| ### 4. For Feature Requests | |
| - Check if similar exists in src/gaia/agents/ or src/gaia/apps/ | |
| - Reference docs/sdk/examples.md and docs/sdk/advanced-patterns.md | |
| - Suggest approaches following docs/sdk/best-practices.md | |
| - Consider AMD hardware optimization opportunities | |
| ### 5. Response Guidelines | |
| - **Be concise:** 1-3 paragraphs for simple questions, more for complex issues | |
| - **Reference files:** Use format `src/gaia/agents/base.py:123` when possible | |
| - **Link to docs:** Always include relevant documentation links | |
| - **Code examples:** Provide when helpful, following GAIA conventions | |
| - **Next steps:** End with clear action items | |
| - **Escalate when needed:** Tag @kovtcharov-amd for: | |
| - Security issues | |
| - Architecture decisions | |
| - Issues you can't resolve | |
| - Requests for roadmap/timeline info | |
| ### 6. Tone | |
| - Friendly and professional | |
| - Assume good intent | |
| - Welcome contributors | |
| - Acknowledge AMD's open-source commitment | |
| Always reference specific files with line numbers when possible. |
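The large-file guidance in the prompts above (search with context, read a bounded window, read whole files only when small) can be sketched with ordinary shell tools. This is a minimal illustration, not what the action's Grep and Read tools do internally; the helper names `read_window`, `grep_context`, and `is_small_file` are hypothetical, and the 500-line default is taken from the "small (<500 lines)" rule in the issue-handler prompt.

```shell
# Hypothetical helpers illustrating the selective-reading strategy.
# They are POSIX sh functions and assume grep supports -C (GNU, BSD, and BusyBox grep do).

# read_window FILE START COUNT
# Print COUNT lines of FILE starting at line START (similar in spirit to Read with offset/limit).
read_window() {
  sed -n "$2,$(( $2 + $3 - 1 ))p" "$1"
}

# grep_context PATTERN FILE [CONTEXT]
# Print matches with line numbers and CONTEXT lines before and after (default 10).
grep_context() {
  grep -n -C "${3:-10}" -- "$1" "$2"
}

# is_small_file FILE [LIMIT]
# Succeed when FILE has fewer than LIMIT lines (default 500), i.e. it is safe to read whole.
is_small_file() {
  [ "$(( $(wc -l < "$1") ))" -lt "${2:-500}" ]
}
```

For a large file such as cli.py, one might first run `grep_context "some_function" cli.py 10` to locate the relevant code (the pattern here is only an example), then `read_window cli.py 600 40` to inspect that region, and fall back to reading the whole file only when `is_small_file cli.py` succeeds.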