a-oren/multi-agent-reviewer

Multi‑Agent PR Reviewer

An opinionated, LLM‑powered pull request triage and review service. It ingests GitHub PR webhooks, fetches the diff and changed files, then runs a small LangGraph workflow of focused “agents” to produce a concise, actionable review in Markdown.

  • Stack: FastAPI, LangChain, LangGraph, Groq (Llama 3.x), httpx
  • Agents: Router → Diff Analyst → Standards → Test → Reviewer
  • Output: A single structured Markdown review, currently logged and returned as a preview in the API response. Posting it back to GitHub is not implemented yet.

Features

  • Webhook endpoint for GitHub pull_request events
  • GitHub API integration to fetch unified diff and changed files (auth optional for public repos)
  • Multi‑agent workflow (LangGraph) specialized for:
    • Diff summarization and risk detection
    • Repository standards/guidelines checks
    • Test coverage and missing test suggestions
    • Final synthesized review with actionable items
  • Configurable limits via environment variables to keep within model context

How it works

  1. GitHub sends a pull_request webhook (opened, synchronize, or ready_for_review) to this service.
  2. The service fetches the PR diff and the list of changed files from the GitHub API (using GITHUB_TOKEN if provided).
  3. A LangGraph pipeline runs:
    • Router infers domains (backend/frontend/infra/docs)
    • Diff Analyst summarizes changes and risks
    • Standards checks against constants.REPO_GUIDELINES
    • Test Agent suggests missing tests
    • Reviewer synthesizes a final Markdown review
  4. The review is logged and returned as a preview in the API response. Posting it as a comment on the PR is not implemented yet, but is a natural next step.
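
The agent order in step 3 can be illustrated without any LLM or LangGraph dependency. The sketch below is not the code in review_core.py: it assumes a plain dict as shared state, replaces the Router's inference with a hypothetical file-extension heuristic, and uses stub agents that only record that they ran. Names such as run_pipeline and stub_agent are made up for illustration.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


def router(state: State) -> State:
    # Hypothetical heuristic: guess domains from file paths (no LLM call).
    domains = set()
    for path in state["changed_files"]:
        if path.endswith((".md", ".rst")):
            domains.add("docs")
        elif path.endswith((".ts", ".tsx", ".js", ".css")):
            domains.add("frontend")
        elif path.endswith((".tf", ".yml", ".yaml")):
            domains.add("infra")
        else:
            domains.add("backend")
    return {**state, "domains": sorted(domains)}


def stub_agent(key: str) -> Callable[[State], State]:
    # Stand-in for an LLM-backed agent: writes a placeholder under `key`.
    def run(state: State) -> State:
        count = len(state["changed_files"])
        return {**state, key: f"[{key} placeholder for {count} file(s)]"}
    return run


# Same order as the workflow: Router, Diff Analyst, Standards, Test, Reviewer.
PIPELINE: List[Callable[[State], State]] = [
    router,
    stub_agent("diff_summary"),
    stub_agent("standards"),
    stub_agent("tests"),
    stub_agent("review"),
]


def run_pipeline(state: State) -> State:
    # Each step receives the accumulated state and returns an extended copy.
    for step in PIPELINE:
        state = step(state)
    return state
```

Passing each agent the accumulated state is what lets the final Reviewer step see every earlier section when it synthesizes the review.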

Repository layout

  • server.py – FastAPI app exposing GET / and POST /webhook/github
  • review_core.py – Multi‑agent orchestration and public API (run_pr_workflow, generate_pr_review)
  • agents/ – Individual agents (router.py, diff_analyst.py, standards.py, test_agent.py, reviewer.py, types.py, utils.py)
  • llm_client.py – Groq client setup and _call_llm helper
  • constants.py – Repository guidelines (REPO_GUIDELINES) used by the Standards Agent
  • text_utils.py – Helpers to truncate large inputs
  • requirements.txt – Python dependencies

Prerequisites

  • Python 3.10+
  • A Groq API key (required)
  • Optionally a GitHub token (recommended for higher rate limits and private repos)

Installation

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Create a .env file in the project root:

cp .env.example .env  # only if a .env.example exists; otherwise create .env by hand

Only GROQ_API_KEY is required; the other entries are optional:

GROQ_API_KEY=your_groq_api_key_here
# Optional overrides:
GROQ_MODEL=llama-3.1-8b-instant
GITHUB_TOKEN=ghp_...           # optional but recommended

Configuration

  • LLM:
    • GROQ_API_KEY (required)
    • GROQ_MODEL (default: llama-3.1-8b-instant)
  • GitHub:
    • GITHUB_TOKEN (optional) – used for diff/files fetching; unauthenticated fallback is attempted for public repos
  • Truncation / sizing knobs:
    • MAX_DIFF_CHARS (default: 20000) – trim incoming PR diff before the workflow
    • MAX_CHANGED_FILES (default: 200) – limit number of file paths passed to agents
    • AGENT_DIFF_MAX_CHARS – per‑agent diff trim (agents use their own sensible defaults)
    • REVIEWER_SECTION_MAX_CHARS – per‑section cap inside Reviewer

You can also customize repository guidelines by editing constants.py (REPO_GUIDELINES).
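
As a rough illustration of how the sizing knobs might be applied, the sketch below reads MAX_DIFF_CHARS and MAX_CHANGED_FILES with the documented defaults and trims the inputs. The helper names (env_int, truncate_text, prepare_inputs) and the truncation marker are assumptions; the actual helpers in text_utils.py may differ.

```python
import os
from typing import List, Tuple


def env_int(name: str, default: int) -> int:
    # Fall back to the default when the variable is unset, empty, or not a number.
    raw = os.environ.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        return int(raw)
    except ValueError:
        return default


def truncate_text(text: str, max_chars: int) -> str:
    # Keep the head of the text and mark the cut so readers know it is partial.
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n... [truncated]"


def prepare_inputs(diff_text: str, changed_files: List[str]) -> Tuple[str, List[str]]:
    # Defaults mirror the values documented above.
    max_diff = env_int("MAX_DIFF_CHARS", 20000)
    max_files = env_int("MAX_CHANGED_FILES", 200)
    return truncate_text(diff_text, max_diff), changed_files[:max_files]
```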

Running the server

uvicorn server:app --host 0.0.0.0 --port 8000 --reload

Health check:

curl http://localhost:8000/

Expect:

{"status":"ok","message":"PR triage assistant is running"}

GitHub webhook setup

  1. In your GitHub repo: Settings → Webhooks → Add webhook
  2. Payload URL: https://YOUR_DOMAIN/webhook/github (or your ngrok URL)
  3. Content type: application/json
  4. Secret: not required (the server does not verify webhook signatures as-is; you may add verification later)
  5. Events: “Let me select individual events” → enable only “Pull requests”
  6. Active: checked

The server currently processes only these actions: opened, synchronize, ready_for_review.
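
If you do set a Secret, GitHub signs each delivery with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header as sha256=<hex digest>. The sketch below shows one way verification and the action filter could look; it is not part of this server, and the function names are hypothetical. The check must run over the raw request body bytes, before any JSON parsing.

```python
import hashlib
import hmac

# Actions the server is documented to process.
HANDLED_ACTIONS = {"opened", "synchronize", "ready_for_review"}


def signature_is_valid(secret: str, body: bytes, header_value: str) -> bool:
    # header_value is the X-Hub-Signature-256 header, e.g. "sha256=ab12...".
    if not header_value or not header_value.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


def should_process(event: str, action: str) -> bool:
    # event comes from the X-GitHub-Event header, action from the payload.
    return event == "pull_request" and action in HANDLED_ACTIONS
```

In a FastAPI handler, the body bytes would typically come from `await request.body()`, with a 401 or 403 response when the check fails.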

Local tunneling (optional)

If you prefer to receive real GitHub webhooks locally, you can expose your machine using a tunnel. This repo includes ngrok binaries for convenience.

./ngrok http 8000
# copy the https URL it prints, e.g. https://abcd-12-34-56-78.ngrok-free.app
# set your GitHub webhook Payload URL to: https://<ngrok-host>/webhook/github

Testing the webhook locally (manual)

You can simulate a minimal pull_request event to check request handling (note this won’t fetch a real diff without valid repo info):

curl -X POST http://localhost:8000/webhook/github \
  -H "Content-Type: application/json" \
  -H "X-GitHub-Event: pull_request" \
  -d '{
    "action": "opened",
    "pull_request": {
      "title": "Sample PR",
      "body": "Adds a new feature",
      "number": 1,
      "url": "https://api.github.com/repos/owner/repo/pulls/1",
      "diff_url": "https://github.com/owner/repo/pull/1.diff"
    },
    "repository": {
      "name": "repo",
      "owner": { "login": "owner" }
    }
  }'

If the GitHub endpoints are reachable and the PR exists (and access is permitted), the server will fetch the diff and files, run the workflow, and log the review.

Programmatic usage

Use generate_pr_review directly if you already have the diff and file list:

from review_core import generate_pr_review

pr_title = "Improve retry logic"
pr_body = "Refactors client and increases resilience; no API changes."
diff_text = "... unified diff ..."
changed_files = ["src/client.py", "tests/test_client.py"]

review_markdown = generate_pr_review(pr_title, pr_body, diff_text, changed_files)
print(review_markdown)

Or run the full graph via run_pr_workflow (same signature; trims inputs and executes the agent pipeline).

Extending

  • Post the review back to GitHub:
    • After run_pr_workflow, call the GitHub Issues/PRs Comments API to create a comment with the generated Markdown.
  • Add/modify agents:
    • Add a new agent in agents/ and wire it into the graph in review_core.py.
  • Customize standards:
    • Edit constants.py (REPO_GUIDELINES) to reflect your org’s policies.

Troubleshooting

  • “GROQ_API_KEY is not set” at startup
    • Create .env with GROQ_API_KEY=... or export it in your shell
  • Receiving 401/403 on GitHub API calls
    • Ensure GITHUB_TOKEN has repo read access, or test with a public repo
  • Large PRs get truncated
    • Increase MAX_DIFF_CHARS / MAX_CHANGED_FILES or the agent‑specific caps, mindful of model context limits
  • Empty review or very short output
    • Check logs for LLM errors/rate limits; reduce input size or retry
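
For the last item, transient LLM errors and rate limits can often be absorbed by retrying with exponential backoff. The helper below is a generic sketch, not something this repo provides: call_with_retry is a hypothetical name, and catching every Exception is a simplification you would likely narrow to the provider's rate-limit and timeout errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    # Retries fn() up to `attempts` times, doubling the wait after each failure.
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

A call might look like `call_with_retry(lambda: generate_pr_review(pr_title, pr_body, diff_text, changed_files))`.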

Security & privacy

  • Be mindful that PR contents will be sent to the LLM provider (Groq). Review your data handling and provider policies before enabling on private code.
  • If needed, route through a proxy or implement redaction.
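
As one possible shape for redaction, the sketch below masks a few common secret patterns in a diff before it is sent to the provider. The patterns and the redact name are illustrative assumptions, not an exhaustive or authoritative list; a dedicated secret scanner will catch far more.

```python
import re

# Well-known token shapes (illustrative, not exhaustive).
TOKEN_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),  # GitHub classic personal access token
    re.compile(r"AKIA[0-9A-Z]{16}"),     # AWS access key ID
]

# Assignments such as API_KEY=..., db_password: ..., GITHUB_TOKEN = ...
ASSIGNMENT = re.compile(
    r"(?i)\b(\w*(?:api_key|secret|password|token)\w*\s*[:=]\s*)(\S+)"
)


def redact(text: str) -> str:
    # Mask known token shapes first, then values assigned to secret-like names.
    for pattern in TOKEN_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return ASSIGNMENT.sub(lambda m: m.group(1) + "[REDACTED]", text)
```

Running this on diff_text before calling generate_pr_review keeps the variable names visible to the reviewer agents while hiding the values.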

License

Apache-2.0 (or update to your chosen license).
