Index your entire GitHub org → Ask questions about your code → Get AI-powered answers with source citations
Quick Start • Features • Architecture • Copilot Extension • Configuration • Contributing
GitSage is a self-hosted RAG (Retrieval-Augmented Generation) bot that indexes your GitHub organisation's repositories and lets you chat with your codebase. It understands your code, READMEs, issues, and development patterns, and it cites its sources.
Works as a GitHub Copilot Extension: type `@gitsage` in Copilot Chat and ask anything about your org's code.
You: @gitsage how does authentication work in our services?
GitSage: Based on the codebase, authentication is handled by the `auth-service`
repository using JWT tokens...
auth-service/src/main/java/com/example/AuthController.java
auth-service/src/main/java/com/example/JwtTokenProvider.java
The flow is: Login → Validate credentials → Issue JWT → Store in
HTTP-only cookie → Verify on subsequent requests via JwtAuthFilter...
| Feature | Description |
|---|---|
| Full Org Indexing | Crawls all repos: READMEs, source code, issues |
| RAG-Powered Chat | Answers grounded in your actual code, not hallucinations |
| Copilot Extension | `@gitsage` in GitHub Copilot Chat (VS Code, JetBrains, github.com) |
| Streaming Responses | Real-time SSE streaming for both REST API and Copilot |
| Incremental Indexing | Only re-indexes changed files (content hash tracking) |
| Scheduled Re-indexing | Configurable cron-based automatic updates |
| pgvector Storage | HNSW-indexed vectors in PostgreSQL, no extra infrastructure |
| One-Command Setup | `docker compose up` and you're running |
| Signature Verification | Cryptographic webhook verification for Copilot requests |
| REST API | Full HTTP API for chat, indexing, and status |
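The "content hash tracking" behind incremental indexing can be pictured with a small sketch. This is not GitSage's actual implementation: the class name `ContentHashTracker`, the in-memory map, and the choice of SHA-256 are all assumptions made for illustration. A real indexer would presumably persist the hashes alongside the vectors.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HexFormat;
import java.util.Map;

// Hypothetical helper: remembers one SHA-256 hash per file path
// and reports whether a file changed since it was last seen.
class ContentHashTracker {
    private final Map<String, String> lastIndexedHash = new HashMap<>();

    /** Hex-encoded SHA-256 of the UTF-8 bytes of the content. */
    static String sha256Hex(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true when the file is new or changed, and records the new hash. */
    boolean needsReindex(String path, String content) {
        String current = sha256Hex(content);
        String previous = lastIndexedHash.put(path, current);
        return !current.equals(previous);
    }
}
```

With this shape, unchanged files cost one hash computation and no embedding call, which is where the savings in a re-index would come from.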
graph LR
subgraph "Your Team"
A[Developer] -->|"@gitsage"| B[GitHub Copilot]
A -->|curl/UI| C[REST API]
end
subgraph "GitSage"
B -->|SSE| D["Copilot Extension<br/>/copilot"]
C --> E["Chat API<br/>/api/chat"]
D --> F[RAG Engine]
E --> F
F --> G["Retrieval<br/>Similarity Search"]
F --> H["LLM<br/>GPT-4o"]
I["Indexer"] --> J["Chunker"]
J --> K["Embeddings"]
K --> L
G --> L[("PostgreSQL<br/>+ pgvector")]
end
subgraph "External"
I -->|GitHub API| M["Your Repos"]
K -->|API| N["OpenAI"]
H -->|API| N
end
style D fill:#6f42c1,color:#fff
style F fill:#0969da,color:#fff
style L fill:#336791,color:#fff
See docs/architecture.md for detailed sequence diagrams and design decisions.
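To make the "Retrieval / Similarity Search" step in the diagram concrete, here is a minimal in-memory sketch of ranking chunks by cosine similarity. It is a brute-force stand-in, not what GitSage runs: the diagram shows the real search going through pgvector in PostgreSQL, and the class `SimilaritySearch` and its method names are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the similarity search step.
class SimilaritySearch {
    /** Cosine similarity of two equal-length vectors; 0.0 if either is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // treat zero vectors as unrelated
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k chunks most similar to the query, best first. */
    static List<String> topK(double[] query, Map<String, double[]> chunks, int k) {
        return chunks.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(query, e.getValue())))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }
}
```

The brute-force scan is linear in the number of chunks; an HNSW index trades exactness for much faster approximate lookups, which is why it suits an org-wide corpus.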
- Docker & Docker Compose
- GitHub Personal Access Token (create one with `repo` read access)
- OpenAI API key
git clone https://github.com/open-ai-school/gitsage.git
cd gitsage
# Create your environment file
cat > .env << EOF
GITHUB_TOKEN=ghp_your_token_here
GITHUB_ORG=your-org-name
OPENAI_API_KEY=sk-your-key-here
EOF

docker compose -f docker/docker-compose.yml --env-file .env up -d

That's it. GitSage is running at http://localhost:8080.

# Trigger initial indexing
curl -X POST http://localhost:8080/api/index
# Check progress
curl http://localhost:8080/api/index/status

# Chat with your codebase
curl -X POST http://localhost:8080/api/chat \
-H "Content-Type: application/json" \
-d '{"question": "How is error handling implemented across our services?"}'

# Real-time streaming
curl -N -X POST http://localhost:8080/api/chat/stream \
-H "Content-Type: application/json" \
-d '{"question": "What design patterns are used in the codebase?"}'

The killer feature: use GitSage directly inside GitHub Copilot Chat.

- Register a GitHub App with Copilot Extension support
- Point the webhook URL to https://your-domain.com/copilot
- Install the app on your organisation

Full setup guide: docs/copilot-extension-setup.md
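Because the `/copilot` webhook is reachable from the internet, incoming requests need to be authenticated before they reach the RAG engine. The sketch below shows one way such a check could look. It assumes an ECDSA signature (SHA-256) over the raw request body, delivered Base64-encoded in a header, with the sender's public key obtained separately. This README does not state which scheme GitSage uses, so the algorithm, the class `WebhookSignatureVerifier`, and its parameters are assumptions; consult the setup guide for the actual mechanism.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical verifier: checks an ECDSA (SHA-256) signature over the raw request body.
class WebhookSignatureVerifier {
    /**
     * @param publicKey       key published by the sender (fetching it is out of scope here)
     * @param rawBody         request body exactly as received, before any JSON parsing
     * @param signatureBase64 Base64-encoded, DER-formatted signature from a request header
     * @return true only if the signature matches the body under the given key
     */
    static boolean isValid(PublicKey publicKey, String rawBody, String signatureBase64) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(rawBody.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(Base64.getDecoder().decode(signatureBase64));
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            // Malformed key, signature, or Base64 input: reject the request.
            return false;
        }
    }
}
```

Verifying against the raw bytes matters: re-serialising parsed JSON can change whitespace or key order and would invalidate an otherwise good signature.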
Once installed, any developer in your org can:
@gitsage what does the payment service do?
@gitsage show me how we handle database migrations
@gitsage which repos use Spring Security?
@gitsage explain the CI/CD pipeline in the platform repo
GitSage is configured via environment variables:
| Variable | Required | Description | Default |
|---|---|---|---|
| `GITHUB_TOKEN` | Yes | GitHub PAT (read-only) | (none) |
| `GITHUB_ORG` | Yes | GitHub org to index | (none) |
| `OPENAI_API_KEY` | Yes | OpenAI API key | (none) |
| `POSTGRES_HOST` | No | Database host | `localhost` |
| `POSTGRES_PORT` | No | Database port | `5432` |

Full configuration reference: docs/configuration.md
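As an illustration of how the variables above combine, the following sketch reads them from an environment map, fails fast when a credential is missing, and falls back to the listed defaults for the database settings. The record `GitSageConfig` and its methods are hypothetical; GitSage itself is built on Micronaut and may well bind configuration differently.

```java
import java.util.Map;

// Hypothetical settings holder mirroring the environment variables listed above.
record GitSageConfig(String githubToken, String githubOrg, String openAiApiKey,
                     String postgresHost, int postgresPort) {

    /** Builds the config from an environment map such as System.getenv(). */
    static GitSageConfig fromEnv(Map<String, String> env) {
        return new GitSageConfig(
                required(env, "GITHUB_TOKEN"),
                required(env, "GITHUB_ORG"),
                required(env, "OPENAI_API_KEY"),
                env.getOrDefault("POSTGRES_HOST", "localhost"),
                Integer.parseInt(env.getOrDefault("POSTGRES_PORT", "5432")));
    }

    // Fails fast with a clear message when a variable without a default is absent.
    private static String required(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }
}
```

Calling `GitSageConfig.fromEnv(System.getenv())` at startup would surface a missing token immediately rather than at the first GitHub API call.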
| Endpoint | Method | Description |
|---|---|---|
| `/api/chat` | POST | Chat with your codebase (JSON response) |
| `/api/chat/stream` | POST | Streaming chat (SSE) |
| `/api/index` | POST | Trigger full org indexing |
| `/api/index/{repo}` | POST | Index a single repository |
| `/api/index/status` | GET | Get indexing status |
| `/copilot` | POST | Copilot Extension endpoint (SSE) |
| `/health` | GET | Health check |
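For callers that prefer Java over curl, the chat endpoints can be reached with the JDK's built-in HTTP client. The sketch below builds the `POST /api/chat` request with the `{"question": ...}` body used in the curl examples, and extracts `data:` payloads from SSE text such as `/api/chat/stream` would return. The class `GitSageClient` and its helpers are hypothetical, and since the response schema is not documented here, the sketch does not try to parse it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Hypothetical client helpers for the endpoints listed above.
class GitSageClient {
    /** Builds (but does not send) a POST to /api/chat with a {"question": ...} body. */
    static HttpRequest chatRequest(String baseUrl, String question) {
        String body = "{\"question\": \"" + escapeJson(question) + "\"}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Minimal JSON string escaping: quote, backslash, and control characters. */
    static String escapeJson(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> out.append("\\\"");
                case '\\' -> out.append("\\\\");
                case '\n' -> out.append("\\n");
                case '\r' -> out.append("\\r");
                case '\t' -> out.append("\\t");
                default -> {
                    if (c < 0x20) {
                        out.append("\\u").append(String.format("%04x", (int) c));
                    } else {
                        out.append(c);
                    }
                }
            }
        }
        return out.toString();
    }

    /**
     * Extracts the payload of each "data:" line from SSE text.
     * Simplification: multi-line events are not joined; each data line is returned separately.
     */
    static List<String> sseData(String stream) {
        return stream.lines()
                .filter(line -> line.startsWith("data:"))
                .map(line -> line.substring(5))
                .map(value -> value.startsWith(" ") ? value.substring(1) : value)
                .toList();
    }
}
```

To send the request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and inspect the raw body.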
# Start PostgreSQL
docker compose -f docker/docker-compose.yml up -d postgres
# Set environment variables
export GITHUB_TOKEN=ghp_xxx
export GITHUB_ORG=your-org
export OPENAI_API_KEY=sk-xxx
# Run tests
./gradlew test
# Run locally
./gradlew run

- Ollama support: local LLM without API keys
- Web UI: browser-based chat interface
- Multi-org support: index multiple organisations
- GitHub Discussions: index discussion threads
- PR review context: understand review comments
- GraalVM native image: instant startup, minimal memory
- Slack/Teams integration: chat from your team channels

Contributions are welcome! Please read the Contributing Guide before submitting a PR.
MIT. See LICENSE for details.
Built with Java 21 • Micronaut 4 • LangChain4j • PostgreSQL + pgvector
Star this repo if GitSage helps your team understand their codebase better!