LLM Tree Reasoning ◦ Knowledge Graph Multi-Hop ◦ Pixel-Precise Citations ◦ Unmatched Performance
Quick Start • Features • Technical Approach • Docs • 中文
Many approaches have been proposed to go beyond naive chunk-and-embed RAG, but each has fundamental limitations:
| Approach | Strength | Limitation |
|---|---|---|
| Embedding-based (e.g. naive RAG) | Fast semantic search | Similarity ≠ relevance; misses exact-match and structural context |
| Graph-based (e.g. GraphRAG) | Cross-document entity linking | Concept skeleton without source-text evidence; extraction loses details |
| Hybrid graph (e.g. LightRAG) | Dual-level retrieval (local + global) | Answers synthesized from KG summaries, not grounded in original text; higher hallucination risk |
| Reasoning-based (e.g. PageIndex) | High single-doc accuracy | Query latency scales linearly with document count; not production-ready |
When a domain expert encounters a question, they don't scan every page — they instantly recall where relevant information lives, draw on their mental map of how concepts connect, then synthesize a grounded answer from multiple sources. ForgeRAG mirrors this workflow: BM25 + vector search surfaces candidate regions in milliseconds, a knowledge graph provides the conceptual connections across documents, and LLM tree navigation reasons over document structure to pinpoint the exact sections that matter — all fused into a single answer with traceable citations.
To handle multi-hop questions (e.g. "Which suppliers of Apple also supply Samsung?"), we introduce a knowledge graph path that extracts entities and relations at ingestion time, then runs dual-level retrieval at query time: local (query entities → neighborhood traversal) and global (keywords → fuzzy / cross-lingual entity match via name embeddings), plus relation-semantic search over relation-description embeddings. Inspired by LightRAG's context assembly, the KG path injects synthesized entity and relation descriptions directly into the generation prompt — giving the LLM a "distilled knowledge layer" on top of raw text chunks.
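The local level of that retrieval can be pictured as a bounded breadth-first walk outward from each query entity. The sketch below is a minimal illustration on a toy adjacency map, not ForgeRAG's implementation: the graph contents, the `supplied_by` relation label, the placeholder supplier names, and the `neighborhood` helper are all hypothetical.

```python
from collections import deque

# Toy knowledge graph: entity -> list of (relation, neighbor).
# All names below are placeholders, not extracted facts.
GRAPH = {
    "Apple": [("supplied_by", "SupplierA"), ("supplied_by", "SupplierB")],
    "Samsung": [("supplied_by", "SupplierB"), ("supplied_by", "SupplierC")],
}


def neighborhood(graph, start, max_hops=1):
    """Return every entity reachable from `start` within `max_hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop budget
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen


# Multi-hop question: which suppliers are shared by both query entities?
shared = neighborhood(GRAPH, "Apple") & neighborhood(GRAPH, "Samsung")
print(sorted(shared))  # ['SupplierB']
```

Intersecting the two neighborhoods is what makes the question multi-hop: neither entity's neighborhood answers it on its own.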
We evaluate against LightRAG using the UltraDomain benchmark methodology (LLM-as-judge pairwise comparison). Win rates shown as ForgeRAG% / LightRAG%.
🚧 More comprehensive benchmarks against additional RAG systems, domains, and metrics are in progress.
| Domain | Comprehensiveness | Diversity | Empowerment | Overall |
|---|---|---|---|---|
| Agriculture | 58.6 / 41.4 | 47.1 / 52.9 | 52.9 / 47.1 | 56.4 / 43.6 |
| CS | 55.6 / 44.4 | 48.4 / 51.6 | 54.0 / 46.0 | 54.8 / 45.2 |
| Legal | 57.0 / 43.0 | 46.5 / 53.5 | 53.5 / 46.5 | 55.6 / 44.4 |
| Mix | 56.3 / 43.7 | 47.8 / 52.2 | 54.3 / 45.7 | 55.1 / 44.9 |
Judge: qwen3-max · Reproduce
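The headline figure of 55.48% quoted in this README is consistent with an unweighted mean of the four Overall values in the table above. The README does not state how the figure is aggregated, so the averaging rule in this sketch is an assumption; only the four input values come from the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# ForgeRAG "Overall" win rates, copied from the table above (percent).
OVERALL = {
    "Agriculture": "56.4",
    "CS": "54.8",
    "Legal": "55.6",
    "Mix": "55.1",
}


def mean_win_rate(rates):
    """Unweighted mean across domains, rounded half-up to two decimals.

    ASSUMPTION: equal weight per domain. Decimal is used so that the
    rounding step is not affected by binary floating-point error.
    """
    values = [Decimal(v) for v in rates.values()]
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(mean_win_rate(OVERALL))  # 55.48
```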
Note on Faithfulness: The UltraDomain benchmark evaluates Comprehensiveness, Diversity, and Empowerment, but not factual accuracy. ForgeRAG provides pixel-precise `[c_N]` citations for every claim, enabling verification against source text. LightRAG synthesizes answers from knowledge graph summaries without traceable citations, which scores well on breadth but carries higher hallucination risk.
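To make the verification step concrete, the sketch below resolves `[c_N]` markers in an answer to page and bounding-box records. Only the `[c_N]` marker format comes from this README; the `Citation` record, its field names, the sample answer, and the coordinates are hypothetical and do not describe ForgeRAG's actual response schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    # Hypothetical record shape: one source region per citation.
    doc_id: str
    page: int
    bbox: tuple  # (x0, y0, x1, y1) in page coordinates


CITATION_MARKER = re.compile(r"\[c_(\d+)\]")


def resolve_citations(answer, citations):
    """Return the records referenced by [c_N] markers, in first-use order."""
    used = []
    for match in CITATION_MARKER.finditer(answer):
        n = int(match.group(1))
        if n in citations and n not in used:
            used.append(n)
    return [citations[n] for n in used]


answer = "Revenue grew in Q3 [c_2]. Margins were flat [c_1][c_2]."
citations = {
    1: Citation("report.pdf", 4, (72.0, 510.5, 523.0, 548.0)),
    2: Citation("report.pdf", 7, (72.0, 120.0, 523.0, 164.0)),
}
for c in resolve_citations(answer, citations):
    print(c.doc_id, c.page, c.bbox)
```

Markers with no matching record are skipped rather than raising, so a reader can still check the claims that do resolve.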
Compared to heavier platforms like RAGFlow, ForgeRAG focuses on core pipeline design — a lean retrieval-answering chain with composable building blocks.
🔍 Dual-reasoning retrieval · BM25 + vector pre-filter → LLM tree nav + KG, fused via RRF
📌 Pixel-precise citations · Every claim links to exact page + bounding box, click to highlight
🔗 Full retrieval tracing · Inspect path scores, expansion decisions, and merge logic per query
💬 Multi-turn conversations · Context-aware follow-ups with conversation history
📄 Multi-format ingestion · PDF, DOCX, PPTX, XLSX, HTML, Markdown, TXT
⚙️ YAML-first config · One file, one restart — no hidden runtime state
🎛️ Per-request overrides · Toggle retrieval paths / top-ks / rerank per query via QueryOverrides (great for SDK + A/B)
🏆 Outperforms LightRAG · 55.48% overall win rate on UltraDomain benchmark
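The RRF fusion named in the feature list can be sketched in a few lines. This is generic Reciprocal Rank Fusion, not ForgeRAG's code: the `rrf_fuse` helper and the chunk IDs are hypothetical, and `k=60` is the constant commonly used in the RRF literature rather than a value confirmed by this README.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


tree_path = ["c3", "c1", "c7"]  # hypothetical result of the tree-navigation path
kg_path = ["c1", "c9", "c3"]    # hypothetical result of the knowledge-graph path
print(rrf_fuse([tree_path, kg_path]))  # ['c1', 'c3', 'c9', 'c7']
```

Because RRF uses only ranks, the two paths do not need comparable score scales, which is why it suits fusing an LLM-driven path with a graph-driven one.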
📸 Screenshots
Chat · Structured answers with pixel-precise citations
Ingestion · Document processing pipeline with tree building
Knowledge Graph · Entity-relation visualization
- Python 3.10+
- Node.js 18+ (for building the frontend)
- An LLM API key (OpenAI, DeepSeek, or any LiteLLM-compatible provider)
- Recommended: 4+ CPU cores, 8GB+ RAM (16GB+ for large documents with KG extraction)
git clone https://github.com/deeplethe/ForgeRAG.git
cd ForgeRAG
# 1. Core Python dependencies (small — the heavy backend packages are
# installed lazily in step 3 based on what your config actually picks).
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# 2. Frontend
cd web && npm install && npm run build && cd ..
# 3. Configure: interactive wizard generates forgerag.yaml AND auto-pip-installs
# the backend-specific deps your choices need (e.g. chromadb, neo4j, mineru).
# To re-sync deps after a manual yaml edit: python scripts/setup.py --sync-deps forgerag.yaml
python scripts/setup.py
# 4. Run — defaults to a single worker (safe with the wizard's default
# SQLite + ChromaDB-persistent + NetworkX backends).
python main.py

Open http://localhost:8000. The web UI is served automatically.
Note: Document ingestion involves heavy LLM calls (tree building, KG extraction, embedding). For a responsive UI under concurrent ingestion, scale to multiple workers, but `--workers >1` requires multi-process-safe backends (PostgreSQL + Neo4j + a non-persistent ChromaDB / Qdrant / Milvus / Weaviate / pgvector). Starting with `--workers >1` against single-process backends (SQLite, NetworkX, persistent ChromaDB) exits with code 2 to avoid silent data corruption.
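The startup guard described in the note above amounts to a small check before the server launches. The sketch below shows one way such a check could look. The backend identifier strings and the `startup_exit_code` helper are hypothetical; only the rule itself (more than one worker plus a single-process backend exits with code 2) comes from this README.

```python
# Hypothetical identifiers for the single-process backends named above.
SINGLE_PROCESS_BACKENDS = {"sqlite", "networkx", "chromadb-persistent"}


def startup_exit_code(workers, backends):
    """Return 2 if multiple workers would share a single-process backend, else 0."""
    if workers > 1:
        unsafe = sorted(set(backends) & SINGLE_PROCESS_BACKENDS)
        if unsafe:
            print(f"refusing to start: {workers} workers with {', '.join(unsafe)}")
            return 2
    return 0


print(startup_exit_code(1, ["sqlite", "networkx", "chromadb-persistent"]))  # 0
print(startup_exit_code(4, ["sqlite", "neo4j", "pgvector"]))                # 2
print(startup_exit_code(4, ["postgresql", "neo4j", "pgvector"]))            # 0
```

Failing fast at startup is the safer design here: several processes writing to one SQLite file or one in-memory graph would otherwise diverge without any visible error.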
git clone https://github.com/deeplethe/ForgeRAG.git
cd ForgeRAG
python scripts/docker_setup.py # Interactive wizard: pick provider, set keys, done
docker compose up -d # PostgreSQL + pgvector + ForgeRAG, ready to go

Open http://localhost:8000. See Deployment Guide for details.
Tip: We strongly recommend enabling MinerU — it significantly improves document structure parsing accuracy, especially for PDFs with complex layouts, tables, and formulas. Enable it in the web UI settings after startup.
| Component | Options |
|---|---|
| PDF Parser | One explicit choice: pymupdf (fast, default) / mineru (layout-aware, tables/formulas) / mineru-vlm (vision-language for scanned & complex layouts) |
| Relational DB | SQLite (default), PostgreSQL, MySQL |
| Vector Store | ChromaDB (default), pgvector (PostgreSQL), Qdrant, Milvus, Weaviate |
| Blob Storage | Local filesystem (default), Amazon S3, Alibaba OSS |
| Graph Store | NetworkX in-memory (default), Neo4j |
| LLM / Embeddings | Any LiteLLM-supported provider: OpenAI, Azure, Anthropic, Ollama, DeepSeek, Cohere, etc. |
| Flag | Default | Description |
|---|---|---|
| `--config` | auto-detect | Path to forgerag.yaml |
| `--host` | `0.0.0.0` | Bind address (or `$FORGERAG_HOST`) |
| `--port` | `8000` | Bind port (or `$FORGERAG_PORT`) |
| `--reload` | off | Hot-reload for development |
| `--workers` | `1` | Uvicorn workers. Values > 1 require multi-process-safe backends (PostgreSQL + Neo4j + non-persistent vector store); startup exits 2 otherwise. |
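For readers who want to see how the flags and environment variables might interact, here is a minimal `argparse` sketch that mirrors the table. It is a reconstruction for illustration, not the contents of `main.py`: the `build_parser` helper is hypothetical, and the precedence (explicit flag, then environment variable, then built-in default) is an assumption.

```python
import argparse


def build_parser(env):
    """Build a parser whose defaults fall back to the given environment mapping."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--config", default=None,
                        help="Path to forgerag.yaml (None means auto-detect)")
    parser.add_argument("--host", default=env.get("FORGERAG_HOST", "0.0.0.0"),
                        help="Bind address")
    parser.add_argument("--port", type=int,
                        default=int(env.get("FORGERAG_PORT", "8000")),
                        help="Bind port")
    parser.add_argument("--reload", action="store_true",
                        help="Hot-reload for development")
    parser.add_argument("--workers", type=int, default=1,
                        help="Uvicorn workers")
    return parser


# In real use, pass os.environ; a plain dict keeps this example deterministic.
args = build_parser({"FORGERAG_PORT": "9000"}).parse_args(["--workers", "1"])
print(args.host, args.port, args.workers)  # 0.0.0.0 9000 1
```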
The diagram above shows the complete data flow. For detailed pipeline documentation with per-node annotations, see Architecture Overview.
The REST API is available at /api/v1/. Interactive docs:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Key endpoints:
| Endpoint | Description |
|---|---|
| `POST /api/v1/query` | Ask a question (streaming SSE or sync); accepts `path_filter` + `overrides` for per-request tuning |
| `POST /api/v1/documents/upload-and-ingest` | Upload into a folder (multipart; `folder_path` form field) |
| `GET /api/v1/documents?path_filter=…&recursive=…` | List docs under a folder |
| `GET /api/v1/documents/{id}/tree` | Document hierarchical structure |
| `GET /api/v1/graph` | Knowledge graph visualization |
| `GET /api/v1/settings` | Read-only snapshot of effective cfg (yaml is authoritative) |
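As a starting point for calling the query endpoint from Python, the sketch below builds the request with the standard library only. The URL path and the `path_filter` and `overrides` field names come from the table above. The `query` field name, the `rerank` override key, the sample folder path, and the `build_query_request` helper are assumptions, so check the API Reference or the Swagger UI for the real schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_query_request(question, path_filter=None, overrides=None):
    """Build (but do not send) a JSON POST to /api/v1/query."""
    payload = {"query": question}  # ASSUMPTION: the question field is named "query"
    if path_filter is not None:
        payload["path_filter"] = path_filter
    if overrides is not None:
        payload["overrides"] = overrides
    return urllib.request.Request(
        BASE_URL + "/api/v1/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_query_request(
    "What is the termination notice period?",
    path_filter="/contracts",    # hypothetical folder path
    overrides={"rerank": False}, # hypothetical override key
)
print(req.get_method(), req.full_url)

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```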
- Getting Started — Installation, first document, step-by-step guide
- Architecture Overview — How ingestion, retrieval, and answering pipelines work
- Configuration Reference — Every config option with defaults and examples
- API Reference — REST API endpoints, request/response formats, SSE streaming
- Deployment Guide — Docker deploy, production checklist, Nginx, Ollama
- Development Guide — Dev setup, testing, adding new backends
- Auth & Sessions — Single-admin password + SK tokens, web management UI, CLI playbook
ForgeRAG/
├── api/ # FastAPI routes and schemas
├── answering/ # Answer generation pipeline
├── config/ # Pydantic configuration models
├── embedder/ # Embedding backends (LiteLLM, sentence-transformers)
├── graph/ # Knowledge graph stores (NetworkX, Neo4j)
├── ingestion/ # Document ingestion pipeline + format conversion
├── parser/ # PDF parsing, chunking, tree building
├── persistence/ # Database layer (relational, vector, blob)
├── retrieval/ # Retrieval pipeline (BM25, vector, tree, KG, merge)
├── scripts/ # CLI utilities (setup wizard, Docker setup, batch ingest)
├── web/ # Vue 3 frontend
├── docs/ # Detailed documentation
├── main.py # Application entry point
└── forgerag.yaml # Your local config (git-ignored)
- 🧪 More benchmarks against additional RAG systems and domains
- 🔄 Scale to 1M+ documents · incremental indexing, async KG
- 🌐 Multi-language retrieval · cross-lingual query and document support
- 📦 Python SDK · `pip install forgerag-sdk`
- 🛠️ Config panel hints & diagnostics · Missing provider warnings, validation feedback
- ⚡ Performance optimization · Faster ingestion, query caching, async embedding
We welcome contributions of all kinds — bug fixes, new features, documentation improvements, and more.
Please read our Contributing Guide before submitting a pull request.


