Production‑ready RAG app with a clean Streamlit UI, FastAPI backend, background ingestion, and hybrid retrieval (vector + keyword). Built to stay small, stable, and easy to ship.
Add a screenshot at docs/screenshot.png to showcase the UI:
- Streamlit UI with BYOK (bring your own key) support
- FastAPI service for collections, documents, jobs, and chat
- Async ingestion via RQ + Redis (sync fallback supported)
- Chroma for vector search (persistent local storage)
- SQLite metadata + FTS5 keyword search (hybrid retrieval)
- Deterministic agent flow (no LangChain/LangGraph dependency)
- Create
.env(see.env.example) - Run:
docker compose -f docker/docker-compose.yml up --buildOpen:
Backend:
cd services/api
python -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000Worker (optional):
cd worker
python -m venv .venv && . .venv/bin/activate
pip install -r ../services/api/requirements.txt
python worker.pyUI:
cd apps/streamlit_ui
python -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
streamlit run app.pyThis project ships with BYOK only. Provide keys at runtime in the UI.
- LLM and embeddings use an OpenAI‑compatible API
- Default base URL:
https://api.openai.com/v1 - Works with OpenAI, Groq, OpenRouter, etc. (set
base_url)
apps/streamlit_ui Streamlit UI
services/api FastAPI backend + RAG pipeline
worker/ RQ worker for ingestion
data/ Local persistence (mounted in Docker)
User -> Streamlit UI -> FastAPI
| |
| +-> SQLite (metadata, FTS5)
| +-> Chroma (vectors)
|
+-> Uploads -> Worker (RQ) -> Ingest -> Chroma + SQLite
All data persists under ./data/ (mounted in Docker):
data/sqlite/app.db(metadata + FTS)data/chroma/(vector db)data/blobs/(uploaded files)
- Ingestion fails at ~75%: check embeddings API key and embedding model.
429 Too Many Requests: LLM provider rate‑limited; wait or use another provider.- No citations: ensure documents are
readyand retrieval returns chunks.
For async ingestion, the server stores the embedding API key in ingest_jobs (plain‑text JSON). Encrypt or replace with per‑user secret storage in production.
