Retrieval-augmented FAQ assistant for ELTE policy and administration questions.
- Backend: FastAPI + LangChain + FAISS + BM25 + optional rerankers (`off`, `cross_encoder`, `llm`)
- Frontend: Vite + React + TypeScript + Tailwind (chat + admin)
- Ingestion: Typesense document sync + Docling for PDFs + normalized JSON for news
- Deployment: Docker Compose (backend + frontend)
# install dependencies with uv
uv sync --extra dev
# or install with pip
python -m venv .venv
.venv\Scripts\python -m pip install -r requirements.txt
# run with uv
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8001
# or run from the virtual environment
.venv\Scripts\uvicorn app.main:app --reload --host 0.0.0.0 --port 8001

cd frontend
npm install
npm run dev

The frontend reads `VITE_API_BASE_URL` (see `frontend/.env.example`).
cd extension
npm install
npm run build

Load the unpacked extension from `extension/dist` in Chrome (`chrome://extensions`).
- Injection scope: `https://inf.elte.hu/*` and `http://inf.elte.hu/*`
- Runtime API URL: configurable in the extension options page
- Default API URL: `http://localhost:8001` (or `EXT_DEFAULT_API_BASE_URL` at build time)
- For a local demo, backend CORS can be set to `*` (`CORS_ALLOW_ORIGINS=*`)
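
The CORS setting above could be interpreted roughly as follows. This is a minimal sketch, not the project's actual code: it assumes `CORS_ALLOW_ORIGINS` holds either `*` or a comma-separated list of origins, and the helper name `parse_cors_origins` is hypothetical.

```python
def parse_cors_origins(raw: str) -> list[str]:
    """Turn a CORS_ALLOW_ORIGINS-style string into a list of origins.

    Assumption: the value is '*' or a comma-separated list of origins.
    A wildcard anywhere in the list collapses the result to ['*'].
    """
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if "*" in origins:
        return ["*"]
    return origins
```

In a FastAPI app, a list like this is typically passed as `allow_origins` to `CORSMiddleware`. A wildcard suits a local demo only; a deployed backend would normally list the extension's target origins explicitly.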
docker compose up --build
- NVIDIA GPU override (Windows/Linux hosts with NVIDIA container runtime):
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build
- Frontend: http://localhost:5173
- Backend API: http://localhost:8001/docs
- Upload/delete source PDFs in Admin → Embeddings and Files.
- Run Documents Sync to fetch official ELTE document links from Typesense and download PDF files.
- Run Reindex Vector Store to rebuild FAISS from local PDFs + normalized news.
- Run News Index → Bootstrap/Sync manually when you want to refresh news coverage.
Documents sync and reindex are intentionally separate operations. News sync is also manual-only (no background periodic polling).
Page-level citations depend on chunk metadata captured during ingestion. After ingestion logic changes, run a full reindex to refresh stored metadata.
- Document indexes are now snapshot-based under `data/indexes/<snapshot-id>/`.
- Active index selection is profile-specific (`local_minilm`, `openai_small`, `openai_large`) and stored in `data/runtime/active_indexes.json`.
- Reindex creates a new immutable snapshot and updates the active pointer for the selected embedding profile.
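
To illustrate the pointer-plus-snapshot layout described above, here is a minimal sketch of resolving the active snapshot for one profile. It assumes `active_indexes.json` is a flat `{profile: snapshot_id}` mapping, which this README does not specify; the function name `resolve_active_snapshot` is hypothetical.

```python
import json
from pathlib import Path


def resolve_active_snapshot(profile: str, data_dir: Path) -> Path:
    """Return the snapshot directory that the given embedding profile points at.

    Assumption: data/runtime/active_indexes.json maps profile names to
    snapshot ids; the real file layout may differ.
    """
    pointer_file = data_dir / "runtime" / "active_indexes.json"
    pointers = json.loads(pointer_file.read_text(encoding="utf-8"))
    if profile not in pointers:
        raise KeyError(f"no active index recorded for profile {profile!r}")
    snapshot_dir = data_dir / "indexes" / pointers[profile]
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"snapshot directory missing: {snapshot_dir}")
    return snapshot_dir
```

Because snapshots are immutable, switching a profile to a new index only requires rewriting the pointer file; older snapshot directories stay available for rollback or comparison.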
- Runtime query usage is logged to `data/runtime/usage_log.jsonl` (one JSON line per `/ask` call).
- Each `/ask` response includes `request_id`, which can be used to attach user feedback.
- Feedback endpoint: `POST /feedback` with `{ "request_id": "...", "helpful": true|false }`
- Admin endpoints:
  - `GET /admin/usage?limit=200`
  - `GET /admin/usage/stats?window_days=7`
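
As a rough illustration of the feedback and admin calls listed above, the sketch below prepares the requests with the Python standard library only. The helper names (`build_feedback_request`, `usage_stats_url`, `read_usage_log`) are hypothetical and not part of the project. The base URL is the local default from this README, and any authentication the admin endpoints may require is not covered.

```python
import json
from urllib import parse, request

API_BASE_URL = "http://localhost:8001"  # local default backend URL from this README


def build_feedback_request(request_id: str, helpful: bool,
                           base_url: str = API_BASE_URL) -> request.Request:
    """Prepare a POST /feedback request for a request_id returned by /ask."""
    body = json.dumps({"request_id": request_id, "helpful": helpful}).encode("utf-8")
    return request.Request(
        f"{base_url}/feedback",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def usage_stats_url(window_days: int = 7, base_url: str = API_BASE_URL) -> str:
    """Build the GET /admin/usage/stats URL for the given window."""
    query = parse.urlencode({"window_days": window_days})
    return f"{base_url}/admin/usage/stats?{query}"


def read_usage_log(lines):
    """Parse JSONL input: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]
```

To send the feedback, pass the prepared object to `request.urlopen(...)` while the backend is running. `read_usage_log` accepts any iterable of lines, for example an open handle on `data/runtime/usage_log.jsonl`.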
Run the fixed-question evaluation against a live backend:
uv run python scripts/run_evaluation.py --api-base-url http://127.0.0.1:8001

Artifacts:
- `data/eval/latest_metrics.json`
- `data/eval/latest_metrics.md`
Staged benchmark matrix (single-turn + multi-turn):
uv run python scripts/run_benchmarks.py --api-base-url http://127.0.0.1:8001

Solid-only full matrix benchmark (46 manually accepted gold rows):
uv run python scripts/run_benchmarks.py \
--api-base-url http://127.0.0.1:8001 \
--plan data/eval/benchmark_plan_solid_full_matrix.json \
--single-turn data/eval/questions_solid_v2.json \
--multi-turn data/eval/multi_turn_questions_solid_v2.json \
--gold-set data/eval/gold_set_v2.json \
--judge-model ""