This document explains how to load useful context quickly when changing LocalRAG: where code lives, what to read first, and which project rules apply.
Agents (and humans) move faster when they:
- Start from stable anchors: README, pyproject.toml, .env.example, and architecture.md before opening random modules.
- Route by symptom: ingest bugs → localrag/ingestion/; HTTP contract → localrag/api/routers/; RAG quality → localrag/rag/ and chunk/embed settings.
- Respect the toolchain: Python 3.13+; dependencies and commands go through uv (uv sync, uv run …). See README and .cursor/rules/project-setup.mdc.
- Avoid duplicating rules: non-obvious coding constraints live in .cursor/rules/ (critical rules, Python style, testing, Grug-style preferences). Read those when editing Python, not a second copy here.
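
The "route by symptom" bullet can be read as a small lookup table. The following is a minimal sketch of that idea, not code from the repository: the symptom keys (`ingest`, `http`, `rag-quality`) and the `where_to_look` helper are hypothetical names chosen for illustration, and the fallback to README and architecture.md is an assumption based on the "stable anchors" bullet.

```python
# Hypothetical symptom keys mapped to the locations named in the routing bullet.
SYMPTOM_ROUTES: dict[str, list[str]] = {
    "ingest": ["localrag/ingestion/"],
    "http": ["localrag/api/routers/"],
    "rag-quality": ["localrag/rag/", "localrag/settings.py"],
}

# Assumed fallback: start from the stable anchors when no route matches.
STABLE_ANCHORS: list[str] = ["README", "architecture.md"]


def where_to_look(symptom: str) -> list[str]:
    """Return the locations to read first for a symptom label."""
    key = symptom.strip().lower()
    return SYMPTOM_ROUTES.get(key, STABLE_ANCHORS)
```

For example, `where_to_look("ingest")` points at `localrag/ingestion/`, while an unrecognized symptom falls back to the anchors.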
- README: what LocalRAG does, quick start, API entry command.
- pyproject.toml: dependencies, script entry localrag = localrag.cli.app:app, Ruff/pytest config.
- .env.example: canonical env var names and defaults (mirrors Settings in localrag/settings.py).
- architecture.md: layers, data flow, extension points.
- The specific file(s) for your task (see table below).
| Task | Primary locations |
|---|---|
| Environment / defaults | localrag/settings.py, .env.example |
| FastAPI routes (HTTP only) | localrag/api/routers/*.py |
| API request/response OpenAPI models | localrag/api/schemas.py |
| API use cases (health, ingest rules, query JSON + SSE, collections including rebuild) | localrag/api/service.py |
| API persistence boundary (Chroma collections) | localrag/api/repository.py |
| API app factory (lifespan, middleware, error handlers) | localrag/api/main.py |
| HTTP ingest path validation (INGEST_ROOTS, URL decode) | localrag/api/service.py, localrag/settings.py (is_path_allowed), localrag/api/exceptions.py + main.py handler |
| DI / shared service instances | localrag/api/dependencies.py |
| Log format, levels, request ID | localrag/logging_config.py, localrag/api/middleware.py, LOG_LEVEL in localrag/settings.py |
| API key auth | localrag/api/dependencies.py (require_api_key), API_KEY in localrag/settings.py |
| Prometheus metrics endpoint | localrag/api/routers/metrics.py |
| LLM provider abstraction | localrag/llm/providers/, localrag/llm/factory.py |
| Cost estimation | localrag/llm/costs.py |
| Agent tool-use (search_documents / answer_directly) | localrag/agent/service.py, localrag/api/routers/agent.py |
| Architecture decisions | docs/adr/ |
| CLI commands | localrag/cli/app.py, localrag/cli/commands/*.py |
| Parsing a file type | localrag/ingestion/parsers/, localrag/ingestion/loader.py |
| Chunking strategy and boundaries | localrag/ingestion/structural_chunker.py, localrag/ingestion/chunker.py, localrag/settings.py |
| Embeddings / Ollama HTTP for embed | localrag/ingestion/embedder.py |
| Ingest orchestration | localrag/ingestion/service.py |
| Chroma collection / persist path | localrag/storage/vector_store.py, settings |
| Retrieval mode / hybrid ranking / freshness decay | localrag/rag/retriever.py, localrag/rag/bm25_index.py, localrag/settings.py |
| Ollama HTTP request/response shapes | localrag/ollama/schemas.py (used by embedder, RAG engine, health, setup) |
| Prompt / answer streaming | localrag/rag/prompt.py, localrag/rag/engine.py |
| Human Ollama install (not Python) | ollama.md |
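
The HTTP ingest path validation row mentions INGEST_ROOTS and URL decoding. As a rough illustration of what such a check usually involves, here is a minimal sketch under stated assumptions: the function name `is_path_allowed_sketch` and its signature are hypothetical, and the actual `is_path_allowed` in localrag/settings.py may differ in both shape and behavior. The sketch decodes percent-escapes first, then resolves the path so that `..` segments cannot escape an allowed root.

```python
from pathlib import Path
from urllib.parse import unquote


def is_path_allowed_sketch(raw_path: str, ingest_roots: list[Path]) -> bool:
    """Return True if raw_path, once URL-decoded and resolved, sits under a root.

    Hypothetical helper for illustration only; not the project's implementation.
    """
    # Decode percent-escapes before resolving, so "%2E%2E" is treated as "..".
    candidate = Path(unquote(raw_path)).resolve()
    # Resolve each root as well so both sides are compared in normalized form.
    return any(candidate.is_relative_to(root.resolve()) for root in ingest_roots)
```

Decoding before resolving matters: checking the raw string first would let an encoded traversal such as `%2E%2E` slip past a naive prefix comparison.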
uv sync
uv run localrag --help
uv run pytest
uv run ruff format .
uv run ruff check .

Pre-commit and contribution workflow: .github/CONTRIBUTING.md.
- Ollama runs outside the repo (CLI or Docker). LocalRAG talks to it over HTTP using OLLAMA_BASE_URL and model env vars.
- Chroma data lives on the local filesystem under CHROMA_PERSIST_PATH (see .env.example).
Update agent-navigation.md if you add major entry points, move packages, or change the “read order” anchors. Update architecture.md if layers, routers, schemas/services/repositories, or ingest/RAG flow change materially. See AGENTS.md for the full “documentation maintenance for agents” rule.