AI-powered municipal budget intelligence platform. Upload a government budget PDF, ask plain-language questions, and get cited answers with interactive analytics.
Retry policy: 3 attempts, exponential backoff (5 s initial). Failed jobs surface in Bull Board at /queues.
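The retry schedule implied by that policy can be sketched as follows. This is an illustration only: `retryDelaysMs` is a hypothetical helper, and it assumes the delay doubles after each failure starting from the 5 s initial value. The exact multiplier depends on the queue library in use.

```typescript
// Hypothetical helper (not part of the project): lists the wait before each
// retry for an exponential backoff policy where the delay doubles every time.
function retryDelaysMs(attempts: number, initialDelayMs: number): number[] {
  const delays: number[] = [];
  // The first attempt runs immediately, so only (attempts - 1) retries wait.
  for (let retry = 0; retry < attempts - 1; retry++) {
    delays.push(initialDelayMs * 2 ** retry);
  }
  return delays;
}

// 3 attempts with a 5 s initial delay: wait 5 s, then 10 s, then give up.
console.log(retryDelaysMs(3, 5000)); // [ 5000, 10000 ]
```

Under these assumptions a job that keeps failing is marked as failed roughly 15 s after its first attempt, plus processing time.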
RAG System
The RAG pipeline runs on POST /rag/query and returns a cited Markdown answer plus structured output (chart type, KPIs, anomalies).
Search Strategy
User Question
│
├─► Embed question → vector(768)
│
├─► Vector search (cosine similarity ≥ 0.2, top 20)
│
├─► Full-Text search (PostgreSQL tsvector, top 20)
│
└─► RRF merge (Reciprocal Rank Fusion) → ranked top 15 chunks
│
▼
LLM (Gemini / Ollama)
System: government budget analyst
Context: top 15 chunks + last 6 conversation messages
Output: Markdown answer + JSON { chartType, confidence, kpis[], anomalies[] }
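The RRF merge step in the diagram above can be sketched as below. This is a minimal sketch, not the project's implementation: `rrfMerge` is a hypothetical name, chunk IDs are plain strings, and the smoothing constant `k = 60` is a commonly used default that is assumed here rather than taken from the source.

```typescript
// Reciprocal Rank Fusion: score(chunk) = sum over result lists of 1 / (k + rank),
// where rank starts at 1. k = 60 is an assumed, commonly used default.
function rrfMerge(lists: string[][], topN: number, k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .slice(0, topN)
    .map(([id]) => id);
}

// Illustrative chunk IDs only.
const vectorHits = ["c3", "c1", "c7"];
const fullTextHits = ["c1", "c9", "c3"];
console.log(rrfMerge([vectorHits, fullTextHits], 15)); // [ 'c1', 'c3', 'c9', 'c7' ]
```

Chunks that appear in both result lists accumulate two contributions, so they tend to outrank chunks found by only one search method.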
Caching
RAG responses are cached in Redis (RAG_CACHE_TTL_SECONDS, default 3600 s). The cache key is derived from the query embedding so semantically identical questions hit the same cache entry.
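One way such an embedding-derived key could be built is sketched below. Everything here is an assumption for illustration: the `ragCacheKey` name, the `rag:cache:` prefix, the rounding to three decimals, and the use of SHA-256 are not taken from the source. Rounding makes embeddings that differ only by small numeric noise map to the same key; questions whose embeddings differ by more than that still produce different keys.

```typescript
import { createHash } from "node:crypto";

// Hypothetical key builder: rounds each embedding component so that tiny
// numeric differences do not change the key, then hashes the rounded vector.
function ragCacheKey(embedding: number[], decimals = 3): string {
  const scale = 10 ** decimals;
  const rounded = embedding.map((x) => Math.round(x * scale) / scale).join(",");
  const digest = createHash("sha256").update(rounded).digest("hex");
  return `rag:cache:${digest}`; // prefix is an assumed naming convention
}

// Two nearly identical embeddings share a key; a different one does not.
console.log(ragCacheKey([0.12341, -0.5]) === ragCacheKey([0.12339, -0.5])); // true
console.log(ragCacheKey([0.12341, -0.5]) === ragCacheKey([0.2, -0.5])); // false
```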
Multilingual
The default OCR language packs cover 7 languages: English, Hindi, Gujarati, Marathi, Tamil, Telugu, and Bengali. Query language is auto-detected; the LLM responds in the same language as the question.
API Overview
All endpoints share the prefix /api/v1. Every response is wrapped by ResponseInterceptor.
# 1. Clone and install
git clone <repo-url>
cd budget-lens-backend
npm install
# 2. Environment
cp .env.example .env
# Edit .env: set LLM_PROVIDER, GEMINI_API_KEY or OLLAMA_BASE_URL
# 3. Start infrastructure
docker compose up -d
# Starts: PostgreSQL, Redis, Ollama, Prometheus, Grafana, exporters
# 4. Run migrations
npm run db:migration:run
# 5. (Optional) Download OCR language packs locally
npm run tessdata:download
# 6. Start API
npm run start:dev
The API will be available at http://localhost:3001/api/v1.
Environment Variables
Copy .env.example to .env and fill in the required values. Every variable below maps 1-to-1 with a key in .env.example.
App
| Variable | Default | Description |
| --- | --- | --- |
| PORT | 3001 | API listen port |
| NODE_ENV | development | development or production |
| APP_NAME | Nest Starter | Application display name |
| APP_PREFIX | nest-starter | Internal prefix used in logs |
| APP_URL | http://localhost:3001 | Public base URL of this API |
| FALLBACK_LANGUAGE | en | i18n fallback locale |
| SENTRY_DSN | (none) | Sentry error-reporting DSN (optional) |
| CRON_ENABLED | true | Enable scheduled cron tasks |
CORS & Cookie
| Variable | Default | Description |
| --- | --- | --- |
| CORS_ALLOWED_ORIGINS | http://localhost:5173 | Comma-separated list of allowed origins |
| CORS_ALLOW_NON_BROWSER_CLIENTS | true | Allow API clients without an Origin header |
| COOKIE_SECURE | false | Set true in production (HTTPS only) |
| FRONTEND_URL | http://localhost:5173 | Frontend URL used in password-reset emails |
Database
| Variable | Default | Description |
| --- | --- | --- |
| DATABASE_URL | (none) | Full PostgreSQL connection string (overrides individual fields) |
| DATABASE_USER | postgres | Database user |
| DATABASE_PASSWORD | postgres | Database password |
| DATABASE_NAME | nest_db | Database name |
| DATABASE_PORT | 5432 | PostgreSQL port |
| DB_POOL_MAX | 10 | TypeORM connection pool maximum |
| DB_POOL_MIN | 2 | TypeORM connection pool minimum |
Redis
| Variable | Default | Description |
| --- | --- | --- |
| REDIS_HOST | localhost | Redis host |
| REDIS_PORT | 6379 | Redis port |
| REDIS_PASSWORD | (none) | Redis password (leave empty for no auth) |
| REDIS_TLS | false | Enable TLS for Redis connection |
JWT
| Variable | Default | Description |
| --- | --- | --- |
| JWT_ACCESS_SECRET_KEY | (none) | Required. Access token signing secret |
| JWT_REFRESH_SECRET_KEY | (none) | Required. Refresh token signing secret |
| JWT_ACCESS_TOKEN_EXPIRE | 1d | Access token TTL |
| JWT_REFRESH_TOKEN_EXPIRE | 7d | Refresh token TTL |
Swagger
| Variable | Default | Description |
| --- | --- | --- |
| SWAGGER_USERNAME | admin | HTTP Basic username for Swagger UI |
| SWAGGER_PASSWORD | admin123 | HTTP Basic password for Swagger UI |
Throttling
| Variable | Default | Description |
| --- | --- | --- |
| THROTTLE_TTL | 60000 | Rate-limit window in milliseconds |
| THROTTLE_LIMIT | 100 | Max requests per window per IP |
LLM Provider
| Variable | Default | Description |
| --- | --- | --- |
| LLM_PROVIDER | ollama | ollama (local) or gemini (cloud) |
Switching embedding providers requires re-ingesting all documents to rebuild the vector index.
Ollama (used when LLM_PROVIDER=ollama):
| Variable | Default | Description |
| --- | --- | --- |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama API base URL |
| OLLAMA_LLM_MODEL | gemma4:e4b | Ollama model for text generation |
| OLLAMA_EMBEDDING_MODEL | nomic-embed-text | Ollama model for embeddings |
| OLLAMA_EMBEDDING_DIMENSIONS | 768 | Embedding vector dimensions |
| OLLAMA_REQUEST_TIMEOUT | 300000 | Request timeout in milliseconds |
Google Gemini (used when LLM_PROVIDER=gemini):
| Variable | Default | Description |
| --- | --- | --- |
| GEMINI_API_KEY | (none) | Required when LLM_PROVIDER=gemini |
| GEMINI_LLM_MODEL | gemini-2.0-flash | Gemini model for text generation |
| GEMINI_EMBEDDING_MODEL | gemini-embedding-002 | Gemini model for embeddings (768-dim) |
| GEMINI_EMBEDDING_DIMENSIONS | 768 | Embedding vector dimensions |
| GEMINI_REQUEST_TIMEOUT | 120000 | Request timeout in milliseconds |
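The provider switch described above could be resolved at startup along these lines. This is a sketch under assumptions: `resolveLlmConfig` and the `LlmConfig` shape are hypothetical, and only the variable names and defaults come from the tables above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape, for illustration only.
interface LlmConfig {
  provider: "ollama" | "gemini";
  model: string;
  embeddingModel: string;
  timeoutMs: number;
}

// Applies the documented defaults and fails fast when the Gemini key is missing.
function resolveLlmConfig(env: Env): LlmConfig {
  const provider = env.LLM_PROVIDER ?? "ollama";
  if (provider === "gemini") {
    if (!env.GEMINI_API_KEY) {
      throw new Error("GEMINI_API_KEY is required when LLM_PROVIDER=gemini");
    }
    return {
      provider,
      model: env.GEMINI_LLM_MODEL ?? "gemini-2.0-flash",
      embeddingModel: env.GEMINI_EMBEDDING_MODEL ?? "gemini-embedding-002",
      timeoutMs: Number(env.GEMINI_REQUEST_TIMEOUT ?? 120000),
    };
  }
  if (provider === "ollama") {
    return {
      provider,
      model: env.OLLAMA_LLM_MODEL ?? "gemma4:e4b",
      embeddingModel: env.OLLAMA_EMBEDDING_MODEL ?? "nomic-embed-text",
      timeoutMs: Number(env.OLLAMA_REQUEST_TIMEOUT ?? 300000),
    };
  }
  throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
}

console.log(resolveLlmConfig({}).provider); // ollama
```

Validating at startup surfaces a missing API key immediately instead of on the first query.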
RAG Retrieval
| Variable | Default | Description |
| --- | --- | --- |
| RAG_CACHE_ENABLED | true | Cache RAG responses in Redis |
| RAG_CACHE_TTL_SECONDS | 3600 | RAG cache TTL in seconds |
| RAG_MIN_SIMILARITY | 0.2 | Minimum cosine similarity for vector-only search |
| RAG_HYBRID_SEARCH | true | Enable RRF hybrid (vector + full-text) search |
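To make the RAG_MIN_SIMILARITY threshold concrete, here is a small in-memory sketch of cosine-similarity filtering. It is illustrative only: the `Chunk` shape and function names are hypothetical, and in a real deployment this comparison would more likely run inside the database than in application code.

```typescript
// Hypothetical chunk shape, for illustration only.
interface Chunk {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Keeps chunks at or above the threshold, best first, capped at topK.
function filterBySimilarity(
  query: number[],
  chunks: Chunk[],
  minSimilarity = 0.2,
  topK = 20,
): { id: string; score: number }[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter((hit) => hit.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const sampleHits = filterBySimilarity([1, 0], [
  { id: "a", embedding: [1, 0] },
  { id: "b", embedding: [0, 1] }, // orthogonal to the query, similarity 0
  { id: "c", embedding: [1, 1] },
]);
console.log(sampleHits.map((h) => h.id)); // [ 'a', 'c' ]
```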
OCR
| Variable | Default | Description |
| --- | --- | --- |
| OCR_ENABLED | true | Enable Tesseract OCR for scanned PDFs |
| OCR_LANGUAGES | eng+hin+guj+mar+tam+tel+ben | +-separated Tesseract language packs |
| OCR_MIN_CHARS_PER_PAGE | 50 | Pages below this char count trigger OCR |
| OCR_RENDER_SCALE | 2 | Scale factor for page-to-image rendering |
| OCR_CACHE_PATH | .tesseract-cache | Directory for cached tessdata files |
| OCR_LANG_PATH | .tesseract-cache | Path Tesseract reads language data from (set after npm run tessdata:download) |
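The OCR_MIN_CHARS_PER_PAGE rule can be pictured with the sketch below. The function name is hypothetical, and counting characters after trimming whitespace is an assumption; the source only states that pages below the threshold trigger OCR.

```typescript
// Hypothetical helper: returns the 1-based numbers of pages whose extracted
// text is shorter than the threshold and would therefore be sent to OCR.
function pagesNeedingOcr(pageTexts: string[], minChars = 50): number[] {
  const pages: number[] = [];
  pageTexts.forEach((text, index) => {
    // Assumption: surrounding whitespace does not count towards the total.
    if (text.trim().length < minChars) {
      pages.push(index + 1);
    }
  });
  return pages;
}

const extracted = [
  "A".repeat(120), // digital page with a real text layer
  "   ",           // scanned page: extraction returned only whitespace
  "Total 42",      // near-empty page, below the threshold
];
console.log(pagesNeedingOcr(extracted)); // [ 2, 3 ]
```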
File Upload
| Variable | Default | Description |
| --- | --- | --- |
| UPLOAD_DEST | ./uploads | Local directory for uploaded PDFs |
| UPLOAD_MAX_FILE_SIZE_MB | 100 | Maximum size per uploaded file (MB) |
| UPLOAD_MAX_FILES | 20 | Maximum number of files per request |
Monitoring
| Variable | Default | Description |
| --- | --- | --- |
| PROMETHEUS_PORT | 9090 | Prometheus container port (Docker Compose) |
| GRAFANA_PORT | 3002 | Grafana container port (Docker Compose) |
| POSTGRES_EXPORTER_PORT | 9187 | postgres_exporter scrape port |
| REDIS_EXPORTER_PORT | 9121 | redis_exporter scrape port |
| GRAFANA_ADMIN_USER | admin | Grafana admin username |
| GRAFANA_ADMIN_PASSWORD | admin | Grafana admin password |
Available Commands
# Development
npm run start:dev # Watch mode with hot reload
npm run start # Normal start
# Build
npm run build # Compile TypeScript → dist/
npm run start:prod # Run compiled build
# Database
npm run db:migration:generate # Auto-generate migration from entity changes
npm run db:migration:create # Create empty migration file
npm run db:migration:run # Apply all pending migrations
npm run db:migration:revert # Rollback last migration
npm run db:reset # Drop + recreate schema (destructive, dev only)
# Testing
npm run test # Unit tests (Jest)
npm run test:e2e # End-to-end tests
npm run test:cov # Coverage report
# Code quality
npm run lint # ESLint with auto-fix
# Utilities
npm run tessdata:download # Download Tesseract OCR language packs
Local URLs
| Service | URL | Notes |
| --- | --- | --- |
| API Base | http://localhost:3001/api/v1 | All REST endpoints |
| Swagger UI | http://localhost:3001/swagger | Interactive API docs |
| Bull Board | http://localhost:3001/queues | Job queue dashboard |
| Raw Metrics | http://localhost:3001/api/v1/metrics | Prometheus scrape target |
| Prometheus | http://localhost:9090 | Metrics query UI |
| Grafana | http://localhost:3002 | Dashboards (admin / admin) |
| PostgreSQL | localhost:5432 | DB: nest_db, user: postgres |
| Redis | localhost:6379 | No auth in dev |
| Ollama | http://localhost:11434 | Local LLM API |
Future Scope
RAG Quality Improvements
| Improvement | Description |
| --- | --- |
| Re-ranking with cross-encoder | Add a cross-encoder pass (e.g. ms-marco-MiniLM) after RRF to re-score the top-15 chunks by semantic relevance to the query before sending them to the LLM; this reduces hallucination from weakly matched chunks |
| Query expansion / HyDE | Generate a hypothetical answer (Hypothetical Document Embedding) and use its embedding for retrieval, improving recall on vague or short queries |
| Adaptive chunking | Replace fixed 512-token windows with semantic chunking (split on section boundaries and heading changes) for budget PDFs where table rows and headings carry structural meaning |
| Contextual chunk headers | Prepend each chunk with its section path (FY 2024 > Department: Health > Expenditure) before embedding, so vector search captures document structure alongside content |
| MMR diversity | Apply Maximal Marginal Relevance when selecting final context chunks to reduce redundancy and maximise coverage of distinct budget facets in one answer |
| Long-context summarisation | For queries that span an entire budget year, use a map-reduce pattern: summarise each chunk group independently, then synthesise the results. This avoids context window overflow on large PDFs |
| Structured output validation | Validate LLM-returned JSON (chartType, kpis[], anomalies[]) against a Zod schema; retry with a corrective prompt on schema mismatch instead of silently dropping structured output |
| Embedding model fine-tuning | Fine-tune the embedding model on a domain corpus of Indian government budget documents to improve similarity scores for financial terminology and abbreviations (CAG, ULB, CSS, MPLAD) |
| Personalised retrieval context | Incorporate user profile (city, profession) into the retrieval prompt so a ward councillor and a journalist asking the same question receive differently scoped answers |
| Anomaly-aware retrieval | Post-retrieval, run a lightweight statistical pass (z-score on budget variances) over retrieved chunks before LLM generation; inject flagged anomalies as additional context |
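The z-score pass mentioned for anomaly-aware retrieval could look roughly like this. It is a sketch, not a planned implementation: `zScoreOutliers` is a hypothetical name, the population standard deviation is used, and the threshold of 2 is an assumed default. With very small samples the largest possible z-score is bounded, so a stricter threshold may never fire.

```typescript
// Hypothetical helper: returns the indices of values whose z-score magnitude
// exceeds the threshold, using the population standard deviation.
function zScoreOutliers(values: number[], threshold = 2): number[] {
  const n = values.length;
  if (n < 2) return [];
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const std = Math.sqrt(variance);
  if (std === 0) return []; // all values identical, nothing to flag
  const flagged: number[] = [];
  values.forEach((v, i) => {
    if (Math.abs(v - mean) / std > threshold) flagged.push(i);
  });
  return flagged;
}

// Illustrative year-on-year variances for seven line items; the last one spikes.
const variances = [100, 102, 98, 101, 99, 100, 500];
console.log(zScoreOutliers(variances)); // [ 6 ]
```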
User Feedback Loop for Prompt Improvement
Closing the feedback loop between real user interactions and prompt quality is the single highest-leverage improvement after initial launch.
| Feature | Description |
| --- | --- |
| Per-answer thumbs up / down | Capture explicit signal on every RAG response (chat_messages.feedbackRating). Store alongside the query, retrieved chunk IDs, and LLM prompt used |
| Freeform correction input | Allow the user to type the correct answer when they mark a response as wrong. Store as a correction field tied to the original response |
| Feedback-weighted prompt selection | Build a prompt template registry (prompt_templates table). A/B test prompt variants across users; compute win-rate per template from feedback signals; promote the highest-rated template as the default |
| Automatic low-confidence flagging | When extractionConfidence < 0.7 or the LLM returns "I don't know", auto-flag the exchange for human review rather than silently delivering a poor answer |
| Feedback analytics dashboard | Admin endpoint that surfaces answer acceptance rate by query type, most-corrected question patterns, and low-confidence document clusters, giving product insight into where the pipeline breaks |
| RLHF-style prompt iteration | Periodically export rated (query, answer, rating) triples and use them to refine the system prompt in a supervised way, moving from static prompts to data-driven prompt evolution |
| Retrieval feedback attribution | When a user thumbs-down a response, record which chunks were retrieved. Over time, chunks with high association to low-rated answers can be flagged for re-extraction or re-embedding |
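The win-rate computation behind feedback-weighted prompt selection might be sketched as follows. The `Feedback` shape, the `"up"` / `"down"` rating values, and both function names are assumptions for illustration; the source does not specify how ratings are stored.

```typescript
// Hypothetical feedback record; the real column types are not specified.
interface Feedback {
  templateId: string;
  rating: "up" | "down";
}

// Win rate per template = thumbs-up count / total ratings for that template.
function winRates(feedback: Feedback[]): Map<string, number> {
  const counts = new Map<string, { up: number; total: number }>();
  for (const { templateId, rating } of feedback) {
    const entry = counts.get(templateId) ?? { up: 0, total: 0 };
    entry.total += 1;
    if (rating === "up") entry.up += 1;
    counts.set(templateId, entry);
  }
  const rates = new Map<string, number>();
  counts.forEach((entry, id) => rates.set(id, entry.up / entry.total));
  return rates;
}

// Picks the template with the highest win rate (first one wins on ties).
function bestTemplate(feedback: Feedback[]): string | undefined {
  let best: string | undefined;
  let bestRate = -1;
  winRates(feedback).forEach((rate, id) => {
    if (rate > bestRate) {
      best = id;
      bestRate = rate;
    }
  });
  return best;
}
```

A production version would probably also require a minimum number of ratings per template before promoting it, so that a single thumbs-up cannot decide the default.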
Other Planned Improvements
| Area | Improvement |
| --- | --- |
| Automated anomaly detection | Statistical outlier flagging (z-score, IQR) on budget line items; surface year-on-year spending spikes automatically without a user query |
| Multi-year trend comparison | Align budget records across fiscal years by department and category for long-range trend charts |
| Ward-level report card | Auto-generate a shareable one-page PDF report card comparing a ward's allocation to the city average |
| On-demand URL ingestion | Accept a public URL (POST /documents/from-url) to download and ingest a PDF directly from government portals (e.g. Open Budgets India, city corporation websites) without requiring the user to download and re-upload; includes SSRF protection and an optional domain allowlist |
| Live government portal ingestion | Scheduled crawler to automatically discover and ingest new budget publications from Open Budgets India and city corporation portals |
| Role-based dashboard views | Separate dashboard presets for citizens (simplified), journalists (anomaly-focused), and auditors (full line items) |
| Multi-city comparative analysis | Cross-document RAG queries that compare spending across multiple municipalities in a single answer |
| Webhook notifications | Notify users when their document finishes processing via webhook or email |
| Mobile-optimised API responses | Lighter response payloads and pagination defaults tuned for mobile clients |