Transform fragmented regulatory PDFs into a searchable, intelligent, context-grounded policy assistant using semantic embeddings, FAISS retrieval, and Google Gemini reasoning.
RAG-Based Policy Agent is a production-ready platform that converts PDF-based policies, compliance rules, and regulatory frameworks into a real-time policy intelligence engine.
It enables policy teams to instantly search, retrieve, and reason over internal documents, with grounded, traceable, and compliant answers powered by RAG.
Designed for:
- Compliance & Audit
- Legal & Governance
- Corporate Policy Teams
- Internal Risk & Regulatory Operations
Organizations face challenges with:
- Thousands of pages of policy documents
- Manual search through static PDFs
- Slow compliance decision-making
- Risk of misinterpretation or oversight
This platform provides:
- Real-time policy intelligence
- Explainable, context-grounded LLM answers
- High-precision document extraction
- Enterprise-level consistency & reliability
srujanrana07-rag-based-policy-agent/
│
├── README.md
├── LICENSE
│
├── app.py                 # Flask REST API server
├── main.py                # Core RAG workflow runner
├── deploy.sh              # Deployment script
│
├── document_loader.py     # PDF/table loader & preprocessing
├── vectorizer.py          # Sentence Transformer embeddings
├── retriever.py           # FAISS vector store operations
├── gpt_client.py          # Gemini LLM wrapper
├── submitter.py           # Batch inference + evaluation
│
├── answers_output.json    # Output results
├── questions.json         # Test questions dataset
├── temp.json              # Intermediate processing store
│
├── requirements.txt
└── log.txt
- High-fidelity extraction via GhostScript
- Accurate table detection through Camelot-py
- Clean preprocessing for long-form policies
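
The preprocessing step is not described in detail here. As a rough sketch, assuming the extracted PDF text arrives with hard line wraps and words hyphenated across line breaks, a cleanup pass might look like the following. The helper name `clean_policy_text` is hypothetical and is not taken from `document_loader.py`.

```python
import re


def clean_policy_text(raw: str) -> str:
    """Normalize text extracted from a PDF (hypothetical helper)."""
    # Re-join words hyphenated across a line break: "reten-\ntion" -> "retention"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Treat blank lines as paragraph breaks and unwrap single newlines
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

One trade-off of this simple rule: a genuine compound that happens to break at its hyphen (such as "context-" followed by "grounded") would also be joined, so a real pipeline may want a dictionary check before merging.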
- Sentence Transformer embeddings
- Chunking + semantic scoring
- Stored and indexed for fast retrieval
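
The chunking strategy is not specified here. A minimal sketch, assuming fixed-size word windows with overlap (the function name and the default sizes are illustrative, not values from `vectorizer.py`):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word-based chunks."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Each chunk would then be embedded, for example with the `encode` method of a Sentence Transformers model. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.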
- Millisecond-level semantic retrieval
- Top-k contextual selection
- Deterministic policies for safe RAG
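
To illustrate top-k selection with deterministic tie-breaking, here is a brute-force sketch in plain Python. It is a stand-in for the scores a flat inner-product FAISS index produces over normalized vectors, not the code in `retriever.py`; the function name `top_k` is hypothetical.

```python
import math


def top_k(query: list[float], vectors: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Return (index, cosine similarity) pairs for the k most similar vectors."""
    def normalize(v: list[float]) -> list[float]:
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm else v

    q = normalize(query)
    scored = []
    for i, vec in enumerate(vectors):
        # Dot product of unit vectors equals cosine similarity
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((i, score))
    # Descending score, then ascending index, so ties resolve the same way on every run
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]
```

Sorting on the index as a secondary key is one simple way to make retrieval repeatable when two chunks score identically.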
- Gemini LLM integrated through gpt_client.py
- Strict grounding in FAISS-retrieved context
- Answers restricted to the retrieved policy text to limit hallucinations
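
Grounding is typically enforced in the prompt itself. The sketch below shows one way to assemble such a prompt from retrieved chunks; the wording and the function name `build_grounded_prompt` are assumptions, not the contents of `gpt_client.py`.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    # Number each chunk so the answer can cite where it came from
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "Cite the bracketed chunk numbers you relied on. "
        "If the context does not contain the answer, reply: "
        '"Not found in the provided policy documents."\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The numbered chunks give each answer a traceable source, and the explicit fallback sentence gives the model a way to decline instead of guessing. A prompt like this reduces ungrounded answers but cannot rule them out on its own.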
- Upload documents
- Extract structured content
- Ask policy questions
- Plug into dashboards, portals, BI tools
git clone https://github.com/srujanrana07/rag-based-policy-agent
cd rag-based-policy-agent
python -m venv venv
source venv/bin/activate
# Windows: venv\Scripts\activate
pip install -r requirements.txt

Create a .env file:
GEMINI_API_KEY=your_api_key
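
How the application reads this file is not shown here. Many projects use the `python-dotenv` package for it; whether this repository does is not stated. As a dependency-free sketch, a small loader could look like this (the function name `load_env_file` is hypothetical):

```python
import os


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file and export them to os.environ."""
    loaded = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, value = line.split("=", 1)
            loaded[key.strip()] = value.strip().strip('"').strip("'")
    for key, value in loaded.items():
        # Variables already set in the real environment take precedence
        os.environ.setdefault(key, value)
    return loaded
```

After calling `load_env_file()`, the key would be available as `os.environ["GEMINI_API_KEY"]`. Keep the `.env` file out of version control.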
python app.py          # start the Flask REST API server
python main.py         # run the core RAG workflow
python submitter.py    # run batch inference + evaluation

| Method | Endpoint | Description |
|---|---|---|
| POST | /upload | Upload a policy PDF |
| GET | /extract | Extract text + tables |
| POST | /query | Ask a question using RAG |
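
The request and response formats of these endpoints are not documented here. The sketch below builds a `POST /query` request with the standard library, assuming a JSON body with a `question` field (a hypothetical field name) and a server on port 5000 (taken from the `docker run` port mapping).

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: port from the docker run mapping


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /query request with a JSON body."""
    # "question" is a hypothetical field name; check app.py for the real schema
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(build_query_request("What is the data retention period?"))` and the response body read from the returned object.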
- Sanitized PDF ingestion
- Deterministic FAISS retrieval
- Strictly grounded LLM answers
- Full logging in log.txt
- Cloud-ready and container-friendly
docker build -t rag-policy-agent .
docker run -p 5000:5000 rag-policy-agent
./deploy.sh

Supports deployment on:
- AWS (ECS, EC2, Lambda)
- GCP (Cloud Run)
- Azure App Service
- Railway / Render
Pull requests and improvements are welcome.
MIT License: free for personal and commercial use.

