PolicyRAG: a RAG-based policy agent that guides users to the policy that best fits their requirements, without having to read through the documents themselves.

Srujanrana07/RAG-Based-policy-agent

🔍 RAG-Based Policy Agent

Enterprise-Grade Policy Intelligence Platform Powered by Retrieval-Augmented Generation (RAG)

Transform fragmented regulatory PDFs into a searchable, intelligent, context-grounded policy assistant using semantic embeddings, FAISS retrieval, and Google Gemini reasoning.


📘 Overview

RAG-Based Policy Agent is a production-ready platform that converts PDF-based policies, compliance rules, and regulatory frameworks into a real-time policy intelligence engine.

It enables policy teams to instantly search, retrieve, and reason over internal documents, with grounded, traceable, and compliant answers powered by RAG.

Designed for:

  • 🛡 Compliance & Audit
  • ⚖ Legal & Governance
  • 🏢 Corporate Policy Teams
  • 🔍 Internal Risk & Regulatory Operations

🧠 Why This Product Matters

Organizations face challenges with:

  • Thousands of pages of policy documents
  • Manual search through static PDFs
  • Slow compliance decision-making
  • Risk of misinterpretation or oversight

This platform provides:

✓ Real-time policy intelligence
✓ Explainable, context-grounded LLM answers
✓ High-precision document extraction
✓ Enterprise-level consistency & reliability


🗂 Project Structure

srujanrana07-rag-based-policy-agent/
│
├── README.md
├── LICENSE
│
├── app.py                    # Flask REST API server
├── main.py                   # Core RAG workflow runner
├── deploy.sh                 # Deployment script
│
├── document_loader.py        # PDF/table loader & preprocessing
├── vectorizer.py             # Sentence Transformer embeddings
├── retriever.py              # FAISS vector store operations
├── gpt_client.py             # Gemini LLM wrapper
├── submitter.py              # Batch inference + evaluation
│
├── answers_output.json       # Output results
├── questions.json            # Test questions dataset
├── temp.json                 # Intermediate processing store
│
├── requirements.txt
└── log.txt

🏗 System Architecture

[Figure: system architecture diagram]


🚀 Key Features

1️⃣ Intelligent PDF Ingestion

  • High-fidelity extraction via Ghostscript
  • Accurate table detection through Camelot-py
  • Clean preprocessing for long-form policies
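
The preprocessing step itself is not shown in this README. As a minimal sketch, assuming the loader receives raw page text with hard line wraps and end-of-line hyphenation, cleanup could look like the following. The function name `clean_page_text` is hypothetical and not taken from `document_loader.py`.

```python
import re


def clean_page_text(raw: str) -> str:
    """Normalize text extracted from one PDF page (hypothetical helper)."""
    # Rejoin words hyphenated across a line break: "reten-\ntion" -> "retention".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Treat blank lines as paragraph breaks and unwrap single newlines.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    print(clean_page_text("Data reten-\ntion policy\napplies  to all.\n\n\nSection 2"))
```

Note that this simple rule also joins words that are genuinely hyphenated at a line end, so a real loader may need a dictionary check or a more careful heuristic.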

2️⃣ Embedding & Vectorization

  • Sentence Transformer embeddings
  • Chunking + semantic scoring
  • Stored and indexed for fast retrieval
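
The chunking strategy is not specified here. One common approach, sketched below under that assumption, is to split text into overlapping word windows so that a sentence cut at a boundary still appears whole in a neighboring chunk. The name `chunk_words` and the default sizes are illustrative, not values taken from `vectorizer.py`.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g", chunk_size=4, overlap=2):
        print(chunk)
```

Each resulting chunk would then be passed to the Sentence Transformer model to obtain one embedding vector per chunk.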

3️⃣ FAISS-Powered Semantic Search

  • Millisecond-level semantic retrieval
  • Top-k contextual selection
  • Deterministic policies for safe RAG
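
To illustrate what top-k retrieval computes, here is a brute-force cosine-similarity search in pure Python. It is a stand-in for explanation only, not FAISS and not the code in `retriever.py`; an exact inner-product index over normalized vectors (for example `faiss.IndexFlatIP`) produces the same ranking far more efficiently. The function name `top_k` is hypothetical.

```python
import math


def top_k(query: list[float], vectors: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Return (index, cosine similarity) pairs for the k most similar vectors."""
    def normalize(v: list[float]) -> list[float]:
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm else v

    q = normalize(query)
    scored = []
    for i, v in enumerate(vectors):
        score = sum(a * b for a, b in zip(q, normalize(v)))
        scored.append((i, score))
    # Sort by score descending, then by index, so ties resolve identically on every run.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    print(top_k([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], k=2))
```

The explicit tie-break on index is one way to keep retrieval deterministic when two chunks score equally.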

4️⃣ Context-Grounded LLM Reasoning

  • Gemini LLM integrated through gpt_client.py
  • Strict grounding in FAISS-retrieved context
  • Answers restricted to retrieved context to minimize hallucinations and keep them verifiable against the source policy
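
Grounding is usually enforced in the prompt that wraps the retrieved chunks. The sketch below shows one possible template; the wording, the refusal sentence, and the function name `build_grounded_prompt` are assumptions and are not taken from `gpt_client.py`.

```python
def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved chunks."""
    # Number the chunks so the answer can cite which passage it relied on.
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "Cite the bracketed chunk numbers you relied on. "
        "If the context does not contain the answer, reply exactly: "
        "'Not found in the provided policy documents.'\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_grounded_prompt("What is the waiting period?", ["Waiting period is 30 days."]))
```

A prompt like this lowers the chance of unsupported answers but cannot rule them out, so the cited chunk numbers are what make an answer checkable.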

5️⃣ Modular, Production-Ready REST API

  • Upload documents
  • Extract structured content
  • Ask policy questions
  • Plug into dashboards, portals, BI tools

⚙ Installation

Clone Repository

git clone https://github.com/srujanrana07/rag-based-policy-agent
cd rag-based-policy-agent

Create Virtual Environment

python -m venv venv
source venv/bin/activate
# Windows: venv\Scripts\activate

Install Dependencies

pip install -r requirements.txt

Configure Environment

Create a .env:

GEMINI_API_KEY=your_api_key
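
How the project reads this file is not shown; it may well rely on a package such as python-dotenv. As a standard-library sketch of the idea, the hypothetical helpers below load `KEY=VALUE` pairs into the environment and fail early when `GEMINI_API_KEY` is missing.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> None:
    """Copy KEY=VALUE pairs from a .env file into os.environ (simplified parser)."""
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Values already set in the real environment take precedence.
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def get_gemini_api_key() -> str:
    """Return the key, or raise with a clear message if it is not configured."""
    key = os.environ.get("GEMINI_API_KEY", "")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or the environment")
    return key
```

Keep the `.env` file out of version control so the key is never committed.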

▶ Usage

Start API Server

python app.py

Run End-to-End RAG Pipeline

python main.py

Run Batch Q&A Evaluation

python submitter.py
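
The file formats used by `submitter.py` are not documented here. The sketch below shows the general shape of a batch run, assuming `questions.json` holds a plain list of question strings and that answers are written as question/answer pairs. Both the schema and the `run_batch` name are hypothetical.

```python
import json
from typing import Callable


def run_batch(questions_path: str, output_path: str, answer_fn: Callable[[str], str]) -> int:
    """Answer every question in a JSON list and write question/answer pairs."""
    with open(questions_path, encoding="utf-8") as f:
        questions = json.load(f)  # assumed shape: ["question 1", "question 2", ...]
    results = [{"question": q, "answer": answer_fn(q)} for q in questions]
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)
    return len(results)
```

Here `answer_fn` stands in for the full retrieve-then-generate pipeline, which makes the loop easy to exercise with a stub before spending API calls.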

🔌 API Endpoints

Method  Endpoint  Description
POST    /upload   Upload a policy PDF
GET     /extract  Extract text + tables
POST    /query    Ask a question using RAG
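
Request and response schemas are not documented in this README. As a sketch of how a client might call `/query`, the example below assumes a JSON body with a `question` field and a JSON response; both are assumptions, as are the helper names. Port 5000 matches the Docker example in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # adjust to wherever app.py is running


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /query request; the 'question' field name is an assumption."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> dict:
    """Send the request and decode a JSON response (the server must be running)."""
    with urllib.request.urlopen(build_query_request(question), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Check `app.py` for the actual field names before relying on this shape.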

🔐 Security & Reliability

  • Sanitized PDF ingestion
  • Deterministic FAISS retrieval
  • Strictly grounded LLM answers
  • Full logging in log.txt
  • Cloud-ready and container-friendly
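
The logging setup is not shown in this README. A minimal sketch using the standard `logging` module to append timestamped records to `log.txt` could look like this. The logger name and the `configure_logging` helper are hypothetical.

```python
import logging


def configure_logging(log_path: str = "log.txt") -> logging.Logger:
    """Return a logger that appends timestamped records to log_path."""
    logger = logging.getLogger("policy_agent")  # logger name is hypothetical
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Logging the question together with the retrieved chunk identifiers for each request is what makes an answer auditable afterwards.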

🐳 Deployment

Docker

docker build -t rag-policy-agent .
docker run -p 5000:5000 rag-policy-agent

Linux (Auto Setup)

./deploy.sh

Supports deployment on:

  • AWS (ECS, EC2, Lambda)
  • GCP (Cloud Run)
  • Azure App Service
  • Railway / Render

🤝 Contributing

Pull requests and improvements are welcome.


📄 License

MIT License: free for personal and commercial use.
