
πŸ€– Context-Aware Research Chatbot

A sophisticated conversational agent that answers domain questions about AI policy using web search, local RAG (Retrieval-Augmented Generation), and mathematical tools with comprehensive source citations.

✨ Features

  • 🧠 Multi-Modal Intelligence: Combines web search, local knowledge base, and mathematical calculations
  • 🎯 Smart Routing: Automatically routes queries to the most appropriate tool based on intent
  • πŸ’¬ Conversational Memory: Maintains context across conversations with session management
  • πŸ“š Source Citations: Provides detailed source attributions for all responses
  • πŸ“Š Comprehensive Evaluation: Built-in evaluation framework for faithfulness and groundedness
  • 🎨 Multiple Interfaces: FastAPI backend, Streamlit UI, and Gradio interface
  • πŸ—οΈ Scalable Architecture: Modular design using LangChain components

πŸš€ Quick Start

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • (Optional) SerpAPI or Tavily API key for web search
  • PDF documents for your domain knowledge base

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/context-aware-research-chatbot.git
cd context-aware-research-chatbot
  2. Install dependencies:
pip install -r requirements.txt
  3. Set up environment variables:
cp .env.example .env
# Edit .env with your API keys
  4. Initialize the project:
python main.py setup
  5. Add your PDF documents:
# Place your PDF files in data/pdfs/
cp your-documents/*.pdf data/pdfs/
  6. Process documents:
python main.py process-pdfs
  7. Test the system:
python main.py test

πŸ–₯️ Usage

Option 1: Streamlit UI (Recommended)

streamlit run simple_demo.py --server.port 8501

Access at: http://localhost:8501

Option 2: FastAPI + Streamlit UI

# Terminal 1: Start API
python main.py start-api

# Terminal 2: Start UI
python main.py start-ui

Option 3: Gradio Interface

python gradio_ui.py

Access at: http://localhost:7860

🎯 Sample Queries

Try these questions with your AI policy dataset:

  • Policy Questions: "What are the key AI safety guidelines?"
  • Regulatory: "How does GDPR apply to AI systems?"
  • Ethics: "What are the ethical considerations for AI deployment?"
  • Math: "Calculate 15% of 250,000"
  • Complex: "How do AI policy frameworks address bias in algorithmic decision-making?"
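
Math queries like the one above are routed to the calculator tool rather than the LLM. A minimal sketch of how such a tool could evaluate arithmetic safely (the `safe_eval` helper is illustrative, not the repository's actual implementation):

```python
import ast
import operator

# Whitelisted operators for safe arithmetic evaluation
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a basic arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

# "Calculate 15% of 250,000" -> "0.15 * 250000"
print(safe_eval("0.15 * 250000"))  # 37500.0
```

Walking the AST with a whitelist avoids handing model-generated strings to `eval()`.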

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streamlit UI  β”‚    β”‚   Gradio UI     β”‚    β”‚   FastAPI       β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚                      β”‚                      β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                 β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚    Context-Aware         β”‚
                    β”‚    Research Chatbot      β”‚
                    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
                    β”‚  β”‚   Query Router      β”‚ β”‚
                    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
                    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
                    β”‚  β”‚      Tools          β”‚ β”‚
                    β”‚  β”‚ β”Œβ”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”     β”‚ β”‚
                    β”‚  β”‚ β”‚ RAG β”‚ β”‚ Web β”‚     β”‚ β”‚
                    β”‚  β”‚ β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜     β”‚ β”‚
                    β”‚  β”‚ β”Œβ”€β”€β”€β”€β”€β”             β”‚ β”‚
                    β”‚  β”‚ β”‚Math β”‚             β”‚ β”‚
                    β”‚  β”‚ β””β”€β”€β”€β”€β”€β”˜             β”‚ β”‚
                    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
                    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
                    β”‚  β”‚   Memory Manager    β”‚ β”‚
                    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                 β”‚
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚      Data Layer          β”‚
                    β”‚ β”Œβ”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β” β”‚
                    β”‚ β”‚FAISSβ”‚ β”‚SQLiteβ”‚ β”‚PDFs β”‚ β”‚
                    β”‚ β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜ β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
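
The Query Router in the diagram decides which tool handles each message. A minimal keyword-based sketch of that decision (the real router in `chatbot.py` may use an LLM-based classifier; the heuristics below are purely illustrative):

```python
import re

def route_query(message: str) -> str:
    """Pick a tool name for a message using simple heuristics (illustrative only)."""
    text = message.lower()
    # Arithmetic-looking queries go to the math tool
    if re.search(r"\bcalculate\b|\d+\s*[%+*/^-]\s*\d*", text):
        return "math"
    # Recency cues suggest a live web search
    if any(word in text for word in ("latest", "today", "recent", "news")):
        return "web_search"
    # Everything else is answered from the local knowledge base
    return "rag"

print(route_query("Calculate 15% of 250,000"))                   # math
print(route_query("What are the latest AI safety guidelines?"))  # web_search
print(route_query("How does GDPR apply to AI systems?"))         # rag
```

A rule-based first pass like this is cheap and predictable; ambiguous cases can be escalated to an LLM classifier.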

πŸ“ Project Structure

context-aware-research-chatbot/
β”œβ”€β”€ README.md
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ .env.example
β”œβ”€β”€ .gitignore
β”œβ”€β”€ main.py                 # Main CLI interface
β”œβ”€β”€ config.py               # Configuration management
β”œβ”€β”€ simple_demo.py          # Simplified Streamlit demo
β”œβ”€β”€ data_processor.py       # PDF processing & vector store
β”œβ”€β”€ tools.py                # Web search, math, RAG tools
β”œβ”€β”€ chatbot.py              # Core chatbot logic
β”œβ”€β”€ database.py             # Database models & management
β”œβ”€β”€ api.py                  # FastAPI backend
β”œβ”€β”€ streamlit_ui.py         # Streamlit frontend
β”œβ”€β”€ gradio_ui.py            # Gradio frontend
β”œβ”€β”€ evaluation.py           # Evaluation framework
└── data/                   # Data directory
    β”œβ”€β”€ pdfs/               # Place your PDF files here
    β”œβ”€β”€ vector_store/       # Generated vector store
    └── eval_dataset.json   # Evaluation dataset

βš™οΈ Configuration

Key configuration options in .env:

# Required
OPENAI_API_KEY=your_key_here

# Optional - for web search
SERPAPI_API_KEY=your_serpapi_key
TAVILY_API_KEY=your_tavily_key

# Model settings
LLM_MODEL=gpt-3.5-turbo
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# Data settings
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K_RETRIEVAL=5
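
`config.py` presumably reads these keys from the environment; a hedged sketch of how such settings might be loaded with typed defaults (the `Settings` dataclass itself is illustrative, but the key names and defaults match the `.env` sample above):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    """Typed view over the .env keys listed above (illustrative)."""
    openai_api_key: str
    llm_model: str
    chunk_size: int
    chunk_overlap: int
    top_k_retrieval: int

def load_settings() -> Settings:
    # os.getenv falls back to the documented defaults when a key is unset
    return Settings(
        openai_api_key=os.environ["OPENAI_API_KEY"],  # required: fail fast if missing
        llm_model=os.getenv("LLM_MODEL", "gpt-3.5-turbo"),
        chunk_size=int(os.getenv("CHUNK_SIZE", "1000")),
        chunk_overlap=int(os.getenv("CHUNK_OVERLAP", "200")),
        top_k_retrieval=int(os.getenv("TOP_K_RETRIEVAL", "5")),
    )
```

Failing fast on the required key surfaces a missing API key at startup instead of mid-conversation.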

πŸ“Š Evaluation

Run comprehensive evaluation:

python main.py eval

The system evaluates responses on:

  • Faithfulness: Accuracy to source material
  • Relevance: Response relevance to questions
  • Tool Routing: Correct tool selection
  • Source Quality: Citation accuracy

πŸ§ͺ Testing

Run the test suite:

python main.py test

πŸ”§ API Usage

import requests

# Create session
response = requests.post("http://localhost:8000/sessions", 
                        json={"user_id": "your_user_id"})
session_id = response.json()["session_id"]

# Chat
response = requests.post("http://localhost:8000/chat", json={
    "message": "What are the latest AI safety guidelines?",
    "session_id": session_id
})

result = response.json()
print(f"Response: {result['response']}")
print(f"Tool used: {result['tool_used']}")
print(f"Sources: {result['sources']}")

🎨 Customization

Adding New Tools

  1. Create tool class in tools.py
  2. Update router logic
  3. Integrate in chatbot.py
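
To make step 1 concrete, here is a minimal sketch of what a new tool could look like (the `Tool` base class and the `run` signature are assumptions for illustration, not the actual interface in tools.py):

```python
from abc import ABC, abstractmethod

class Tool(ABC):
    """Hypothetical base interface shared by the RAG, web-search, and math tools."""
    name: str
    description: str

    @abstractmethod
    def run(self, query: str) -> dict:
        """Return an answer plus the sources it was grounded on."""

class UnitConverterTool(Tool):
    """Example new tool: converts kilometers to miles."""
    name = "unit_converter"
    description = "Converts simple measurements, e.g. kilometers to miles."

    _KM_PER_MILE = 1.609344

    def run(self, query: str) -> dict:
        # Toy parsing: expects a query shaped like "10 km to miles"
        value = float(query.split()[0])
        return {"answer": f"{value / self._KM_PER_MILE:.2f} miles",
                "sources": ["built-in conversion table"]}
```

Returning sources from every tool keeps the citation behavior uniform across the router.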

Custom Evaluation Metrics

Add evaluators in evaluation.py for domain-specific metrics.
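
As an example of the shape such an evaluator could take, a naive groundedness-style metric might score the lexical overlap between a response and its retrieved sources (a crude proxy for illustration, not the framework's actual faithfulness metric):

```python
import re

def lexical_groundedness(response: str, sources: list[str]) -> float:
    """Fraction of response tokens that also appear in any source (0.0 to 1.0)."""
    def tokenize(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    response_tokens = tokenize(response)
    if not response_tokens:
        return 0.0
    source_tokens: set[str] = set()
    for source in sources:
        source_tokens |= tokenize(source)
    return len(response_tokens & source_tokens) / len(response_tokens)
```

Plugging a function like this into the evaluation loop gives a fast, model-free baseline to compare against LLM-judged faithfulness scores.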

πŸ“ˆ Monitoring

Track conversation statistics, tool usage patterns, and performance metrics through the built-in monitoring system.

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Run the test suite
  5. Submit a pull request

πŸ“„ License

This project is licensed under the MIT License.

πŸ™ Acknowledgments


Happy Research! πŸ€–πŸ“š
