aniketpoojari/Agentic-AI-Research-Assistant

title: Agentic AI Research Assistant
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false

Agentic AI Research Assistant

Python FastAPI Streamlit LangGraph Docker HuggingFace

CI/CD Pipeline License: MIT Python 3.10+ HuggingFace Space

An autonomous AI agent built to tackle one of the biggest problems with LLMs: hallucinations.

When doing research, you can't afford incorrect facts stated with confidence. The agent addresses this by searching the web, checking its own draft against what it found, and revising the draft before it gives you an answer.

🚀 Try it live on Hugging Face Spaces: Agentic AI Research Assistant


How It Works

Instead of a simple "prompt -> response" pipeline, this agent uses a self-reflection loop (powered by LangGraph):

  1. Generate: It drafts an initial response using web search results.
  2. Critique: It checks each claim in the draft against the retrieved web evidence and assigns a confidence score.
  3. Refine: If the confidence score is below 0.7, it searches for more data and rewrites its answer; at 0.7 or above, the draft is returned as the final response.
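
The three steps above can be sketched in plain Python. This is a minimal illustration and not the project's LangGraph code: `research`, `Critique`, the injected `search`, `generate`, and `critique` callables, and the `MAX_ROUNDS` cap are all hypothetical names introduced here. Only the 0.7 threshold comes from this README.

```python
from dataclasses import dataclass
from typing import Callable, List

CONFIDENCE_THRESHOLD = 0.7  # from the README: below this, the agent refines
MAX_ROUNDS = 3              # assumption: a retry cap so the loop always terminates


@dataclass
class Critique:
    confidence: float  # 0.0 to 1.0, as judged by the critic
    feedback: str      # what the critic found unsupported


def research(
    query: str,
    search: Callable[[str], List[str]],
    generate: Callable[[str, List[str], str], str],
    critique: Callable[[str, List[str]], Critique],
) -> str:
    """Generate a draft, critique it, and refine until it passes or retries run out."""
    evidence = search(query)
    draft = generate(query, evidence, "")  # Generate: first draft, no feedback yet
    for _ in range(MAX_ROUNDS):
        review = critique(draft, evidence)  # Critique: score the draft
        if review.confidence >= CONFIDENCE_THRESHOLD:
            break
        # Refine: gather more evidence guided by the feedback, then redraft
        evidence = evidence + search(f"{query} {review.feedback}")
        draft = generate(query, evidence, review.feedback)
    return draft
```

Passing the three steps in as callables keeps the control flow testable without an LLM or a search API; real implementations would back them with model and tool calls.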

Architecture

graph TD
    Start([User Query]) --> Agent
    Agent{{Decide Action}}

    Agent -->|Needs Info| Tools[Web Search / Summarize]
    Tools --> Agent

    Agent -->|Draft Response| Critic[Self-Reflection Node]

    Critic -->|Confidence < 0.7| Agent
    Critic -->|Confidence >= 0.7| Final([Final Response])

    subgraph Reflection Loop
    Critic -.->|Feedback + Retry| Agent
    end
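
The edges in the diagram can be read as a small transition function. The sketch below is an assumption-laden illustration: the node names `agent`, `tools`, `critic`, and `final` and the function `next_node` are hypothetical labels for the boxes above, not identifiers from the repository. In LangGraph, a routing function of this kind is typically what gets passed to `add_conditional_edges`.

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.7  # threshold shown on the Critic edges


def next_node(
    current: str,
    needs_info: bool = False,
    confidence: Optional[float] = None,
) -> str:
    """Return the node that follows `current`, mirroring the diagram's edges."""
    if current == "agent":
        # Agent either calls tools for more information or hands a draft to the critic
        return "tools" if needs_info else "critic"
    if current == "tools":
        # Tool output always flows back to the agent
        return "agent"
    if current == "critic":
        if confidence is None:
            raise ValueError("critic node requires a confidence score")
        # Low confidence loops back with feedback; otherwise the answer is final
        return "final" if confidence >= CONFIDENCE_THRESHOLD else "agent"
    raise ValueError(f"unknown node: {current}")
```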

Tech Stack

| Component | Technology |
| --- | --- |
| Agent framework | LangGraph |
| LLM | Groq (Llama 3.x) |
| Web search | Tavily / DuckDuckGo |
| Evaluation | Ragas |
| Tracing | LangSmith |
| Backend API | FastAPI |
| Frontend | Streamlit |
| Deployment | Docker, HuggingFace Spaces |
| CI/CD | GitHub Actions |

Quick Start

# Clone
git clone https://github.com/aniketpoojari/Agentic-AI-Research-Assistant.git
cd Agentic-AI-Research-Assistant

# Install
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Configure
cp .env.example .env
# Edit .env with your API keys (see Environment Variables below)

# Run (each command keeps running, so start them in separate terminals)
uvicorn main:app --reload          # API at http://localhost:8000
streamlit run app.py               # UI at http://localhost:8501

Docker

docker build -t research-assistant .
docker run -p 7860:7860 -p 8000:8000 \
  -e GROQ_API_KEY=your_key \
  -e TAVILY_API_KEY=your_key \
  research-assistant

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| GROQ_API_KEY | Yes | Groq API key for LLM inference |
| TAVILY_API_KEY | Yes | Tavily API key for web search |
| MODEL_NAME | No | Model name (default: llama-3.1-8b-instant) |
| LANGCHAIN_API_KEY | No | LangSmith API key for tracing |
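
One way to read these variables at startup is sketched below. It is an illustration only: `Settings` and `load_settings` are hypothetical names, and the project's own loader (under `utils/` and `config/`) may work differently. The variable names, the required/optional split, and the default model come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_MODEL = "llama-3.1-8b-instant"  # default listed in the table
REQUIRED = ("GROQ_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    tavily_api_key: str
    model_name: str = DEFAULT_MODEL
    langchain_api_key: Optional[str] = None  # tracing stays off when unset


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, failing fast on missing keys."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        groq_api_key=env["GROQ_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        model_name=env.get("MODEL_NAME") or DEFAULT_MODEL,
        langchain_api_key=env.get("LANGCHAIN_API_KEY") or None,
    )
```

Failing at startup with the names of the missing keys tends to be easier to debug than an authentication error on the first request.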

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /research | Run a research query |
| POST | /research/stream | Stream research results (SSE) |
| GET | /health | Health check |
| GET | /reflection-stats | Self-reflection metrics |
| GET | /cache/stats | Cache hit rates |
| GET | /metrics | Performance metrics |
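
Since `/research/stream` uses Server-Sent Events, a client has to split the response into events. The sketch below parses the generic SSE wire format (`data:` lines terminated by a blank line). The helper name `iter_sse_data` is hypothetical, and the payload carried in each event is not documented in this README, so it is returned as an opaque string.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The format allows one optional space after the colon
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch
    # An event not terminated by a blank line is dropped, as in the SSE format
```

The function accepts any iterable of decoded lines, so it can be fed from an HTTP client's line iterator or from a list in a unit test.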

Example Request

curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the latest breakthroughs in solid-state batteries?", "max_results": 5}'
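
The same request can be built from Python with only the standard library. This is a sketch under assumptions: `build_research_request` and `send` are hypothetical helpers, the base URL assumes the local `uvicorn` setup from Quick Start, and the response body is printed raw because its schema is not described here.

```python
import json
import urllib.request


def build_research_request(
    query: str,
    max_results: int = 5,
    base_url: str = "http://localhost:8000",  # assumption: local dev server
) -> urllib.request.Request:
    """Build the POST /research request shown in the curl example."""
    body = json.dumps({"query": query, "max_results": max_results}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/research",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the raw response text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage, with the API running locally:
#   print(send(build_research_request(
#       "What are the latest breakthroughs in solid-state batteries?")))
```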

Evaluation

The project includes two evaluation systems:

  • evaluation/ -- Ragas evaluation that scores the agent on faithfulness, answer relevancy, context precision, and context recall. Runs automatically in CI via GitHub Actions.
  • benchmarking/ -- Comparative benchmark that runs the agent and a baseline LLM on the same queries and compares the results.

# Run evaluation
python evaluation/run_evaluation.py

# Run comparative benchmark
python -m benchmarking.benchmark --num 10
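
To make the head-to-head idea concrete, the sketch below summarizes paired per-query scores for two systems. It is purely illustrative: `compare` is a hypothetical helper, the scoring scale is assumed to be "higher is better", and nothing here reflects how `benchmarking/` actually scores or reports results.

```python
from statistics import mean
from typing import Dict, Sequence


def compare(
    agent_scores: Sequence[float],
    baseline_scores: Sequence[float],
) -> Dict[str, float]:
    """Summarize paired scores, where index i in both lists is the same query."""
    if not agent_scores or len(agent_scores) != len(baseline_scores):
        raise ValueError("need one score per query for both systems")
    # A win is a strictly higher score on the same query; ties count for neither
    wins = sum(a > b for a, b in zip(agent_scores, baseline_scores))
    return {
        "agent_mean": mean(agent_scores),
        "baseline_mean": mean(baseline_scores),
        "agent_win_rate": wins / len(agent_scores),
    }
```

Pairing scores by query matters because some queries are harder than others; comparing only the two means can hide that one system wins on most queries by a small margin.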

Project Structure

.
├── agent/              # LangGraph agent workflow
├── app.py              # Streamlit frontend
├── benchmarking/       # Agent vs baseline comparison
├── config/             # YAML configuration
├── evaluation/         # Ragas evaluation + test queries
├── logger/             # Logging setup
├── main.py             # FastAPI backend
├── models/             # Model definitions
├── prompt_library/     # System prompts
├── tools/              # LangChain tool wrappers
├── utils/              # Config loader, web search, cache
├── Dockerfile          # Multi-stage Docker build
└── requirements.txt    # Python dependencies

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT -- see LICENSE for details.
