Hybrid Multimodal RAG - Streamlit - Hybrid Retrieval - Knowledge base intelligent agent

pavithra2870/SearchBot

Hybrid Multimodal RAG System

A Streamlit-based multimodal Retrieval-Augmented Generation (RAG) app that understands text + images inside documents and answers questions with context, citations, and memory.

Built for when PDFs stop being searchable and start being annoying.

What This Does

  • Upload PDF / DOCX / TXT files
  • Extracts:
    • Text
    • Images inside PDFs
    • OCR text from images (multilingual)
  • Indexes everything using hybrid search:
    • Semantic (vector embeddings)
    • Keyword (BM25)
  • Lets you chat with your documents
  • Retrieves relevant images + text together
  • Maintains chat history across sessions

Basically: your documents, but smarter and less silent.
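
To make the keyword half of the hybrid index concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is an illustration only: the function name `bm25_scores`, the whitespace tokenizer, and the `k1`/`b` values are assumptions, and the app may well rely on a library implementation instead.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    Hypothetical helper: tokenization is a plain lowercase whitespace split.
    """
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates via k1 and is length-normalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "invoice total due in march",
    "photo of a cat on a sofa",
    "march invoice reminder and total",
]
print(bm25_scores("invoice total", docs))
```

Documents that share no term with the query score exactly zero, which is why keyword search is paired with embeddings: the semantic side can still surface a chunk that uses different wording.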

Core Features

  • Multimodal RAG
    Text + OCR + image captions all live in the same retrieval pipeline.

  • Hybrid Retrieval
    Combines semantic search (FAISS) with keyword search (BM25) using weighted ensembling.

  • Adaptive Chunking
    Chunk size adjusts automatically based on document length.

  • Multilingual OCR
    EasyOCR with dynamic language selection (English, Hindi, Tamil, etc.).

  • History-Aware QA
    Follow-up questions are answered with earlier turns of the conversation as context.

  • Image-Aware Answers
    Retrieved images are surfaced alongside responses when relevant.
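
One common way to do the weighted ensembling described under Hybrid Retrieval is weighted reciprocal rank fusion (RRF), which merges ranked lists without needing the two retrievers' scores to be on the same scale. The sketch below assumes that approach; the function name `weighted_rrf`, the example weights, and the document ids are all hypothetical, and the app's actual fusion logic may differ.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered best-first.
    weights:  one weight per ranking (illustrative values, tune per corpus).
    c:        smoothing constant; 60 is a conventional default.
    """
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    fused = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher.
            fused[doc_id] += weight / (c + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


semantic = ["chunk_3", "chunk_1", "img_7"]  # e.g. from a vector similarity search
keyword = ["img_7", "chunk_3", "chunk_9"]   # e.g. from BM25
print(weighted_rrf([semantic, keyword], weights=[0.6, 0.4]))
```

Because fusion works on ranks rather than raw scores, a document that appears in both lists tends to rise above one that only a single retriever found.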

Tech Stack

  • Frontend: Streamlit
  • LLM: LLaMA 3 (70B) via Groq
  • Embeddings: Sentence Transformers (MiniLM)
  • Vector Store: FAISS
  • Retrieval: BM25 + Semantic Ensemble
  • OCR: EasyOCR
  • PDF Processing: PyMuPDF
  • Framework: LangChain

How It Works (High Level)

  1. Documents are uploaded and parsed
  2. Text is chunked and embedded
  3. Images are extracted and OCR’d
  4. Everything is indexed together
  5. Queries run through hybrid retrieval
  6. LLM answers using retrieved context + chat history

No magic. Just well-orchestrated components doing their job.
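
As a rough illustration of step 2 combined with the adaptive chunking feature, the sketch below picks a chunk size from the document length and then splits the text into overlapping windows. The thresholds, sizes, and function names (`pick_chunk_size`, `chunk_text`) are assumptions made up for this example, not the app's real values; the app itself may use a LangChain text splitter.

```python
def pick_chunk_size(doc_length):
    """Return (chunk_size, overlap) in characters for a document length.

    The thresholds below are hypothetical, chosen only for illustration.
    """
    if doc_length < 5_000:
        return 500, 50
    if doc_length < 50_000:
        return 1_000, 100
    return 1_500, 150


def chunk_text(text):
    """Split text into overlapping, fixed-size character windows."""
    size, overlap = pick_chunk_size(len(text))
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once a window has reached the end of the text.
        if start + size >= len(text):
            break
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

Short documents get small chunks so answers stay precise, while long documents get larger chunks so the index does not explode in size. The overlap keeps sentences that straddle a boundary retrievable from either side.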

Setup

Create a Virtual Environment:

python -m venv venv

Activate it (Windows):

venv\Scripts\activate

On macOS or Linux:

source venv/bin/activate

Install Dependencies:

pip install -r requirements.txt

Set your Groq API key:

GROQ_API_KEY=your_key_here

Run the app:

streamlit run app.py
