Semantic Search Engine (SSE)

A FastAPI-powered semantic search application with React frontend, featuring document uploads, vector embeddings, and LLM-powered question answering.

✨ Features

  • Document Processing: Upload PDFs, Word docs, and text files
  • Semantic Search: Find relevant content using vector embeddings
  • RAG Integration: Generate AI-powered answers from your documents
  • Session Management: Isolate document sets per session
  • Modern Stack: FastAPI + React + FAISS + HuggingFace models
Feature               Description
Multi-format Support  PDF, Word, and text files
Hybrid Search         Semantic + keyword search
RAG Integration       Context-aware answers
Performance           Optimized chunking & indexing

🛠️ Installation

Docker

docker compose up --build
# Access the app at http://localhost:8000

Development Setup

   # Linux / Mac
   python -m venv venv
   source venv/bin/activate 

   # Windows
   venv\Scripts\activate

   # Install dependencies and start the API server
   pip install -r requirements.txt
   uvicorn main:app --reload

Frontend

   cd frontend
   pnpm install
   pnpm run dev

API Endpoints

Endpoint      Method  Description
/api/session  GET     Create a new session
/api/upload   POST    Upload files to the current session
/api/search   GET     Perform semantic search
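
A minimal example of calling these endpoints from Python with the requests library; the field and parameter names used here (session_id, files, q) are assumptions and may differ from the actual API:

   import requests

   BASE = "http://localhost:8000"

   # 1. Create a session (the response field name is an assumption)
   session_id = requests.get(f"{BASE}/api/session").json()["session_id"]

   # 2. Upload a document into that session
   with open("report.pdf", "rb") as f:
       requests.post(
           f"{BASE}/api/upload",
           params={"session_id": session_id},
           files={"files": ("report.pdf", f, "application/pdf")},
       )

   # 3. Search the uploaded documents
   results = requests.get(
       f"{BASE}/api/search",
       params={"session_id": session_id, "q": "What does the report say about revenue?"},
   ).json()
   print(results)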

How the App Works

  1. You Open the App

    The app creates a unique session ID for you

    Behind the scenes:
    → Generates a random ID
    → Creates an empty folder named after that ID to store your files.
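
    A minimal sketch of what this might look like on the backend; the upload root and function name are illustrative assumptions, not the project's actual code:

       import uuid
       from pathlib import Path

       UPLOAD_ROOT = Path("uploads")  # assumed storage location

       def create_session() -> str:
           # Generate a random ID and create an empty folder for this session's files
           session_id = uuid.uuid4().hex
           (UPLOAD_ROOT / session_id).mkdir(parents=True, exist_ok=True)
           return session_id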

  2. You Upload Files

    You drag/drop PDFs, Word docs, or text files into the app.

    Behind the scenes:
    → Saves files to your session folder.
    → Breaks each file into small text chunks (e.g., 1-2 sentences each).
    → Stores these chunks in a list with metadata (file name, page number, etc.).
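
    A rough sketch of that chunking step; the sentence splitter and metadata fields below are assumptions:

       import re

       def chunk_text(text: str, filename: str, page: int, sentences_per_chunk: int = 2):
           # Split on sentence boundaries, then group 1-2 sentences per chunk with metadata
           sentences = re.split(r"(?<=[.!?])\s+", text.strip())
           return [
               {
                   "text": " ".join(sentences[i:i + sentences_per_chunk]),
                   "source": filename,
                   "page": page,
               }
               for i in range(0, len(sentences), sentences_per_chunk)
           ]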

  3. The App "Understands" Your Files

    The app converts every text chunk into number sequences (vectors) using AI.

    Behind the scenes:
    → Uses a pre-trained model (all-MiniLM-L6-v2) to generate vectors.
    → Builds a searchable index using FAISS (Facebook's open-source similarity-search library).
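
    In code, that step could look roughly like this (the flat inner-product index is an assumption; the real app may use a different FAISS index type):

       import faiss
       from sentence_transformers import SentenceTransformer

       encoder = SentenceTransformer("all-MiniLM-L6-v2")

       def build_index(chunks):
           # Embed every chunk; normalized vectors make inner product behave like cosine similarity
           vectors = encoder.encode([c["text"] for c in chunks], normalize_embeddings=True)
           index = faiss.IndexFlatIP(vectors.shape[1])
           index.add(vectors)
           return index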

  4. You Search for Something

    Behind the scenes:
    → Converts your query into a vector using the same AI model.
    → Compares it against every chunk's vector to find the closest matches.
    → Reranks results using a second AI (cross-encoder) to prioritize relevance.
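
    A sketch of that two-stage search; the specific cross-encoder model named below is an assumption:

       from sentence_transformers import SentenceTransformer, CrossEncoder

       encoder = SentenceTransformer("all-MiniLM-L6-v2")
       reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # assumed reranker model

       def search(query, index, chunks, top_k=20, final_k=5):
           # Stage 1: nearest-neighbour lookup in the FAISS index
           q_vec = encoder.encode([query], normalize_embeddings=True)
           _, ids = index.search(q_vec, top_k)
           candidates = [chunks[i] for i in ids[0] if i != -1]
           # Stage 2: cross-encoder scores each (query, chunk) pair for a finer ranking
           scores = reranker.predict([(query, c["text"]) for c in candidates])
           ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
           return [c for c, _ in ranked[:final_k]]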

  5. You Get Results

    The app shows you:
    → Direct excerpts from your documents (sorted by relevance).
    → An AI-generated summary (if RAG is enabled) that combines the top matches into a natural answer.
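
    The RAG step stitches those top matches into a prompt for the LLM. The README does not describe how the prompt is built, so this is only an illustrative sketch:

       def build_rag_prompt(query, top_chunks):
           # Label each chunk with its source so the answer stays grounded in the documents
           context = "\n\n".join(
               f"[{c['source']}, p.{c['page']}] {c['text']}" for c in top_chunks
           )
           return (
               "Answer the question using only the context below.\n\n"
               f"Context:\n{context}\n\n"
               f"Question: {query}\nAnswer:"
           )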

Simple Analogy

I. Session ID = Your private locker.

II. File Upload = Putting books into the locker.

III. Chunking = Tearing out pages and highlighting paragraphs.

IV. Vectors = Giving each paragraph a numeric "fingerprint" that captures its meaning.

V. Search = Finding the paragraphs whose fingerprints are closest to your question's.

VI. AI Answer = A friend (the LLM) reads those paragraphs and explains the answer to you.
