
Summarize-and-Chat-with-PDF

AI-Powered Document Processing


A web application that uses locally hosted LLMs to summarize PDF documents and enable interactive Q&A about their content. Built with Python, Flask, and Socket.IO for real-time updates.

Features

  • PDF Summarization: Generate comprehensive, executive, technical, or bullet-point summaries
  • Document Q&A: Ask questions about PDF content and get AI-powered answers
  • Local Processing: Runs against locally hosted LLMs (such as Mistral), so documents never leave your machine
  • Session Management: Save and revisit document processing sessions
  • Real-time Progress: Track processing with live updates
  • Statistics Dashboard: View document metrics and compression ratios

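The summary styles listed above would typically map to different prompt instructions sent to the local model. A minimal sketch of how that might look against Ollama's /api/generate endpoint; the style wording and function names are assumptions, not the project's actual prompts:

```python
# Hedged sketch: mapping summary styles to prompts for a locally hosted
# model served by Ollama. Prompt text and names are illustrative.
import json
import urllib.request

STYLE_INSTRUCTIONS = {
    "comprehensive": "Write a detailed summary covering every major section.",
    "executive": "Write a short executive summary for a busy reader.",
    "technical": "Summarize with emphasis on technical details and terminology.",
    "bullet": "Summarize the document as concise bullet points.",
}

def build_summary_prompt(style: str, document_text: str) -> str:
    """Combine a style instruction with the extracted PDF text."""
    instruction = STYLE_INSTRUCTIONS[style]
    return f"{instruction}\n\n---\n{document_text}\n---"

def summarize(text: str, style: str = "executive", model: str = "mistral") -> str:
    """Send the prompt to a local Ollama server on its default port."""
    payload = json.dumps({
        "model": model,
        "prompt": build_summary_prompt(style, text),
        "stream": False,  # ask for a single JSON response, not a stream
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Swapping the model argument is all that is needed to try another locally served LLM.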
Technology Stack

  • Backend: Python, Flask, Socket.IO
  • Frontend: HTML5, CSS3, JavaScript
  • AI Processing: Local LLM integration (Ollama compatible)
  • Database: SQLite for session storage
  • Text Processing: pdfplumber, FAISS for vector search
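The text-processing pair above (pdfplumber for extraction, FAISS for vector search) implies an intermediate step: splitting extracted text into overlapping chunks before embedding, so that retrieval for Q&A can return focused passages. A minimal sketch of such a chunker; the parameter names and defaults are assumptions:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks so that content spanning
    a chunk boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # advance by less than chunk_size to overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # final chunk already covers the end of the text
    return chunks
```

Each chunk would then be embedded and added to a FAISS index; at question time, the question's embedding retrieves the nearest chunks as context for the LLM.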

Getting Started

Prerequisites

  • Python 3.12+
  • Ollama with Mistral (or other local LLM)
  • Node.js (for Socket.IO client)

Installation

git clone https://github.com/Natarajan-R/Summarize-and-Chat-with-PDF.git
cd Summarize-and-Chat-with-PDF
pip install -r requirements.txt

Running

python app.py

Open http://localhost:5000 in your browser.
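The real-time progress updates mentioned under Features are typically produced by emitting an event after each processing step. A transport-agnostic sketch of that pattern, with the Socket.IO emit replaced by a plain callback (all names here are assumptions):

```python
from typing import Callable

def process_document(chunks: list[str],
                     report: Callable[[int, int], None]) -> list[str]:
    """Process chunks one at a time, reporting (done, total) after each.
    In the app, the report callback would wrap something like
    socketio.emit('progress', {'done': done, 'total': total})."""
    results = []
    total = len(chunks)
    for done, chunk in enumerate(chunks, start=1):
        results.append(chunk.upper())  # stand-in for the real LLM call
        report(done, total)
    return results
```

Keeping the emit behind a callback makes the processing loop testable without a running Socket.IO server.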

Configuration

Edit app.py to:

  • Change the model name (mistral by default)
  • Adjust chunking parameters
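Those two settings might appear near the top of app.py as module-level constants; the names and values below are hypothetical, so check the actual file:

```python
# Hypothetical configuration constants in app.py; actual names may differ.
MODEL_NAME = "mistral"   # any model served by the local Ollama instance
CHUNK_SIZE = 1000        # characters per chunk fed to the embedder
CHUNK_OVERLAP = 200      # characters shared between consecutive chunks
```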

License

MIT
