This is a small project I built to practice using FastAPI, OpenCV, and Tesseract OCR.
The app lets you upload a photo of a document (like class notes), fixes the perspective, makes it look like a clean scan, and then extracts the text. It also tries to auto-tag the document based on the words it finds.
At university it’s common to take photos of whiteboards, slides, or notes, but the photos are often hard to read later. I wanted a tool that could clean them up and make the text searchable.
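To illustrate the perspective fix: before a photo can be warped into a flat "scan", the four corners of the page need to be put in a consistent order and a target size chosen. The sketch below shows one common way to do that in plain Python. It is not necessarily how this project does it; the function names `order_corners` and `target_size` are made up for the example, and it assumes the four corner points have already been detected. The ordered corners and size would then typically be handed to OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
from math import dist


def order_corners(points):
    """Return four (x, y) corners as (top-left, top-right, bottom-right, bottom-left).

    Uses a common heuristic for image coordinates (y grows downward):
    top-left has the smallest x + y, bottom-right the largest, and the
    remaining two are separated by y - x. It can misorder corners when
    the page is rotated by roughly 45 degrees or more.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    by_sum = sorted(points, key=lambda p: p[0] + p[1])
    top_left, bottom_right = by_sum[0], by_sum[3]
    # Of the two middle points, top-right has the smaller y - x.
    top_right, bottom_left = sorted(by_sum[1:3], key=lambda p: p[1] - p[0])
    return top_left, top_right, bottom_right, bottom_left


def target_size(corners):
    """Pick an output (width, height) from the longest opposing edges."""
    tl, tr, br, bl = corners
    width = max(dist(tl, tr), dist(bl, br))
    height = max(dist(tl, bl), dist(tr, br))
    return round(width), round(height)


if __name__ == "__main__":
    # Hypothetical corner coordinates, in arbitrary order.
    detected = [(310, 40), (20, 60), (330, 420), (10, 400)]
    corners = order_corners(detected)
    print(corners, target_size(corners))
```

Taking the longer of each pair of opposing edges keeps the output from shrinking the side of the page that was closer to the camera.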
Features:
- Upload an image and get back a cleaned “scanned” version
- Extract text using Tesseract OCR
- Auto-generate simple tags from the text
- Basic web interface: a plain HTML page served by a FastAPI backend
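As a rough idea of what "simple tags from the text" can mean, here is a minimal sketch that picks the most frequent non-trivial words from OCR output. This is an assumption for illustration, not the project's actual tagging code (the project lists scikit-learn, so it may well use something different). The function name `suggest_tags`, the stopword list, and the default thresholds are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = frozenset({
    "about", "been", "from", "have", "into", "than", "that", "their",
    "then", "there", "they", "this", "were", "which", "will", "with", "your",
})


def suggest_tags(text, max_tags=5, min_length=4):
    """Return up to max_tags of the most frequent words in text.

    Words shorter than min_length and words in STOPWORDS are ignored.
    Ties keep the order in which the words first appear.
    """
    words = re.findall(r"[a-z]+", text.lower())
    candidates = [w for w in words if len(w) >= min_length and w not in STOPWORDS]
    return [word for word, _count in Counter(candidates).most_common(max_tags)]


if __name__ == "__main__":
    sample = "Matrix notes: a matrix times a vector gives a vector. The matrix is square."
    print(suggest_tags(sample, max_tags=2))
```

Plain word counts are easy to follow but favor whatever words repeat most; weighting words against a larger collection of documents (for example with TF-IDF) usually gives more distinctive tags.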
Tech stack:
- Backend: Python (FastAPI, OpenCV, scikit-learn, pytesseract)
- Frontend: plain HTML/JS (no framework, kept simple)
- OCR: Tesseract
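Setup (Windows, PowerShell):
- Install Tesseract OCR and make sure the `tesseract` executable is on your PATH (pytesseract calls it under the hood).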
- Clone this repo: `git clone https://github.com//smart-doc-scanner.git`, then `cd smart-doc-scanner`
- Create and activate a virtual environment: `py -m venv .venv`, then `.\.venv\Scripts\Activate`
- Install requirements: `pip install -r requirements.txt`
- Run the server: `uvicorn app.main:app --reload`
- Open http://127.0.0.1:8000 in a web browser
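If text extraction fails after setup, a likely cause is that the Tesseract executable cannot be found. The snippet below is a small optional check, not part of the project: it only looks for `tesseract` on your PATH using the standard library. The function name `find_tesseract` is made up for this example.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_tesseract()
    if path:
        print(f"Tesseract found at {path}")
    else:
        print("Tesseract not found on PATH; install it or add its folder to PATH.")
```

If Tesseract is installed somewhere that is not on PATH, pytesseract can also be pointed at it directly by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.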