Marshall-District-Library/mdl-tesseract-python-windows

📄 MDL Tesseract - OCR PDF Processor

MDL Tesseract is a simple tool that converts scanned PDFs into searchable PDFs using Tesseract OCR.
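For readers curious about what happens under the hood, here is a minimal sketch of how a single page image can be turned into a searchable PDF with the Tesseract command-line tool. It is not taken from this application's source: the function name `tesseract_pdf_command`, the file names, and the default language `eng` are assumptions for illustration. Tesseract reads images rather than PDFs, so a scanned PDF must first be split into page images; how MDL Tesseract does that step is not described here.

```python
def tesseract_pdf_command(page_image, output_base, language="eng"):
    """Build the Tesseract CLI call for one page image (hypothetical helper).

    Tesseract's command line has the form:
        tesseract <image> <output base> [options] [config]
    The trailing "pdf" config asks for a searchable PDF, which Tesseract
    writes to <output base>.pdf.
    """
    return ["tesseract", str(page_image), str(output_base), "-l", language, "pdf"]


# Example: the command for one page. To actually run it, pass the list to
# subprocess.run(cmd, check=True) on a machine where Tesseract is installed.
cmd = tesseract_pdf_command("page_001.png", "page_001")
print(" ".join(cmd))
```

Building the command as a list (rather than one string) avoids quoting problems when file names contain spaces.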

🔹 Features

  • ✅ Automatically extracts text from scanned PDFs.
  • ✅ No extra setup required – Tesseract OCR is included.
  • ✅ Simple, user-friendly interface for staff and non-tech users.
  • ✅ Tracks progress as it processes multi-page documents.
  • ✅ Saves the searchable PDF with "OCR_" prepended to the original filename.

📥 Installation Guide

  1. Go to the latest release:
    🔗 Click Here
  2. Download the installer (MDL_Tesseract.exe).
  3. Run the installer and follow the steps.
  4. Once installed, a shortcut will be added to:
    • Start Menu
    • Desktop

🖥️ How to Use

  1. Open MDL Tesseract.
  2. Click "Browse" and select a PDF file.
  3. Click "Browse" to choose an output folder.
  4. Click "Start OCR" to begin processing.
  5. The searchable PDF will be saved in your selected folder, prefixed with "OCR_".
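The naming rule in step 5 can be expressed in a few lines. This is a sketch under stated assumptions, not the application's code: `ocr_output_path` is a hypothetical helper, and the example paths are made up.

```python
from pathlib import Path


def ocr_output_path(input_pdf, output_dir):
    """Return the output location: the chosen folder plus "OCR_" + original filename."""
    return Path(output_dir) / ("OCR_" + Path(input_pdf).name)


# Example with made-up paths: the result ends in OCR_minutes.pdf
# inside the "searchable" folder.
print(ocr_output_path("scans/minutes.pdf", "searchable"))
```

Only the filename is changed; the original PDF stays where it is, so the scan and its searchable copy can sit side by side.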

🔧 Troubleshooting

  • If you see a Windows Defender warning, click "More Info" → "Run Anyway".
    (This happens because the app is not signed yet.)
  • If OCR fails, check that the PDF is a scanned (image-only) document and does not already contain selectable text.

💡 Support

For questions or feedback, please open an issue in the repository.


🚀 MDL Tesseract - Making PDFs Searchable in Seconds!

About

Simple Windows OCR application utilizing open source Tesseract OCR.
