
Uses Tesseract OCR and improves its output quality by preprocessing the images with OpenCV and other image tools and libraries.


xadityax/TOCR-Pdf-Img-to-Txt


TOCR-Pdf-Img-to-Txt

Uses Tesseract OCR and improves its output quality by preprocessing the images with OpenCV and other image tools and libraries.

Setup

  1. Install Tesseract and replace the executable file path in the code with the path to your own installation.

  2. Run the code.

  3. Provide the path of the folder that holds the images.

  4. Depending on your needs, you can get a text file from the images or a PDF with text.
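The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the list of image extensions, the `TESSERACT_CMD` placeholder path, and the specific preprocessing chain (grayscale, median blur, Otsu threshold) are all assumptions. It relies on the third-party packages `opencv-python` and `pytesseract`, which are imported only inside the functions that need them.

```python
from pathlib import Path

# Assumption: illustrative list; the repository does not state which formats it accepts.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}

# Placeholder: replace with the path to your own Tesseract executable (step 1).
TESSERACT_CMD = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def preprocess(image_path):
    """Grayscale, denoise, and binarize an image (one possible preprocessing chain)."""
    import cv2  # third-party: opencv-python

    image = cv2.imread(str(image_path))
    if image is None:
        raise ValueError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 3)  # remove small speckles
    # Otsu's method picks the black/white threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary


def ocr_folder(folder, make_pdf=False):
    """OCR every image in `folder`; write a .txt or a searchable .pdf next to each one."""
    import pytesseract  # third-party: pytesseract

    pytesseract.pytesseract.tesseract_cmd = TESSERACT_CMD
    outputs = []
    for image_path in list_images(folder):
        binary = preprocess(image_path)
        if make_pdf:
            out_path = image_path.with_suffix(".pdf")
            pdf_bytes = pytesseract.image_to_pdf_or_hocr(binary, extension="pdf")
            out_path.write_bytes(pdf_bytes)
        else:
            out_path = image_path.with_suffix(".txt")
            out_path.write_text(pytesseract.image_to_string(binary), encoding="utf-8")
        outputs.append(out_path)
    return outputs
```

With Tesseract installed and `TESSERACT_CMD` set, calling `ocr_folder("scans")` on a hypothetical `scans` folder would write one text file per image, and `ocr_folder("scans", make_pdf=True)` would write one PDF per image instead.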

Disclaimer: Keep in mind that software-based OCR is often less accurate than other methods. Treat this as an alternative for when you would otherwise have to type an entire document into Word or a PDF and want to save time, or when you want to search the document after converting it to text.
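As an illustration of querying the converted text, the sketch below searches every `.txt` file in a folder for a phrase, ignoring case. The function name `search_text_files` and the one-text-file-per-image layout are assumptions made for this example and are not part of the repository.

```python
from pathlib import Path


def search_text_files(folder, query):
    """Return (file name, line number, line) for every line that contains `query`."""
    needle = query.lower()
    hits = []
    for path in sorted(Path(folder).glob("*.txt")):
        # OCR output can contain odd bytes, so replace anything undecodable.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

Because OCR output may contain misrecognized characters, an exact phrase can be missed; searching for a shorter, distinctive fragment tends to be more forgiving.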
