This FastAPI-based tool performs Optical Character Recognition (OCR) on images, allowing conversion to text or PDF.
- Img2Text: Reads text from an uploaded image.
- Img2pdf: Converts an image to a PDF with the extracted text.
Before running the project, make sure you have the following installed on your system:
- Python (version 3.6 or higher)
- Tesseract OCR
- Download Tesseract OCR for Windows from https://community.chocolatey.org/packages/tesseract-ocr#files. Choose the tesseract-ocr-w64-setup package.
- Install Tesseract OCR by following the installation instructions provided on the download page.
- Add Tesseract to your system PATH:
  - Open the Control Panel.
  - Click on "System and Security."
  - Click on "System."
  - Click on "Advanced system settings" on the left.
  - Click on the "Environment Variables" button.
  - Under "System variables," find and select the "Path" variable, then click on "Edit."
  - Click on "New" and add the path to the Tesseract installation directory (usually C:\Program Files\Tesseract-OCR).
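To confirm that the PATH change took effect, a short check like the one below can help. It is a minimal sketch using only the Python standard library; the helper names `find_tesseract` and `tesseract_version` are illustrative and not part of this project. Open a new terminal before running it, since terminals that were already open do not see the updated PATH.

```python
import shutil
import subprocess


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract binary if it is on PATH, else None."""
    return shutil.which(executable)


def tesseract_version(path):
    """Run `tesseract --version` and return the first line it prints."""
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some builds print the version to stderr
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    location = find_tesseract()
    if location is None:
        print("tesseract was not found on PATH")
    else:
        print(location, "->", tesseract_version(location))
```

If the script reports that Tesseract was not found, recheck the directory added to the "Path" variable.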
- Open a terminal and navigate to the project directory.
- Create a virtual environment (optional but recommended):
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    .\venv\Scripts\activate
  - On Linux/macOS:
    source venv/bin/activate
- Install the required Python packages:
  pip install -r requirements.txt
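After installing, it can be useful to verify that the virtual environment is active and that the dependencies are importable. The sketch below is one way to do that with the standard library. The module names in `ASSUMED_MODULES` are a guess at what requirements.txt contains (its contents are not shown here), so adjust the list to match the actual file.

```python
import importlib.util
import sys

# Assumption: typical dependencies for a FastAPI + Tesseract project.
# Replace with the modules actually listed in requirements.txt.
ASSUMED_MODULES = ["fastapi", "uvicorn", "pytesseract", "PIL"]


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    missing = missing_modules(ASSUMED_MODULES)
    print("missing modules:", missing if missing else "none")
```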
Once the installation is complete, you can run the FastAPI application using the following command:
uvicorn main:app --reload
Visit http://127.0.0.1:8000/docs in your browser to access the FastAPI Swagger documentation and test the OCR functionality.
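The same information shown in the Swagger page is available as JSON, because FastAPI serves its OpenAPI schema at /openapi.json by default. The sketch below lists the routes the running server exposes, which is a quick way to find the exact endpoint paths. It assumes uvicorn's default address of 127.0.0.1 and port 8000; `list_routes` and `fetch_schema` are illustrative names, not part of this project.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head", "options", "trace"}


def list_routes(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dictionary."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            if method.lower() in HTTP_METHODS:  # skip non-method keys
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url="http://127.0.0.1:8000"):
    """Download the OpenAPI schema from a running FastAPI server."""
    with urllib.request.urlopen(base_url + "/openapi.json") as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running:
#   for method, path in list_routes(fetch_schema()):
#       print(method, path)
```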
Use the interactive documentation to call the Img2Text and Img2pdf endpoints and extract text from images.
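The endpoints can also be called from a script instead of the browser. The following is a minimal sketch that uploads an image as multipart/form-data using only the standard library. The route `/img2text` and the form field name `file` are assumptions, as is the port 8000 (uvicorn's default); check the /docs page for the real path and parameter name before using it.

```python
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, data, content_type="image/png"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(url, image_path, field_name="file"):
    """POST an image file to `url` and return the response body as text."""
    with open(image_path, "rb") as handle:
        data = handle.read()
    body, header = build_multipart(field_name, os.path.basename(image_path), data)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Usage, with the server running (hypothetical route and file name):
#   print(upload_image("http://127.0.0.1:8000/img2text", "sample.png"))
```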
This project is licensed under the MIT License. Feel free to use and modify as needed.