Runpod EasyOCR (PDF)

Serverless OCR for PDF files using EasyOCR on Runpod. Provide public PDF URLs and get back, for each page, the extracted text with bounding boxes (normalized to 0..1) and confidence scores. Rendered page images are auto-rotated using Tesseract OSD before OCR.

Input schema

pdf_url or pdf_urls: Single URL string or list of public PDF URLs.
languages: List of language codes for EasyOCR (default ['ch_sim','en']).
gpu: Boolean to use GPU if available (default true).
detail: 1 for boxes, text, confidence; 0 for text only (default 1).
dpi: Rendering DPI for PDF to image (default 200).
page_indices: List of zero-based page indices to process.
page_from/page_to: Page range (inclusive) to process.
page_limit: Max number of pages to process.
batched: Use EasyOCR readtext_batched to process pages in a batch (default false).
n_width/n_height: When batching, resize all pages to these exact dimensions. If not provided and pages differ in size, the largest width/height are used.
cudnn_benchmark: Enable cuDNN benchmark mode, which can speed up inference when batch input sizes stay consistent (default false).
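The page-selection options can be read as a small filter over the document's pages. The following is a minimal sketch, not the worker's actual code: the function name `select_pages` is hypothetical, and it assumes that `page_from`/`page_to` are zero-based like `page_indices`, that explicit `page_indices` take precedence over the range, and that `page_limit` is applied last.

```python
from typing import Iterable, List, Optional


def select_pages(
    page_count: int,
    page_indices: Optional[Iterable[int]] = None,
    page_from: Optional[int] = None,
    page_to: Optional[int] = None,
    page_limit: Optional[int] = None,
) -> List[int]:
    """Return the zero-based page indices to process, in document order."""
    if page_indices is not None:
        # Assumption: explicit indices win over the range options.
        # Out-of-range and duplicate values are dropped.
        selected = sorted({i for i in page_indices if 0 <= i < page_count})
    else:
        # Assumption: the range is zero-based and inclusive at both ends.
        start = 0 if page_from is None else max(page_from, 0)
        end = page_count - 1 if page_to is None else min(page_to, page_count - 1)
        selected = list(range(start, end + 1))
    if page_limit is not None:
        # Assumption: the limit caps whatever the earlier options selected.
        selected = selected[: max(page_limit, 0)]
    return selected
```

Under these assumptions, a five-page PDF with page_from 1 and page_to 3 would yield pages 1, 2 and 3, and adding page_limit 2 would trim that to pages 1 and 2.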

Orientation: Before OCR, the worker uses Tesseract OSD to detect and correct each page's orientation. This helps with multilingual documents (Thai, Chinese, English). Each page in the output carries minimal orientation metadata (rotate and script).
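To illustrate where the per-page orientation metadata could come from, here is a sketch that parses the plain-text report Tesseract OSD produces (for example the string returned by `pytesseract.image_to_osd`). The helper name `parse_osd` and the fallback values are assumptions; the worker's own parsing may differ.

```python
import re
from typing import Dict, Optional, Union


def parse_osd(osd_text: str) -> Dict[str, Union[int, Optional[str]]]:
    """Extract the rotation (degrees) and detected script from an OSD report."""
    # "Rotate:" is the clockwise correction Tesseract suggests for the page.
    rotate = re.search(r"^Rotate:\s*(\d+)", osd_text, re.MULTILINE)
    # "Script:" names the dominant script; "Script confidence:" is a separate line.
    script = re.search(r"^Script:\s*(.+?)\s*$", osd_text, re.MULTILINE)
    return {
        # Assumption: fall back to no rotation when OSD gives no answer.
        "rotate": int(rotate.group(1)) % 360 if rotate else 0,
        "script": script.group(1) if script else None,
    }


sample_report = (
    "Page number: 0\n"
    "Orientation in degrees: 270\n"
    "Rotate: 90\n"
    "Orientation confidence: 5.12\n"
    "Script: Latin\n"
    "Script confidence: 2.50\n"
)
orientation = parse_osd(sample_report)  # {"rotate": 90, "script": "Latin"}
```

The resulting dictionary has the same two keys as the orientation object shown under Output shape.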

Example request body

{
  "pdf_urls": [
    "https://arxiv.org/pdf/1708.01204.pdf"
  ],
  "languages": ["ch_sim", "en"],
  "gpu": true,
  "detail": 1,
  "dpi": 200,
  "page_limit": 1,
  "batched": true,
  "n_width": 1200,
  "n_height": 1600,
  "cudnn_benchmark": true
}
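As a rough sketch of how a client might submit this body, the snippet below builds a synchronous request with the Python standard library. It assumes the JSON above is the job input and is wrapped under an `input` key, as Runpod serverless endpoints expect; `YOUR_ENDPOINT_ID` and `YOUR_API_KEY` are placeholders, and `build_runsync_request` is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id: str, api_key: str, job_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Runpod serverless endpoint."""
    # Assumption: the handler receives this dictionary as the job's "input".
    body = json.dumps({"input": job_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.runpod.ai/v2/{endpoint_id}/runsync",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_runsync_request(
    "YOUR_ENDPOINT_ID",  # placeholder
    "YOUR_API_KEY",      # placeholder
    {
        "pdf_urls": ["https://arxiv.org/pdf/1708.01204.pdf"],
        "languages": ["ch_sim", "en"],
        "page_limit": 1,
    },
)
# To send it: urllib.request.urlopen(request, timeout=300), then json.load() the response.
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending GPU time on it.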

Output shape

{
  "results": [
    {
      "url": "https://...",
      "pages": [
        {
          "index": 0,
          "results": [
            { "box": [[x,y],...], "text": "...", "confidence": 0.99 }
          ],
          "orientation": { "rotate": 0, "script": "Latin" }
        }
      ]
    }
  ],
  "languages": ["ch_sim","en"],
  "gpu": true,
  "detail": 1,
  "dpi": 200
}
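Because box coordinates are normalized, a consumer has to scale them by the size of the image it wants to draw on. The sketch below shows one way to do that; the helper names are hypothetical, and the page size calculation assumes the page is rendered at 72 points per inch scaled by the requested dpi, which may not match the worker's rendering exactly (for instance, width and height swap if a page was rotated by 90 degrees).

```python
from typing import List, Sequence, Tuple


def rendered_size(width_pt: float, height_pt: float, dpi: int = 200) -> Tuple[int, int]:
    """Approximate pixel size of a PDF page (sizes in points, 72 per inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


def box_to_pixels(box: Sequence[Sequence[float]], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale normalized [x, y] corner points to pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in box]


def bounding_rect(points: Sequence[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Axis-aligned (left, top, right, bottom) rectangle around the corner points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical box taken from one entry of a page's "results" list.
normalized_box = [[0.1, 0.2], [0.5, 0.2], [0.5, 0.3], [0.1, 0.3]]
pixel_box = box_to_pixels(normalized_box, width=1000, height=2000)
rect = bounding_rect(pixel_box)
```

The same normalized box can be projected onto a thumbnail or a full-resolution render simply by passing a different width and height.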

Local testing

You can run the handler locally by setting INPUT_JSON and executing the file, or by using the Runpod testing CLI. This repo also includes .runpod/tests.json for Hub automated tests.

Deployment

Ensure Docker is available and build the image: docker build -t runpod-easyocr .
Push to your registry or connect the repo to Runpod Hub.
Create a release on GitHub to trigger Hub ingestion.

Notes

The worker uses PyMuPDF to render PDF pages to images, avoiding external system dependencies.
The EasyOCR Reader is cached between requests to avoid reloading weights.
Set the default languages via the READER_LANGS environment variable (e.g., ch_sim,en).
Output box coordinates are normalized: x in [0.0,1.0] relative to image width and y in [0.0,1.0] relative to image height.
Pages are orientation-corrected using Tesseract OSD (pytesseract). The base image contains Tesseract and language packs for Thai, Simplified/Traditional Chinese, and English.

