Description
Describe the bug
Parsing a PDF raises an error when OCR_AGENT is set to unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision.
To Reproduce
import os

os.environ["OCR_AGENT"] = "unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision"

from unstructured.partition.pdf import partition_pdf

partition_pdf("fake-memo.pdf", strategy="hi_res")
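Since the report does not include a traceback, one way to narrow the failure down is to check whether the dotted path in OCR_AGENT can be imported at all, separately from the OCR call. The sketch below is a generic check using only the standard library; `resolve_dotted_path` is a hypothetical helper written for this report, not part of unstructured, and it does not reproduce how unstructured itself loads the agent.

```python
import importlib


def resolve_dotted_path(dotted_path: str):
    """Import the module part of a dotted path and return the final attribute.

    Hypothetical diagnostic helper, not an unstructured API.
    """
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted module path: {dotted_path!r}")
    # Raises ImportError (or ModuleNotFoundError) if the module or one of
    # its dependencies cannot be imported.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module has no attribute with that name.
    return getattr(module, attr_name)
```

Calling `resolve_dotted_path(os.environ["OCR_AGENT"])` before `partition_pdf` would show whether the error comes from importing the agent class (for example, a missing optional dependency) or from the partitioning step itself.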
Expected behavior
No error
Environment Info
OS version: Linux-6.8.0-45-generic-x86_64-with-glibc2.39
Python version: 3.11.4
unstructured version: 0.15.14.dev1
unstructured-inference version: 0.7.36
pytesseract is not installed
Torch version: 2.4.1
Detectron2 is not installed
PaddleOCR version: None
Libmagic version: file-5.45
magic file from /etc/magic:/usr/share/misc/magic