Description
In the ocr_interface.py file, it would be nice if the code handled errors raised by importlib.import_module(module_name) in the get_instance(...) function:
@staticmethod
@functools.lru_cache(maxsize=None)
def get_instance(ocr_agent_module: str) -> "OCRAgent":
    module_name, class_name = ocr_agent_module.rsplit(".", 1)
    if module_name in OCR_AGENT_MODULES_WHITELIST:
        module = importlib.import_module(module_name)
        loaded_class = getattr(module, class_name)
        return loaded_class()
    else:
        raise ValueError(
            f"Environment variable OCR_AGENT module name {module_name}, must be set to a"
            f" whitelisted module part of {OCR_AGENT_MODULES_WHITELIST}.",
        )
I was so confused because I kept getting this error from the get_agent(...) function:
ValueError: Environment variable OCR_AGENT must be set to an existing OCR agent module, not unstructured.partition.utils.ocr_models.tesseract_ocr.OCRAgentTesseract.
After hours of digging, it turned out I just hadn't installed pandas lol 🗿
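Here is a rough sketch of the kind of handling I have in mind, not a tested patch against the real file. It is written as a standalone function so it runs on its own. The whitelist contents and the `not_installed_ocr_backend` module name are made up for illustration; the real code would keep its existing whitelist and decorators. The idea is just to wrap the import in a try/except and re-raise with the underlying cause in the message, so a missing dependency is visible right away.

```python
import functools
import importlib

# Hypothetical stand-in for the real whitelist, so this sketch is self-contained.
OCR_AGENT_MODULES_WHITELIST = ["collections", "not_installed_ocr_backend"]


@functools.lru_cache(maxsize=None)
def get_instance(ocr_agent_module: str):
    module_name, class_name = ocr_agent_module.rsplit(".", 1)
    if module_name not in OCR_AGENT_MODULES_WHITELIST:
        raise ValueError(
            f"Environment variable OCR_AGENT module name {module_name}, must be set to a"
            f" whitelisted module part of {OCR_AGENT_MODULES_WHITELIST}."
        )
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # Keep the original cause (e.g. "No module named 'pandas'") in the
        # message and chain the exception so the full traceback is preserved.
        raise ImportError(
            f"Could not import OCR agent module {module_name!r}: {e}"
        ) from e
    loaded_class = getattr(module, class_name)
    return loaded_class()
```

With something like this, a missing package would show up by name in the error instead of being reported as a bad OCR_AGENT value.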