bug/unstructured.paddleocr is not compatible with GPU version of PaddleOCR #3191

Open
@peixin-lin

I get the following error when I set the OCR agent to Paddle and load a GPU model:

    |   File "/usr/local/lib/python3.9/site-packages/unstructured/partition/utils/ocr_models/ocr_interface.py", line 35, in get_agent
    |     return cls.get_instance(ocr_agent_cls_qname)
    |   File "/usr/local/lib/python3.9/site-packages/unstructured/partition/utils/ocr_models/ocr_interface.py", line 49, in get_instance
    |   File "/usr/local/lib/python3.9/site-packages/unstructured_paddleocr/paddle_tools/infer/utility.py", line 314, in get_infer_gpuid
    |     if not paddle.fluid.core.is_compiled_with_rocm():
    | AttributeError: module 'paddle' has no attribute 'fluid'
    |
    | During handling of the above exception, another exception occurred:

which finally leads to the following error:

    |     page_elements = _partition_pdf_or_image_with_ocr_from_image(
    |   File "/usr/local/lib/python3.9/site-packages/unstructured/partition/pdf.py", line 802, in _partition_pdf_or_image_with_ocr_from_image
    |     ocr_agent = OCRAgent.get_agent()
    | ValueError: Environment variable OCR_AGENT must be set to an existing OCR agent module, not unstructured.partition.utils.ocr_models.paddle_ocr.OCRAgentPaddle.
    +------------------------------------
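Since the final ValueError hides the underlying failure, it can help to load the agent class directly and print the full traceback. Below is a minimal sketch of that idea using only the standard library. The helper name `diagnose_agent` is hypothetical, and calling the class with no arguments is an assumption about how the agent is constructed.

```python
import importlib
import traceback


def diagnose_agent(qualified_name, instantiate=False):
    """Try to import `package.module.ClassName` and return the traceback
    text if anything fails, or None if everything succeeds.

    `instantiate=True` also calls the class with no arguments, which is
    an assumption about the agent's constructor.
    """
    module_name, _, class_name = qualified_name.rpartition(".")
    try:
        module = importlib.import_module(module_name)
        cls = getattr(module, class_name)
        if instantiate:
            cls()
    except Exception:
        # Return the complete traceback instead of a summarized error.
        return traceback.format_exc()
    return None


if __name__ == "__main__":
    # Qualified name copied from the ValueError message above.
    report = diagnose_agent(
        "unstructured.partition.utils.ocr_models.paddle_ocr.OCRAgentPaddle",
        instantiate=True,
    )
    print(report or "agent loaded without errors")
```

In an environment like the one described here, the printed report would be expected to end with the AttributeError from `utility.py` rather than the generic OCR_AGENT message.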

I think the problem could be solved by changing line 314 of unstructured_paddleocr/paddle_tools/infer/utility.py from
if not paddle.fluid.core.is_compiled_with_rocm():
to
if not paddle.core.is_compiled_with_rocm():
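A version-tolerant variant of that check could look up the function under several attribute paths instead of hard-coding one. This is only a sketch, not the package's actual code. The helper names are hypothetical, and the list of candidate paths is an assumption: `paddle.is_compiled_with_rocm` is, as far as I know, the public API in recent releases, `base.core` is my understanding of where `fluid.core` moved to, and `core` is the path proposed above.

```python
from types import SimpleNamespace

# Candidate locations, tried in order. Which ones exist depends on the
# installed Paddle version; this list is an assumption, not a guarantee.
_ROCM_CHECK_PATHS = (
    "is_compiled_with_rocm",             # public top-level API (assumed)
    "base.core.is_compiled_with_rocm",   # assumed successor of fluid.core
    "core.is_compiled_with_rocm",        # path proposed in this issue
    "fluid.core.is_compiled_with_rocm",  # path used by the failing line
)


def resolve_attr(root, dotted_path):
    """Follow `a.b.c` from `root`; return None if any step is missing."""
    obj = root
    for name in dotted_path.split("."):
        obj = getattr(obj, name, None)
        if obj is None:
            return None
    return obj


def is_compiled_with_rocm(paddle_module):
    """Return the result of the first ROCm check that can be found.

    Falls back to False (treat the build as non-ROCm) when no candidate
    path exists on the given module.
    """
    for path in _ROCM_CHECK_PATHS:
        fn = resolve_attr(paddle_module, path)
        if callable(fn):
            return bool(fn())
    return False


if __name__ == "__main__":
    # Stand-in for an old-style module that still exposes fluid.core.
    legacy = SimpleNamespace(
        fluid=SimpleNamespace(
            core=SimpleNamespace(is_compiled_with_rocm=lambda: False)
        )
    )
    print(is_compiled_with_rocm(legacy))
```

With Paddle installed, the call would be `is_compiled_with_rocm(paddle)` after `import paddle`, and line 314 would become `if not is_compiled_with_rocm(paddle):`.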

My dependencies:

unstructured             0.14.5
unstructured-client      0.23.3
unstructured-inference   0.7.34
unstructured.paddleocr   2.6.1.3
unstructured.pytesseract 0.3.12
paddleclas               2.5.2
paddleocr                2.7.3
paddlepaddle             2.6.1
paddlepaddle-gpu         2.6.1.post112

Labels: bug, ocr
