Hi,
As I understand it, this project links each text block to a layout type (e.g. text, title, header, etc.), which means it depends on the OCR output to correctly segment the text into blocks.
In my case, I’m using a VGT model that correctly detects two separate phrases — I can clearly see two distinct “text” boxes in the PDF visualization.
However, the OCR merged both phrases into a single block, so in the final analysis I only get one text block labeled as “text.”
Is my understanding correct?
And if so, do you have any suggestions on how to extract the text that corresponds to the visually detected layout without relying on the OCR-defined text boxes?
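To make the question concrete, here is a rough sketch of the kind of assignment I have in mind. It assumes that word-level boxes are available from the OCR or the PDF text layer, and that each word belongs to the detected region containing its center point. All names here (`Box`, `Word`, `words_per_region`) are hypothetical and are not part of this project's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical helper)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """A single word with its own bounding box (assumed to be available)."""
    text: str
    box: Box


def words_per_region(regions: list[Box], words: list[Word]) -> list[str]:
    """Group words by the first detected region that contains their center."""
    grouped: list[list[str]] = [[] for _ in regions]
    for word in words:
        center_x = (word.box.left + word.box.right) / 2
        center_y = (word.box.top + word.box.bottom) / 2
        for index, region in enumerate(regions):
            if region.contains(center_x, center_y):
                grouped[index].append(word.text)
                break  # words outside every region are simply dropped
    return [" ".join(texts) for texts in grouped]


# Two layout boxes as the detector might report them, plus four words
# that the OCR had merged into one block.
regions = [Box(0, 0, 100, 20), Box(0, 30, 100, 50)]
words = [
    Word("First", Box(5, 5, 40, 15)),
    Word("phrase", Box(45, 5, 90, 15)),
    Word("Second", Box(5, 35, 45, 45)),
    Word("phrase", Box(50, 35, 90, 45)),
]
result = words_per_region(regions, words)
print(result)
```

With this toy input, each detected box ends up with its own phrase instead of a single merged block. Words are kept in input order, so this only works if the word list is already in reading order.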
Thanks in advance!