Welcome to OCRmyPDF Discussions! #1160

jbarlow83 · 2023-09-30T20:24:19Z

jbarlow83
Sep 30, 2023
Maintainer

👋 Welcome!

We’re using Discussions as a place to connect with other members of our community. We hope that you:

Ask questions you’re wondering about.
Share ideas.
Engage with other community members.
Welcome others and are open-minded. Remember that this is a community we build together 💪.

To get started, comment below with an introduction of yourself and tell us about what you do with this community.

bikramnayak · 2025-04-02T11:38:01Z

bikramnayak
Apr 2, 2025

I’m working with a scanned PDF that contains a table with two columns, where each column has two lines of text. When I convert the scanned PDF using OCRmyPDF, I’m encountering an issue with the resulting content. Tesseract processes the text line by line, but this causes OCRmyPDF to generate separate spans for each piece of content. Specifically, it creates a span for row 1, cell 1, then another span for row 1, cell 2, followed by separate spans for row 2, cell 1, and row 2, cell 2. This results in accessibility problems for screen readers, as the content is not structured properly. Is there any way to resolve this issue and ensure the table is interpreted correctly by screen readers?
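To make the ordering described above concrete, here is a minimal sketch in plain Python. It does not call OCRmyPDF or Tesseract, and it is not a fix: the span labels, the `flat_spans` list, and the `group_into_rows` helper are all hypothetical, chosen only to mirror the four spans named in the post. It shows the flat sequence a reader would encounter, next to the row and cell grouping that the flat sequence does not carry.

```python
# Illustration only: hypothetical data, not real OCRmyPDF or Tesseract output.

def group_into_rows(spans, columns):
    """Chunk a flat, line-ordered list of spans into rows of `columns` cells."""
    if columns <= 0:
        raise ValueError("columns must be positive")
    return [spans[i:i + columns] for i in range(0, len(spans), columns)]


# Spans in the order described in the post: one per cell, emitted line by line.
flat_spans = [
    "row 1, cell 1",
    "row 1, cell 2",
    "row 2, cell 1",
    "row 2, cell 2",
]

# The grouping below is what the flat sequence lacks: nothing in the list
# itself records that there are two columns, so that is passed in by hand.
rows = group_into_rows(flat_spans, columns=2)

for row in rows:
    print(" | ".join(row))
```

The column count has to be supplied by hand here, which is the point of the sketch: a flat run of spans does not say where one row ends and the next begins.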
