Does markitdown support scanned PDFs? #1361

mindphil · 2025-07-18T15:21:39Z

I'm working on a program that renames documents based on the files' metadata, as well as context extracted from PDFs. PyPDF works well enough for native PDFs but struggles with anything scanned. How does markitdown handle scanned documents?

cuicheng01 · 2025-09-15T03:33:58Z

Perhaps PaddleOCR is a better choice, as it can achieve state-of-the-art accuracy in the analysis of scanned PDFs. https://www.paddleocr.ai/latest/en/version3.x/algorithm/PP-StructureV3/PP-StructureV3.html
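
For the renaming use case in the question, one way to combine both tools is to try the PDF's text layer first and fall back to OCR only when it is missing. The following is a minimal sketch, not something markitdown or PaddleOCR provides: `has_text_layer`, `extract_page_texts`, `extract_text`, and the 20-characters-per-page threshold are hypothetical names and values chosen for illustration, and the `ocr` argument is a placeholder for whatever OCR backend you wire in (PaddleOCR, for example).

```python
from typing import Callable, Iterable, List


def has_text_layer(page_texts: Iterable[str], min_chars_per_page: int = 20) -> bool:
    """Heuristic: True if the pages average at least `min_chars_per_page`
    non-whitespace-trimmed characters of extracted text.
    The threshold is an assumption; tune it on your own documents."""
    texts = list(page_texts)
    if not texts:
        return False
    total = sum(len((t or "").strip()) for t in texts)
    return total / len(texts) >= min_chars_per_page


def extract_page_texts(pdf_path: str) -> List[str]:
    """Read the embedded text layer of each page with pypdf."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    # extract_text() can yield an empty string for image-only pages
    return [page.extract_text() or "" for page in reader.pages]


def extract_text(pdf_path: str, ocr: Callable[[str], str]) -> str:
    """Use the text layer when present; otherwise call the supplied OCR function.
    `ocr` is a caller-provided callable (hypothetical), taking a path and
    returning the recognized text."""
    texts = extract_page_texts(pdf_path)
    if has_text_layer(texts):
        return "\n".join(texts)
    return ocr(pdf_path)
```

Keeping the OCR step behind a plain callable means the slower OCR pass only runs on documents that actually need it, and the backend can be swapped without touching the routing logic.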
