fix: avoid RapidOCR torch fallback in default OCR setup#3581

Open
zeukid wants to merge 1 commit into docling-project:main from zeukid:main

@zeukid zeukid commented Jun 10, 2026

Summary

Fixes the default Docling OCR setup to use RapidOCR with ONNXRuntime instead of accidentally falling back to the torch backend when torch is present.

The torch backend remains available when explicitly requested with RapidOcrOptions(backend="torch"), but Auto OCR no longer selects it implicitly.

Issue resolved by this Pull Request:
Resolves #3580

Changes

  • Updated docling-slim[standard] to include feat-ocr-rapidocr-onnx.
  • Removed RapidOCR torch from Auto OCR fallback selection.
  • Added regression tests for Auto OCR backend selection.
  • Added packaging coverage to ensure the standard extra includes the ONNX RapidOCR path.
  • Regenerated uv.lock.
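
As a rough illustration of the selection rule described above, the sketch below is not the actual Docling code: `module_available` and `select_rapidocr_backend` are hypothetical names, and the real implementation may be structured differently. It only shows the intended behavior, where an explicit backend request is honored and Auto OCR picks ONNXRuntime or nothing.

```python
import importlib.util
from typing import Callable, Optional


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found for import."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def select_rapidocr_backend(
    requested: Optional[str] = None,
    is_available: Callable[[str], bool] = module_available,
) -> Optional[str]:
    """Hypothetical helper: choose a RapidOCR backend for Auto OCR.

    An explicitly requested backend (for example "torch") is returned as-is.
    Without a request, only "onnxruntime" is considered; torch is never
    chosen implicitly, even when it is installed.
    """
    if requested is not None:
        return requested
    if is_available("onnxruntime"):
        return "onnxruntime"
    # No implicit torch fallback: the caller moves on to another OCR engine.
    return None
```

Passing `is_available` as a parameter keeps the rule testable without installing or uninstalling either runtime.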

github-actions Bot commented Jun 10, 2026

Contributor

DCO Check Passed

Thanks @zeukid, all your commits are properly signed off. 🎉

mergify Bot commented Jun 10, 2026

Contributor

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
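
To check a title locally before pushing, a minimal sketch follows. The pattern is copied from the rule above; the helper name `title_is_conventional` is hypothetical, and using `re.match` is an assumption about how the `~=` operator is evaluated.

```python
import re

# Pattern copied verbatim from the merge protection rule above.
TITLE_RULE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def title_is_conventional(title: str) -> bool:
    """Return True if `title` starts with a conventional commit prefix."""
    return TITLE_RULE.match(title) is not None
```

The check is case-sensitive, so a title starting with `Fix:` would not pass.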

…l use RapidOCR with ONNXRuntime instead of accidentally falling back to the torch backend when torch is present.

Signed-off-by: zeukid <62079465+zeukid@users.noreply.github.com>

codecov Bot commented Jun 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@dolfim-ibm dolfim-ibm left a comment

Member

@zeukid can you please explain the reason for this PR?

Until now, Docling has avoided installing onnxruntime unless it is explicitly requested via an extra. We already have torch as a default dependency and would like to avoid adding a second runtime.

This is the reason for the fallback in the OCR selection.

Development

Successfully merging this pull request may close these issues.

Can't run latest docling because RapidOCR can't find arch_config.yaml