fix: avoid crashes on OCR JSON error payloads #1880

Open
DhruvrajSinhZala24 wants to merge 1 commit into openfoodfacts:main from DhruvrajSinhZala24:fix/1752-handle-ocr-resource-exhausted

@DhruvrajSinhZala24

Problem

Robotoff can occasionally receive OCR JSON payloads that contain an error object (e.g. quota exhaustion). In those cases, some OCR-dependent code paths can raise and crash the job instead of treating OCR as unavailable.

Fixes #1752

Root cause

openfoodfacts.ocr.OCRResult.from_json raises OCRParsingException when responses[0].error is present. Some Robotoff call sites relied on OCR fetching/decoding paths that could propagate these exceptions (notably via URL-based OCR inputs), causing worker jobs to fail.
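
To make the failure mode concrete, here is a minimal sketch of the kind of payload involved and of a parser that rejects it. `parse_ocr_payload` is a simplified, hypothetical stand-in for `OCRResult.from_json`, not the library code, and the payload values (code `8`, the message text) are illustrative assumptions.

```python
class OCRParsingException(Exception):
    """Local stand-in for the exception of the same name in openfoodfacts.ocr."""


def parse_ocr_payload(data: dict) -> dict:
    """Hypothetical, simplified stand-in for OCRResult.from_json.

    Returns the first response, or raises if the payload carries an error.
    """
    responses = data.get("responses") or []
    if not responses:
        raise OCRParsingException("no responses in OCR JSON")
    response = responses[0]
    if "error" in response:
        # This is the branch that used to bubble up and crash the job.
        raise OCRParsingException(f"error in OCR response: {response['error']}")
    return response


# Illustrative error payload (values are assumptions, not copied from a real log).
error_payload = {
    "responses": [
        {"error": {"code": 8, "message": "Quota exceeded"}}
    ]
}

try:
    parse_ocr_payload(error_payload)
except OCRParsingException as exc:
    print(f"parsing failed: {exc}")
```

Any caller that does not catch this exception fails the whole job, which is the behavior this PR changes.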

Changes

  • Add robotoff.utils.ocr.get_ocr_result_from_url to consistently handle OCR download/parsing failures:
    • return None when error_raise=False
    • preserve raising behavior when error_raise=True
  • Use the helper in OCR consumers to avoid crashing on error payloads.
  • Update the ingredient extraction worker path to fetch OCR defensively and pass an OCRResult to the model code, so OCR failures are handled as a non-fatal “missing OCR” case.
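
The `error_raise` contract described above can be sketched roughly as follows. This is not the code from the PR: `get_ocr_result_from_url_sketch`, the injected `fetch_json` callable, and the choice of caught exception types are all assumptions made so the example runs without network access or the `openfoodfacts` package.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


class OCRParsingException(Exception):
    """Local stand-in for the exception of the same name in openfoodfacts.ocr."""


def parse_ocr_payload(data: dict) -> dict:
    """Hypothetical, simplified stand-in for OCRResult.from_json."""
    responses = data.get("responses") or []
    if not responses or "error" in responses[0]:
        raise OCRParsingException("missing or erroneous OCR response")
    return responses[0]


def get_ocr_result_from_url_sketch(
    ocr_url: str,
    fetch_json: Callable[[str], dict],  # assumption: injected downloader
    error_raise: bool = True,
) -> Optional[dict]:
    """Fetch and parse an OCR JSON, optionally swallowing failures."""
    try:
        return parse_ocr_payload(fetch_json(ocr_url))
    except (OSError, ValueError, OCRParsingException) as exc:
        # OSError covers connection failures, ValueError covers invalid JSON.
        if error_raise:
            raise
        logger.warning("OCR unavailable for %s: %s", ocr_url, exc)
        return None
```

With `error_raise=False`, callers only need to check for `None`, so download failures, invalid JSON, and error payloads all collapse into one "OCR unavailable" case.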

Behavior

  • Valid OCR payloads: unchanged.
  • OCR payloads with responses[0].error (or other non-fatal OCR fetch/parse failures): the job logs and exits early instead of crashing.
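
On the consumer side, the "log and exit early" behavior might look like the sketch below. `run_ingredient_extraction_sketch` is a hypothetical name, and the comma split is only a placeholder for the real model call, which is not reproduced here.

```python
import logging
from typing import List, Optional

logger = logging.getLogger(__name__)


def run_ingredient_extraction_sketch(
    ocr_result: Optional[dict], barcode: str
) -> List[str]:
    """Hypothetical worker step: treat a missing OCR result as non-fatal."""
    if ocr_result is None:
        # OCR was unavailable (download failure or error payload): log and stop.
        logger.info("OCR unavailable for product %s, skipping extraction", barcode)
        return []
    # Placeholder for the model call; the real job passes an OCRResult to the model.
    text = ocr_result.get("fullTextAnnotation", {}).get("text", "")
    return [part.strip() for part in text.split(",") if part.strip()]
```

The valid-payload path is untouched; only the `None` case gains an early return.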

Tests

  • pytest -q tests/unit/utils/test_ocr.py

Avoid crashing OCR-dependent jobs when Product Opener returns an OCR JSON containing an error payload (e.g. quota exhaustion).

Adds a small helper around OCR fetching and a regression test.

Refs: openfoodfacts#1752