refactor(object-detection): vectorize postprocessing#1884

Open
DhruvrajSinhZala24 wants to merge 1 commit into openfoodfacts:main from DhruvrajSinhZala24:fix/1706-optimize-object-detection-postprocess

Conversation

@DhruvrajSinhZala24

Summary

This PR improves the object detection hot path by reducing Python-side postprocessing overhead in Robotoff's wrapper around the upstream detector.

The original #1706 report pointed at ML performance issues. On the current code, I could not reproduce the historical preprocessing slowdown described in the issue, but I did confirm that object detection postprocessing is still loop-heavy and measurably slow. This change targets that current bottleneck.

What Changed

  • added an OptimizedObjectDetector in Robotoff's object detection wrapper
  • vectorized class, score, and bounding-box extraction from Triton outputs
  • cached the Albumentations preprocessing transform per detector instance
  • kept NMS behavior and timing metrics unchanged
  • lazy-loaded visualization utilities so the plotting stack is only imported when output_image=True
  • updated RemoteModel to reuse the optimized detector instance
  • added unit tests to verify parity with the upstream detector postprocess output
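The core idea is replacing per-candidate Python loops with whole-array NumPy operations. A minimal sketch of what the vectorized extraction can look like, assuming a YOLO-style output layout where the first four rows are box coordinates and the remaining rows are per-class scores (the function name, shapes, and threshold are illustrative, not the actual Robotoff code):

```python
import numpy as np

def postprocess_vectorized(raw_output, score_threshold=0.25):
    """Extract class ids, scores, and boxes from a YOLO-shaped output.

    raw_output: array of shape (4 + num_classes, num_candidates);
    rows 0-3 are box coordinates (cx, cy, w, h), the rest are
    per-class scores. Shapes and names here are assumptions.
    """
    boxes = raw_output[:4].T                  # (num_candidates, 4)
    class_scores = raw_output[4:]             # (num_classes, num_candidates)
    class_ids = class_scores.argmax(axis=0)   # best class per candidate
    scores = class_scores.max(axis=0)         # score of that class
    keep = scores >= score_threshold          # single boolean mask, no loop
    return class_ids[keep], scores[keep], boxes[keep]
```

Every step is a single NumPy reduction or mask over the whole candidate axis, so the cost stays in compiled code regardless of how many candidates the model emits.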

Why

The previous implementation still relied on Python loops over every candidate detection during postprocessing. For object detection outputs with many candidates, that adds avoidable CPU overhead on the Robotoff side.

This PR keeps the same output format and behavior while moving the expensive extraction work to NumPy.
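For illustration, here is the kind of per-candidate loop this replaces, next to its vectorized equivalent. Both functions are hypothetical stand-ins rather than the actual Robotoff code, but they produce identical outputs, which is the property the parity tests guard:

```python
import numpy as np

def extract_loop(class_scores, threshold):
    # Per-candidate Python loop: interpreter overhead on every candidate
    ids, scores = [], []
    for j in range(class_scores.shape[1]):
        best = int(class_scores[:, j].argmax())
        score = float(class_scores[best, j])
        if score >= threshold:
            ids.append(best)
            scores.append(score)
    return np.array(ids), np.array(scores)

def extract_vectorized(class_scores, threshold):
    # Same result, computed with whole-array NumPy operations
    ids = class_scores.argmax(axis=0)
    scores = class_scores.max(axis=0)
    keep = scores >= threshold
    return ids[keep], scores[keep]
```

Because the filtering condition and the argmax are unchanged, the output format is byte-for-byte the same; only where the work happens moves.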

Validation

Ran:

  • ./.venv/bin/pytest tests/unit/prediction/object_detection/test_core.py
  • ./.venv/bin/flake8 robotoff/prediction/object_detection/core.py tests/unit/prediction/object_detection/test_core.py

Local synthetic benchmark on an 8,400-candidate YOLO-shaped output:

  • postprocess without NMS: 6.43 ms -> 0.17 ms
  • postprocess with NMS: 6.83 ms -> 0.60 ms
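A benchmark of this shape can be reproduced with a small harness along these lines. The numbers above are machine-dependent; the array shape mirrors the 8,400-candidate synthetic output, and all names in this sketch are illustrative:

```python
import time
import numpy as np

def postprocess(raw, threshold=0.25):
    # Vectorized extraction: boxes, best class id, and score per candidate
    scores = raw[4:].max(axis=0)
    keep = scores >= threshold
    return raw[:4].T[keep], raw[4:].argmax(axis=0)[keep], scores[keep]

def bench(fn, *args, repeats=50):
    # Average wall-clock milliseconds per call over `repeats` runs
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats * 1000

# Synthetic YOLO-shaped output: 4 box rows + 80 class rows, 8400 candidates
rng = np.random.default_rng(0)
raw = rng.random((84, 8400))

print(f"postprocess (no NMS): {bench(postprocess, raw):.2f} ms")
```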

Notes

  • This PR is intentionally scoped to Robotoff's wrapper layer.
  • I did not run the full ML integration test suite because that requires Triton and the models to be running locally.
  • If maintainers want, a follow-up could upstream the same optimization into the openfoodfacts Python package to avoid duplication.

@DhruvrajSinhZala24 DhruvrajSinhZala24 requested a review from a team as a code owner April 6, 2026 07:27
@DhruvrajSinhZala24 DhruvrajSinhZala24 changed the title perf(object-detection): vectorize postprocessing refactor(object-detection): vectorize postprocessing Apr 6, 2026
