-
Notifications
You must be signed in to change notification settings - Fork 316
Description
Description
Several KerasHub models fail to convert properly to TensorFlow Lite, producing dramatically different outputs (up to 1000% difference) while their tests pass due to excessively loose validation thresholds.
Affected Models
- D-Fine Object Detector - 500% difference in intermediate predictions
- RetinaNet Object Detector - 500% difference in box predictions
- DeepLabV3 Segmenter - 60% difference in segmentation masks
- SegFormer Segmenter - 1000% difference (completely broken)
Steps to Reproduce
-
Run the LiteRT export test for D-Fine (it passes due to loose thresholds):
pytest keras_hub/src/models/d_fine/d_fine_object_detector_test.py::DFineObjectDetectorTest::test_litert_export -v --run_large
-
Examine the test thresholds in the code - they allow up to 500% difference:
output_thresholds={ "intermediate_predicted_corners": {"max": 5.0, "mean": 0.05}, # 500% allowed! "intermediate_logits": {"max": 5.0, "mean": 0.1}, # 500% allowed! # ... etc }
-
To demonstrate the issue: temporarily change thresholds to reasonable values (< 0.01) and run the test - it will fail showing the actual conversion problems.
Expected Behavior
LiteRT converted models should produce outputs within reasonable tolerance (< 1-5%) of original TensorFlow models.
Actual Behavior
Models produce outputs with differences of 60-1000%, indicating fundamental conversion incompatibilities.
Additional Context
These models use complex operations and architectures that don't translate properly to TensorFlow Lite. The loose thresholds mask critical conversion failures that could affect production deployments.
Files to Investigate
keras_hub/src/models/d_fine/d_fine_object_detector.pykeras_hub/src/models/retinanet/retinanet_object_detector.pykeras_hub/src/models/deeplab_v3/deeplab_v3_segmenter.pykeras_hub/src/models/segformer/segformer_image_segmenter.py