[onert] Performance Regression: Latency Increase & Memory Reduction with MobileNetV2 #15362

@ragmani

Description

We previously saw the following results for MobileNetV2 on CPU:
Prepare : 10.193 ms
Avg I/O : 0.081 ms
Avg Run : 10.520 ms
RSS EXECUTE : 72 160 KB

Today’s results show:
Prepare : 10.606 ms
Avg I/O : 0.084 ms
Avg Run : 21.962 ms
RSS EXECUTE : 37 484 KB

Measurement Reference

Steps to Reproduce

  1. Run
python3 runtime/onert/sample/minimal-python/inference_benchmark.py mobilenetv2 --backends cpu --input-shape 1,224,224,3 --repeat 100
  2. Compare the “Avg Run” and “RSS EXECUTE” values to previous runs
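To make the comparison in the last step less error-prone, the reported lines can be parsed into numbers before comparing them. The following is only a sketch: it assumes the benchmark prints lines in the `Name : value unit` form quoted in this issue, and `parse_metrics` is a hypothetical helper, not part of onert.

```python
import re

# Matches lines such as "Avg Run : 21.962 ms" or "RSS EXECUTE : 37 484 KB".
# The line format is assumed from the results quoted in this issue.
METRIC_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z/ ]+?)\s*:\s*(?P<value>[\d\s.,]+?)\s*(?P<unit>ms|KB)\s*$"
)


def parse_metrics(text):
    """Return {metric name: (numeric value, unit)} for every recognized line."""
    metrics = {}
    for line in text.splitlines():
        match = METRIC_LINE.match(line)
        if match is None:
            continue  # skip headers, blank lines, and anything unrecognized
        # Drop digit-group separators (spaces, commas) before converting.
        digits = re.sub(r"[^\d.]", "", match.group("value"))
        metrics[match.group("name")] = (float(digits), match.group("unit"))
    return metrics


if __name__ == "__main__":
    sample = """Prepare : 10.606 ms
Avg I/O : 0.084 ms
Avg Run : 21.962 ms
RSS EXECUTE : 37 484 KB
"""
    for name, (value, unit) in parse_metrics(sample).items():
        print(f"{name}: {value} {unit}")
```

Saving the output of each run to a file and feeding it through this helper would give two dictionaries that can be compared key by key.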

Expected Behavior

  • Avg Run should remain close to ~10.5 ms
  • RSS EXECUTE should remain close to ~72 MB

Actual Behavior

  • Avg Run has roughly doubled, from ~10.5 ms to ~22 ms (about 2.1×)
  • RSS EXECUTE has dropped by roughly 48%, from ~72 MB to ~37 MB
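For reference, the relative changes can be computed directly from the two sets of numbers reported above. This is a minimal sketch; the dictionary keys and the `relative_change` helper are illustrative names, not part of the benchmark script.

```python
# Values copied from the two measurements reported in this issue.
baseline = {"Avg Run (ms)": 10.520, "RSS EXECUTE (KB)": 72160}
current = {"Avg Run (ms)": 21.962, "RSS EXECUTE (KB)": 37484}


def relative_change(old, new):
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


changes = {name: relative_change(baseline[name], current[name]) for name in baseline}

for name, pct in changes.items():
    print(f"{name}: {baseline[name]} -> {current[name]} ({pct:+.1f}%)")
```

With these inputs, Avg Run comes out at roughly +109% and RSS EXECUTE at roughly -48%.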

Impact

This regression could indicate a change in memory allocation strategy or an unintended slowdown in the execution path. It needs investigation.
