[onert/python] Support inference benchmark by ragmani · Pull Request #15192 · Samsung/ONE

ragmani · 2025-04-21T05:39:12Z

This commit adds inference benchmark sample and latency measurement support.

- Added `inference_benchmark.py` sample for measuring inference latency and memory usage
  - Supports static shape override via `--input-shape`
  - Measures MODEL_LOAD / PREPARE / EXECUTE / PEAK memory (RSS) and I/O / run latency
- Updated `session.infer()` API to optionally return latency metrics using `measure=True`
- Fixed potential memory accumulation in `set_inputs()` and `set_outputs()` by clearing internal buffers each call
- Added `_time_block()` context manager for clean latency measurement implementation
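For readers unfamiliar with the pattern, here is a minimal sketch of how a timing context manager and a `measure=True` switch can fit together. It is not the code from this PR: `time_block`, `infer`, `run_fn`, and the metric keys `io_time_ms` / `run_time_ms` are hypothetical names chosen for illustration, and the real `session.infer()` signature may differ.

```python
import time
from contextlib import contextmanager


@contextmanager
def time_block(metrics, key):
    """Add the wall-clock time of the enclosed block, in ms, to metrics[key]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        metrics[key] = metrics.get(key, 0.0) + elapsed_ms


def infer(run_fn, inputs, measure=False):
    """Hypothetical stand-in for session.infer(); not the onert API.

    Returns outputs, or (outputs, metrics) when measure=True.
    """
    metrics = {}
    with time_block(metrics, "io_time_ms"):
        staged = list(inputs)      # stands in for copying inputs into buffers
    with time_block(metrics, "run_time_ms"):
        outputs = run_fn(staged)   # stands in for the actual model execution
    return (outputs, metrics) if measure else outputs
```

Calling `infer(fn, data)` returns only the outputs, while `infer(fn, data, measure=True)` also returns the per-phase timings, so existing callers are unaffected by the extra flag.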

ONE-DCO-1.0-Signed-off-by: ragmani <ragmani0216@gmail.com>


ragmani · 2025-04-21T05:39:39Z

For #15172
Draft #15176

ragmani · 2025-04-21T06:14:51Z

python3 runtime/onert/sample/minimal-python/src/inference_benchmark.py mobilenetv2 --backends cpu --input-shape 1,224,224,3 --repeat 100
======= Inference Benchmark =======
- Warmup runs   : 3
- Measured runs : 100
- Prepare       : 10.193 ms
- Avg I/O       : 0.081 ms
- Avg Run       : 10.520 ms
===================================
RSS
- MODEL_LOAD    : 15112 KB
- PREPARE       : 42280 KB
- EXECUTE       : 72160 KB
- PEAK          : 72160 KB
===================================
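The command line and the "Warmup runs" / "Measured runs" / "Avg Run" lines above suggest a parse-then-loop structure. The sketch below shows one way that could look. Only the flag names `--backends`, `--input-shape`, and `--repeat` come from the command above; the defaults, the `warmup=3` parameter, and the helper names `parse_shape`, `build_parser`, and `benchmark` are assumptions, not the contents of `inference_benchmark.py`.

```python
import argparse
import time


def parse_shape(text):
    """Turn a string such as '1,224,224,3' into [1, 224, 224, 3]."""
    return [int(dim) for dim in text.split(",")]


def build_parser():
    # Flag names mirror the command shown above; defaults are assumed.
    parser = argparse.ArgumentParser(description="Inference benchmark (sketch)")
    parser.add_argument("model")
    parser.add_argument("--backends", default="cpu")
    parser.add_argument("--input-shape", type=parse_shape, default=None)
    parser.add_argument("--repeat", type=int, default=1)
    return parser


def benchmark(run_once, repeat, warmup=3):
    """Run `warmup` untimed calls, then return the mean latency in ms
    over `repeat` timed calls."""
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    for _ in range(warmup):
        run_once()
    total_ms = 0.0
    for _ in range(repeat):
        start = time.perf_counter()
        run_once()
        total_ms += (time.perf_counter() - start) * 1000.0
    return total_ms / repeat
```

Excluding the warmup calls from the average keeps one-time costs (lazy allocation, cache fills) out of the reported per-run latency.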

ragmani · 2025-04-21T08:56:27Z

python3 runtime/onert/sample/minimal-python/src/inference_benchmark.py mobilenetv2 --backends cpu --input-shape 1,224,224,3 --repeat 10
======= Inference Benchmark =======
- Warmup runs   : 3
- Measured runs : 10
- Prepare       : 10.814 ms
- Avg I/O       : 0.082 ms
- Avg Run       : 10.831 ms
===================================
RSS
- MODEL_LOAD    : 15068 KB
- PREPARE       : 41728 KB
- EXECUTE       : 41856 KB
- PEAK          : 41856 KB
===================================

Note that the above RSS figures exclude the memory footprint of both the ONERT library and the Python libraries. The ONERT library itself consumes roughly 3 MB.
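The comment does not say how the library footprint is excluded. One plausible approach, sketched here purely as an assumption, is to record a baseline RSS after the imports and report each phase relative to it. `parse_status_kb`, `current_rss_kb`, and `RssTracker` are hypothetical names, and reading `/proc/self/status` is Linux-specific.

```python
def parse_status_kb(status_text, key):
    """Return the KB value of a /proc/<pid>/status field such as VmRSS."""
    for line in status_text.splitlines():
        if line.startswith(key + ":"):
            return int(line.split()[1])
    raise KeyError(key)


def current_rss_kb():
    """Current resident set size of this process in KB (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_status_kb(f.read(), "VmRSS")


class RssTracker:
    """Records RSS per phase relative to a baseline taken at construction.

    Assumed design: the baseline is read once, after the libraries are
    already loaded, so their footprint drops out of every later reading.
    """

    def __init__(self, read_rss=current_rss_kb):
        self._read = read_rss
        self._baseline = read_rss()
        self.phases = {}

    def mark(self, phase):
        self.phases[phase] = self._read() - self._baseline

    def peak(self):
        # Largest value among the marked phases, not a true high-water mark.
        return max(self.phases.values())
```

A tracker would be created right after the imports, with `mark("MODEL_LOAD")`, `mark("PREPARE")`, and `mark("EXECUTE")` called after the corresponding steps.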

ragmani · 2025-04-22T11:36:32Z

@Samsung/one_onert PTAL

glistening

LGTM

glistening · 2025-04-23T01:20:13Z

@ragmani Could you please fix the typo in the title?

ragmani added the PR/ready for review label Apr 21, 2025

ragmani requested a review from a team April 21, 2025 05:39

chunseoklee reviewed Apr 21, 2025


Comment thread runtime/onert/sample/minimal-python/src/inference_benchmark.py Outdated

Fix execution time

10db1c4

chunseoklee approved these changes Apr 21, 2025


glistening approved these changes Apr 23, 2025


glistening merged commit c51e1c6 into Samsung:master Apr 23, 2025
10 checks passed

ragmani changed the title ~~[onert/python] Support infernece benchmark~~ [onert/python] Support inference benchmark Apr 23, 2025

ragmani mentioned this pull request Apr 23, 2025

[onert] Enhance the Python inference API #15172

Closed

ragmani mentioned this pull request May 13, 2025

[onert] Performance Regression: Latency Increase & Memory Reduction with MobileNetV2 #15362

Closed

glistening merged 2 commits into Samsung:master from ragmani:onert/python/infer_benchmark
