[onert] Simplify NPU execution with blocking runNPU_model #16413

Merged
hseok-oh merged 1 commit into Samsung:master from batcheu:remove_threading_workaround
Mar 4, 2026
@batcheu
Contributor

@batcheu batcheu commented Feb 27, 2026

Replace the multi-step NPU request workflow with a single blocking API call.
This change removes the complex threading mechanism used to handle submitNPU_request timeout and thread-safety issues.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
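
The shape of this change can be illustrated with a minimal, language-neutral sketch. The real code is C++ inside onert and calls into trix-engine; everything below (`FakeNpu`, `run_model`, the timeout value, and the exact structure of the old workaround) is a hypothetical stand-in chosen for illustration, not the trix-engine API or the code touched by this PR. It contrasts a worker-thread-with-timeout pattern against a single blocking call.

```python
import threading
import time


class FakeNpu:
    """Hypothetical stand-in for an NPU device; not the trix-engine API."""

    def run_model(self, inputs):
        # Pretend the device blocks until inference is complete.
        time.sleep(0.001)
        return [x * 2 for x in inputs]


def run_with_thread_workaround(npu, inputs, timeout_s=1.0):
    """Before (sketch): run the request on a worker thread so the caller
    can give up if the device call never returns."""
    result = {}

    def worker():
        result["out"] = npu.run_model(inputs)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    if t.is_alive():
        # The worker is left hanging; the caller only reports the failure.
        raise TimeoutError("NPU request did not finish in time")
    return result["out"]


def run_blocking(npu, inputs):
    """After (sketch): one blocking call, no extra thread or timeout."""
    return npu.run_model(inputs)


if __name__ == "__main__":
    npu = FakeNpu()
    print(run_with_thread_workaround(npu, [1, 2, 3]))
    print(run_blocking(npu, [1, 2, 3]))
```

In this sketch the blocking variant drops the thread creation, the join, and the shared result slot, which is the kind of per-inference overhead a simplification like this one is meant to remove. It is only safe when the underlying call is guaranteed to return.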

@batcheu batcheu marked this pull request as draft February 27, 2026 06:20
@batcheu
Contributor Author

batcheu commented Feb 27, 2026

The workaround is no longer required for DevContext because a patch that solves the timeout issue has been applied to trix-engine.

@batcheu
Contributor Author

batcheu commented Mar 3, 2026

Removing this workaround reduces model inference time by roughly 200 us for a single batch (about 330 us for a batch of 4) on a Cortex-A72 @ 1.8 GHz.

onert_run execution time comparison

  • Single Batch
    • Before : 2.569 ms

      raw data
      $ onert_run -r 10 model.circle
      ===================================
      MODEL_LOAD   takes 2.990 ms
      PREPARE      takes 16.813 ms
      EXECUTE      takes 2.569 ms
      - MEAN     :  2.569 ms
      - MAX      :  3.555 ms
      - MIN      :  2.395 ms
      - GEOMEAN  :  2.551 ms
      ===================================
      
    • After : 2.365 ms

      raw data
      $ onert_run -r 10 model.circle
      ===================================
      MODEL_LOAD   takes 2.982 ms
      PREPARE      takes 16.882 ms
      EXECUTE      takes 2.365 ms
      - MEAN     :  2.365 ms
      - MAX      :  3.302 ms
      - MIN      :  2.179 ms
      - GEOMEAN  :  2.347 ms
      ===================================
      
  • N-Batch (N : 4)
    • Before : 5.762 ms

      raw data
      $ onert_run -r 10 model.batch_4.circle
      ===================================
      MODEL_LOAD   takes 3.012 ms
      PREPARE      takes 17.089 ms
      EXECUTE      takes 5.762 ms
      - MEAN     :  5.762 ms
      - MAX      :  7.107 ms
      - MIN      :  5.536 ms
      - GEOMEAN  :  5.746 ms
      ===================================
      
    • After : 5.432 ms

      raw data
      $ onert_run -r 10 model.batch_4.circle
      ===================================
      MODEL_LOAD   takes 2.997 ms
      PREPARE      takes 16.888 ms
      EXECUTE      takes 5.432 ms
      - MEAN     :  5.432 ms
      - MAX      :  6.688 ms
      - MIN      :  5.209 ms
      - GEOMEAN  :  5.418 ms
      ===================================
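
The savings implied by the mean EXECUTE times above can be checked with a few lines of arithmetic. This is a small sketch that only uses the numbers reported in this comment; the helper name `improvement` is made up for the example.

```python
# Mean EXECUTE times (ms) copied from the onert_run output above,
# stored as (before, after) pairs.
RESULTS_MS = {
    "single batch": (2.569, 2.365),
    "batch of 4": (5.762, 5.432),
}


def improvement(before_ms, after_ms):
    """Return (saved microseconds, relative reduction in percent)."""
    saved_us = (before_ms - after_ms) * 1000.0
    percent = (before_ms - after_ms) / before_ms * 100.0
    return saved_us, percent


if __name__ == "__main__":
    for name, (before, after) in RESULTS_MS.items():
        saved_us, percent = improvement(before, after)
        print(f"{name}: {saved_us:.0f} us saved ({percent:.1f}% lower)")
    # single batch: 204 us saved (7.9% lower)
    # batch of 4: 330 us saved (5.7% lower)
```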
      

@batcheu batcheu marked this pull request as ready for review March 3, 2026 04:39
@batcheu batcheu requested a review from ragmani March 3, 2026 04:44
@ragmani
Contributor

ragmani commented Mar 4, 2026

Sorry for the late reply.
I'd like to confirm whether the hanging issue has been resolved. Do these changes ensure that threads no longer become unresponsive? In other words, is there no longer any need for the previous workaround of allowing hanging threads?

@batcheu
Contributor Author

batcheu commented Mar 4, 2026

> Sorry for the late reply. I'd like to confirm whether the hanging issue has been resolved. Do these changes ensure that threads no longer become unresponsive? In other words, is there no longer any need for the previous workaround of allowing hanging threads?

Yes, there's no hanging issue with the latest version of trix-engine.

Contributor

@ragmani ragmani left a comment

LGTM

Contributor

@hseok-oh hseok-oh left a comment

LGTM

@hseok-oh hseok-oh merged commit 70d7c57 into Samsung:master Mar 4, 2026
10 checks passed
@batcheu batcheu deleted the remove_threading_workaround branch March 5, 2026 06:31