[NNPA] Do not lower ONNX ops whose inputs are scalar to ZHigh by tungld · Pull Request #3393 · onnx/onnx-mlir

tungld · 2026-02-13T05:56:19Z

This PR stops lowering ONNX ops whose inputs are scalar to ZHigh, so that they run on the CPU instead. To add conditions beyond "scalar", one only needs to update the function isTooSmallOpForNNPA.
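
To make the idea concrete, here is a minimal, self-contained sketch of what such a predicate could look like. It is not the code from this PR: shapes are modeled as plain vectors instead of MLIR types, and the `Shape` alias, the `kDynamic` marker, and the `isScalarShape` helper are hypothetical names introduced for illustration. It also assumes that "scalar" means a statically known single element (rank 0 or all dimensions equal to 1) and that every input must be scalar for the op to be kept on the CPU; the actual function may define either differently.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a ranked tensor type: one extent per dimension.
// An empty vector models a rank-0 tensor.
using Shape = std::vector<int64_t>;

// Hypothetical marker for a dimension whose size is unknown at compile time.
constexpr int64_t kDynamic = -1;

// True when the shape statically holds exactly one element: rank 0, or every
// dimension equal to 1. A dynamic dimension is never equal to 1, so shapes
// with unknown extents are conservatively treated as non-scalar.
bool isScalarShape(const Shape &shape) {
  for (int64_t dim : shape)
    if (dim != 1)
      return false;
  return true;
}

// Sketch of the predicate named in the PR description (assumed semantics):
// an op is too small for NNPA when it has inputs and all of them are scalar.
bool isTooSmallOpForNNPA(const std::vector<Shape> &inputs) {
  if (inputs.empty())
    return false;
  for (const Shape &shape : inputs)
    if (!isScalarShape(shape))
      return false;
  return true;
}
```

With this shape, further conditions would be added inside `isTooSmallOpForNNPA` without touching the callers that decide whether to lower an op.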

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

AlexandreEichenberger

@tungld don't you think we need a more generic solution than just checking the scalar size for 2 operations? I.e. I would extend it to all ops. Maybe in the "isLegal"...

AlexandreEichenberger · 2026-02-13T16:48:40Z

I made a new PR #3396 that has just the update of the cost/perf model (not changing the default), and it has more support to respect CPU/NNPA decisions when transforming ONNX to NNPA, as there were rules that ignored CPU device placement.

I think you can then update the QualifyingOps default policy to assign scalar ops (or near-scalar ops, e.g., those with fewer than X data points) to the CPU, and the lowering will then respect those decisions.
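
The placement policy suggested here could be sketched roughly as follows. This is an illustration only, not the QualifyingOps code: the `Device` enum, `chooseDevice`, `staticElementCount`, the `Shape` alias, and the `kDynamic` marker are hypothetical names, and `minElements` plays the role of the unspecified threshold X. The sketch assumes an op is placed on the CPU only when every input has a statically known element count below the threshold, and that inputs with dynamic dimensions stay on NNPA.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a ranked tensor type; empty means rank 0.
using Shape = std::vector<int64_t>;

// Hypothetical marker for a dimension whose size is unknown at compile time.
constexpr int64_t kDynamic = -1;

enum class Device { CPU, NNPA };

// Number of elements if all dimensions are static, otherwise no value.
std::optional<int64_t> staticElementCount(const Shape &shape) {
  int64_t count = 1;
  for (int64_t dim : shape) {
    if (dim == kDynamic)
      return std::nullopt;
    count *= dim;
  }
  return count;
}

// Assumed policy: send the op to the CPU only when all inputs are known to
// hold fewer than minElements data points; otherwise keep it on NNPA.
Device chooseDevice(const std::vector<Shape> &inputs, int64_t minElements) {
  if (inputs.empty())
    return Device::NNPA;
  for (const Shape &shape : inputs) {
    std::optional<int64_t> count = staticElementCount(shape);
    if (!count || *count >= minElements)
      return Device::NNPA;
  }
  return Device::CPU;
}
```

Setting `minElements` to 2 reduces this to a pure scalar check; a larger value covers the "near scalar" case mentioned above.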

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

tungld · 2026-02-16T00:52:32Z

> @tungld don't you think we need a more generic solution than just checking the scalar size for 2 operations? I.e. I would extend it to all ops. Maybe in the "isLegal"...

This PR does support all ops. I have just added lit tests for all ops to show this clearly. (I didn't add scalar tests for matmul, rnn, lstm, gru, or conv: the code supports them, but I don't think we have matmul with scalar inputs.)

AlexandreEichenberger · 2026-02-17T14:31:48Z

@tungld what I meant is that in OnnxToZHIgh.td, there are many more patterns that re-introduce ZHigh ops than just the sqrt and inv patterns. I already added a mechanism for all of them in my PR. If we continue using the existing approach, namely placing non-beneficial (here, scalar) ops on the CPU by labeling them with device=CPU, then my PR already covers these cases.

tungld added 5 commits February 12, 2026 23:27

init

013de6d

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

add lit tests

c17c8ee

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

comments

c433aeb

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

add lit tests

e5dae9a

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

Merge branch 'main' into scalar-op-not-to-nnpa

d55c507

tungld mentioned this pull request Feb 13, 2026

Enabling of the Performance model for NNPA / z17 #3383

Open

tungld requested a review from AlexandreEichenberger February 13, 2026 05:59

fix numerical test for leakyrelu

cba1f72

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

AlexandreEichenberger reviewed Feb 13, 2026

Add lit tests for all supported ops

f32750e

Signed-off-by: Tung D. Le <tung@jp.ibm.com>

tungld wants to merge 7 commits into onnx:main from tungld:scalar-op-not-to-nnpa
