[NNPA] Do not lower ONNX ops whose inputs are scalar to ZHigh#3393
[NNPA] Do not lower ONNX ops whose inputs are scalar to ZHigh#3393
Conversation
Signed-off-by: Tung D. Le <tung@jp.ibm.com>
Signed-off-by: Tung D. Le <tung@jp.ibm.com>
Signed-off-by: Tung D. Le <tung@jp.ibm.com>
AlexandreEichenberger
left a comment
There was a problem hiding this comment.
@tungld don't you think we need a more generic solution than just checking the scalar size for 2 operations? I.e. I would extend it to all ops. Maybe in the "isLegal"...
|
I made a new PR #3396 that has just the update of the cost/perf model (not changing the default), and it has more support to respect CPU/NNPA decisions when transforming ONNX to NNPA, as there were rules that ignored CPU device placement. I think you can then update the |
Signed-off-by: Tung D. Le <tung@jp.ibm.com>
This PR indeed supports all ops. I just added lit tests for all ops to show it clearly (I didn't add scalar tests for matmul, rnn, lstm, gru, conv though the code supports but I don't think we have matmul with scalar inputs). |
|
@tungld what I ment is that in OnnxToZHIgh.td, there are many more patterns that re-introduce ZHigh ops than just the sqrt and inv patterns. I added a mechanism for all of them in my PR already. If we continue using the existing pattern, namely to place non-beneficial (aka here scalar) ops by labeling them with the |
This PR is to not lower ONNX ops whose inputs are scalar to ZHigh so that they run on CPU. To add more conditions rather than "scalar", there only needs to update the function
isTooSmallOpForNNPAwith more conditions.