### Description
Implements a special SGEMM path that uses GEMV kernels in cases
where M or N is 1.
Additionally, this PR introduces a microkernel interface that uses the
typedefs provided by KleidiAI. This simplifies the code and removes
constructs such as the ternary operations that chose between SME1 and
SME2 kernels.
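
As a rough illustration of the two ideas above, here is a minimal sketch, not the code in this PR. It assumes row-major, non-transposed operands with alpha = 1 and beta = 0. The names `GemvUKernel`, `GemvFn`, `SelectGemvUKernel`, `TryGemvPath`, and `GemvReference` are hypothetical and are not KleidiAI or MLAS APIs; a plain scalar loop stands in for the real SME1/SME2 kernels.

```cpp
#include <cstddef>

// Hypothetical GEMV microkernel signature (an assumption for this sketch):
//   y[n] = sum over k of x[k] * mat[k * stride_k + n * stride_n]
using GemvFn = void (*)(const float* x, const float* mat, float* y,
                        std::size_t K, std::size_t N,
                        std::size_t stride_k, std::size_t stride_n);

// Hypothetical microkernel table: the variant is picked once, so call sites
// do not need a per-call "SME1 vs SME2" ternary.
struct GemvUKernel {
    const char* name;
    GemvFn run;
};

// Scalar stand-in for a real optimized kernel.
static void GemvReference(const float* x, const float* mat, float* y,
                          std::size_t K, std::size_t N,
                          std::size_t stride_k, std::size_t stride_n) {
    for (std::size_t n = 0; n < N; ++n) {
        float acc = 0.0f;
        for (std::size_t k = 0; k < K; ++k) {
            acc += x[k] * mat[k * stride_k + n * stride_n];
        }
        y[n] = acc;
    }
}

// Both entries point at the reference loop here; a real build would
// register distinct SME1 and SME2 kernels.
inline const GemvUKernel& SelectGemvUKernel(bool has_sme2) {
    static const GemvUKernel sme1{"sme1-placeholder", GemvReference};
    static const GemvUKernel sme2{"sme2-placeholder", GemvReference};
    if (has_sme2) {
        return sme2;
    }
    return sme1;
}

// C (M x N) = A (M x K) * B (K x N), all row-major and contiguous.
// Returns true if the GEMV path handled the call, false if the caller
// should fall back to the general GEMM path.
inline bool TryGemvPath(const GemvUKernel& uk,
                        const float* A, const float* B, float* C,
                        std::size_t M, std::size_t N, std::size_t K) {
    if (M == 1) {
        // Row vector A (length K) times matrix B: N outputs.
        uk.run(A, B, C, K, N, /*stride_k=*/N, /*stride_n=*/1);
        return true;
    }
    if (N == 1) {
        // Matrix A times column vector B (length K): M outputs.
        // Element (m, k) of A sits at A[m * K + k].
        uk.run(B, A, C, K, M, /*stride_k=*/1, /*stride_n=*/K);
        return true;
    }
    return false;
}
```

A caller would select the table once (for example `const GemvUKernel& uk = SelectGemvUKernel(has_sme2);`) and then try `TryGemvPath` before the general GEMM path. Real kernels may also need packing or alignment steps that this sketch leaves out.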
### Indicative Performance
In lieu of any production models in which GEMV is a large contributor
to the network, I created a mini model for testing. It contains
thousands of randomized matmul variants, with GEMV cases distributed
throughout.
<img width="1572" height="148" alt="image (6)"
src="https://github.com/user-attachments/assets/451441e4-df5b-42d1-8c6e-ec8dd14161e6"
/>
Using the onnxruntime perf test, I was able to halve the total inference
time compared with MLAS on this model.
<img width="1200" height="900"
alt="ort_ops_compare_gemv_no_2025-10-07_19-40-30_vs_gemv_2025-10-07_19-40-58"
src="https://github.com/user-attachments/assets/ddef3bf3-796c-4f58-8712-361510e2a901"
/>
**_More Benchmarks to come shortly_**
---------
Signed-off-by: Jonathan Clohessy <Jonathan.Clohessy@arm.com>
Signed-off-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Hariharan Seshadri <shariharan91@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>