Description
Benchmark target numbers
First up, some assumptions and theoretical limits. To do a 512x512x512 matmul on a single core:
phoenix, bf16, with clock at 1.6e9 (see #1167 (comment) and Xilinx/mlir-aie#2017 (comment)):
512*512*512 / (4*4*8 * 1.6e9) = 655 microseconds
strix, i8, with clock at 1e9:
512*512*512 / (8*8*8 * 1.0e9) = 262 microseconds
To do the above 100 times (this is what the benchmark does; see here):
phoenix : 65'500 [us]
strix: 26'200 [us]
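As a cross-check, here is a minimal Python sketch of the arithmetic above (the MACs-per-cycle throughputs and clock rates are the assumptions stated above; the small difference from the rounded 65'500/26'200 figures comes from multiplying before rounding):

```python
# Theoretical lower bound for 100 runs of a 512x512x512 matmul on one AIE core.
M = N = K = 512
RUNS = 100

# (MACs per cycle, clock in Hz), per the assumptions above.
targets = {
    "phoenix, bf16": (4 * 4 * 8, 1.6e9),
    "strix, i8": (8 * 8 * 8, 1.0e9),
}

for name, (macs_per_cycle, clock_hz) in targets.items():
    t_us = (M * N * K) / (macs_per_cycle * clock_hz) * 1e6
    print(f"{name}: {t_us:.0f} us per matmul, {t_us * RUNS:,.0f} us for {RUNS} runs")
```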
Benchmark performance:
Looking at https://nod-ai.github.io/iree-amd-aie/results_history_npu1.html we see:
phoenix, direct codegen: 227'000 [us] : 29% of peak.
phoenix, direct codegen, with unroll and jam: 172'000 [us] : 38% of peak (see #1167)
phoenix, chess ukernel: 152'000 [us]: 43% of peak
strix, https://nod-ai.github.io/iree-amd-aie/results_history_npu4.html:
chess ukernel: 76'000 [us] : 34% of peak
peano ukernel: 97'000 [us] : 27% of peak
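The percentages are just the theoretical floor divided by the measured time; the same sketch extended to the measured numbers (times copied from the results-history pages linked above):

```python
# Percent of theoretical peak = (peak time for 100 matmuls) / (measured time).
peak_us = {"phoenix": 65_536, "strix": 26_214}

measured_us = [
    ("phoenix", "direct codegen", 227_000),
    ("phoenix", "direct codegen + unroll and jam", 172_000),
    ("phoenix", "chess ukernel", 152_000),
    ("strix", "chess ukernel", 76_000),
    ("strix", "peano ukernel", 97_000),
]

for target, variant, t_us in measured_us:
    print(f"{target}, {variant}: {100 * peak_us[target] / t_us:.0f}% of peak")
```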
Summary
- If my analysis is correct, microkernel performance needs improvement: bf16 on phoenix should achieve better than 43% of peak, and similarly for the other configurations.
- With unroll and jam on phoenix, direct codegen comes within ~13% of ukernel performance (172'000 vs 152'000 [us]); it should also be improved.