Warp Specialization HGemm

0x00 Overview

This directory contains:

  • ws_hgemm_naive_cute_kernel

Test

python3 ws_hgemm.py

Output:

--------------------------------------------------------------------------------
out_ws_hgemm_naive_cute: [4096.0, 4096.0, 4096.0], time:3.71974587ms
out_f16_th: [4096.0, 4096.0, 4096.0], time:5.05561471ms
--------------------------------------------------------------------------------
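
To put the two timings above in perspective, the relative speedup can be computed directly from the reported numbers. The snippet below is a minimal sketch: the helper names `speedup` and `time_reduction_pct` are hypothetical and are not part of `ws_hgemm.py`, and it assumes that `out_f16_th` is the baseline the kernel is compared against.

```python
# Timings copied from the output above, in milliseconds.
WS_HGEMM_MS = 3.71974587   # out_ws_hgemm_naive_cute
BASELINE_MS = 5.05561471   # out_f16_th (assumed here to be the reference baseline)


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_ms <= 0:
        raise ValueError("candidate_ms must be positive")
    return baseline_ms / candidate_ms


def time_reduction_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Return the percentage of baseline time saved by the candidate."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (1.0 - candidate_ms / baseline_ms) * 100.0


if __name__ == "__main__":
    print(f"speedup: {speedup(BASELINE_MS, WS_HGEMM_MS):.2f}x")
    print(f"time saved: {time_reduction_pct(BASELINE_MS, WS_HGEMM_MS):.1f}%")
```

For this single run the ratio is about 1.36x. Timings from one run can vary with GPU, problem size, and warm-up, so the figure is indicative rather than a general result.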