We would like to cover more operators from the following sources:

1. Tilelang benchmark: https://github.com/tile-ai/tilelang/tree/main/examples, deepseek_deepgemm, deepseek_mla, linear_attention
   - [x] gemm_sm100
   - [ ] deepseek_deepgemm
   - [ ] deepseek_mla
   - [x] linear_attention: https://github.com/meta-pytorch/tritonbench/pull/567/files
3. Mojo mha_sm100: https://github.com/modular/modular/blob/main/max/kernels/src/nn/mha_sm100.mojo
5. vLLM custom Triton kernels from MoE