I think there's already an issue about this but I can't find it. Anyways, a couple more links about tensor cores.

Links we already had:

* https://siboehm.com/articles/22/Fast-MMM-on-CPU
* https://siboehm.com/articles/22/CUDA-MMM
* https://seb-v.github.io/optimization/update/2025/01/20/Fast-GPU-Matrix-multiplication.html

New links:

* https://alexarmbr.github.io/2024/08/10/How-To-Write-A-Fast-Matrix-Multiplication-From-Scratch-With-Tensor-Cores.html
* https://cudaforfun.substack.com/p/outperforming-cublas-on-h100-a-worklog