
Add SM120 Gluon flash attention (TMA + MMAv2)#8

Open
blake-snc wants to merge 1 commit into gpu-mode:main from blake-snc:add/flash-attention-sm120-gluon

Conversation

@blake-snc

Summary

  • Adds index entry for SM120 Gluon flash attention kernel — first open-source Gluon FA targeting SM120 GPUs (DGX Spark, RTX 5090, GB10)
  • Uses TMA for all data movement + MMAv2 tensor cores (SM120 lacks tcgen05/WGMMA)
  • Supports BF16 and FP8 (E5M2), causal and non-causal
  • Kernel source submitted to triton-lang/kernels as PR #20
  • Also adds missing entry 15 (GPTQ Triton) to kernel_overview.md

Contributed by Second Nature Computing (https://joinsecondnature.com)

🤖 Generated with Claude Code
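The kernel source itself lives in triton-lang/kernels PR #20 and is not inlined here. As context for reviewers, the core of any flash attention kernel, including this one, is the tiled online-softmax recurrence: attention is computed block by block, carrying a running row maximum and normalizer so the full score matrix never materializes. Below is a plain NumPy sketch of that recurrence; the function name, block size, and single-head shapes are illustrative choices of ours, not taken from the kernel.

```python
import numpy as np

def flash_attention_ref(q, k, v, block=16, causal=False):
    """Reference tiled attention with online softmax.

    q, k, v: (seq, d) arrays for a single head. Processes K/V in
    blocks, rescaling the running accumulator whenever the row
    maximum grows, so only O(block * d) scores are live at once.
    """
    seq, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros((seq, d), dtype=np.float64)
    for i0 in range(0, seq, block):
        qi = q[i0:i0 + block].astype(np.float64) * scale
        rows = qi.shape[0]
        m = np.full(rows, -np.inf)        # running row max
        l = np.zeros(rows)                # running softmax denominator
        acc = np.zeros((rows, d))         # unnormalized output accumulator
        for j0 in range(0, seq, block):
            kj = k[j0:j0 + block].astype(np.float64)
            vj = v[j0:j0 + block].astype(np.float64)
            s = qi @ kj.T                 # block of attention scores
            if causal:
                ri = np.arange(i0, i0 + rows)[:, None]
                cj = np.arange(j0, j0 + kj.shape[0])[None, :]
                s = np.where(ri >= cj, s, -np.inf)
            m_new = np.maximum(m, s.max(axis=1))
            p = np.exp(s - m_new[:, None])        # rescaled probabilities
            alpha = np.exp(m - m_new)             # correction for old max
            l = l * alpha + p.sum(axis=1)
            acc = acc * alpha[:, None] + p @ vj
            m = m_new
        out[i0:i0 + rows] = acc / l[:, None]
    return out
```

In the actual SM120 kernel, the Q/K/V tiles would be staged through shared memory via TMA and the two matmuls per inner iteration would run on MMAv2 tensor cores; the NumPy version only shows the math being computed.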


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
