Add Flash FMHA/FNA backend#278

Open
AdityaKane2001 wants to merge 23 commits into SHI-Labs:main from AdityaKane2001:aditya/flash-na

Conversation

@AdityaKane2001 (Contributor) commented Oct 30, 2025

TL;DR: Ported the Flash Attention CUTLASS 3.x kernels to NATTEN as-is, with wrappers to fit them into NATTEN. Exposed through the flash-fmha backend.

Summary of changes:

  • C++:

    • Added actual kernel files under csrc/include/natten/cuda/flash_fmha/flash_kernel.
    • Torch C++ interface at csrc/src/flash_fmha.cu, which calls into csrc/.../flash_fmha/flash_fmha_{forward/backward}.cuh
    • Added a utility file csrc/.../flash_kernel/param_utils.h for param conversion.
  • Python:

    • Exposed the C++ functions through the flash-fmha backend (see the usage sketch after this list).
    • Added an autograd function and configs for the same.
    • Wherever possible, the code is arranged so that a flash-fna backend can be implemented later.
    • Added autogen scripts and tests for flash-fmha.
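
What selecting the new backend might look like from Python, as a hedged sketch: the natten.attention entry point, the tensor layout, and the per-call backend keyword are assumptions modeled on NATTEN's existing backend-selection pattern; the PR only states that the kernels are exposed as the flash-fmha backend.

```python
# Hypothetical usage sketch -- natten.attention and the backend kwarg are
# assumptions; this PR only says the kernels are exposed as "flash-fmha".
import torch
from natten import attention  # assumed entry point for FMHA-style attention

B, L, H, D = 2, 1024, 8, 64  # batch, sequence length, heads, head dim
q = torch.randn(B, L, H, D, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Assumed: the backend is chosen per call, mirroring NATTEN's existing
# backend names (e.g. "cutlass-fmha").
out = attention(q, k, v, backend="flash-fmha")
```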

Present rough edges:

  1. The Python frontend might have some code style inconsistencies.
  2. Stray template parameters for the flash backward template are currently housed in flash_fmha_backward.cuh rather than generated by autogen; adding them to the autogen scripts would make those scripts overly complex.
  3. Although correctness is verified by the tests, no particular refactoring of the actual flash attention kernel code was done.
  4. Important: Flash FMHA does not support the causal argument. Please use Flash FNA 1D for causal attention instead (see the sketch after this list).
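
A minimal sketch of that workaround, assuming NATTEN's na1d signature (kernel_size, is_causal) and its heads-last tensor layout; routing this call through the flash-fna backend is the intent stated above, but the exact selection mechanism is not shown here.

```python
# Sketch: causal attention via 1-D neighborhood attention. Assumes NATTEN's
# documented na1d signature and (batch, seqlen, heads, head_dim) layout.
import torch
from natten.functional import na1d

B, L, H, D = 2, 1024, 8, 64
q = torch.randn(B, L, H, D, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# A causal window spanning the entire sequence reduces 1-D neighborhood
# attention to standard causal self-attention; the flash-fna backend would
# serve this call once selected.
out = na1d(q, k, v, kernel_size=L, is_causal=True)
```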

@AdityaKane2001 (Contributor, Author)

Note to self: See comment in #282 when dealing with dilation.

@AdityaKane2001 (Contributor, Author)

Update: Flash FNA forward has been added as well.

Like flash-fmha, it is exposed through the flash-fna backend. Only the forward pass is supported for now; the backward pass will be ported next.
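
A sketch of exercising the new backend for 2-D neighborhood attention. The na2d call follows NATTEN's documented API; the per-call backend keyword and its acceptance of "flash-fna" are assumptions based on this PR's description.

```python
# Hypothetical sketch -- the "flash-fna" name comes from this PR, but the
# per-call backend selection mechanism is assumed, not shown in the PR.
import torch
from natten.functional import na2d

B, X, Y, H, D = 2, 64, 64, 8, 64  # batch, height, width, heads, head dim
q = torch.randn(B, X, Y, H, D, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

with torch.no_grad():  # backward for flash-fna is not ported yet
    out = na2d(q, k, v, kernel_size=(7, 7), backend="flash-fna")
```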

@AdityaKane2001 (Contributor, Author)

Checklist of features still required in Flash (see the shape sketch after this list):

  • Different head dims for QK and V
  • Arbitrary head dims
  • GQA/MQA
  • Varlen (for FMHA)
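
A plain-PyTorch illustration of what the first three items mean shape-wise; this is not NATTEN code, just a reference using torch's scaled_dot_product_attention.

```python
# Illustrates GQA/MQA and differing QK/V head dims with plain PyTorch.
import torch
import torch.nn.functional as F

B, L = 2, 1024
H_q, H_kv = 8, 2       # GQA: more query heads than KV heads (MQA: H_kv == 1)
D_qk, D_v = 192, 128   # different (and arbitrary) head dims for QK vs. V

q = torch.randn(B, H_q, L, D_qk)
k = torch.randn(B, H_kv, L, D_qk)
v = torch.randn(B, H_kv, L, D_v)

# GQA: each group of query heads shares one KV head; expanding the KV heads
# recovers plain multi-head attention shapes.
k = k.repeat_interleave(H_q // H_kv, dim=1)
v = v.repeat_interleave(H_q // H_kv, dim=1)

# SDPA allows V's head dim to differ from QK's; the output takes V's dim.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 1024, 128])
```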

@AdityaKane2001 changed the title from "Add Flash FMHA backend" to "Add Flash FMHA/FNA backend" on Jan 16, 2026
@AdityaKane2001 (Contributor, Author)

@alihassanijr

Rebased onto main. All tests pass.
