Add FA4 monkey-patch path for low-precision attention
Adds FlashAttention 4 backend support for the monkey-patch SDPA path.
FA4 supports both Hopper (SM 9.x) and Blackwell (SM 10.x) hardware
via the flash_attn.cute.interface package.
Key additions:
- FP8_FA4 enum value in AttentionBackend
- _is_blackwell(), _is_fa4_available() hardware/library checks
- FA4 dispatch in apply_low_precision_attention
- fp8_fa4/ directory with fp8_fa4_sdpa entry point
- FA4 backend config in test suite with eager probe guard
- RoPE fusion placeholder (fuse_rope=True raises NotImplementedError)
ghstack-source-id: 05d86df
Pull-Request: pytorch#3960