flash attention FA4 blackwell on sm120? #10564
voipmonitor started this conversation in General
            Replies: 1 comment
Reply:
only sm100
Original post:
Hello,
is the new FA4 compatible with sm120 (RTX 6000 PRO, 5090, etc.)?
This is the PR: #9928
python3 -m sglang.launch_server \
  --model-path nvidia/DeepSeek-V3-0324-FP4 \
  --tp 4 --attention-backend trtllm_mla \
  --moe-runner-backend flashinfer_trtllm \
  --quantization modelopt_fp4 \
  --speculative-algorithm EAGLE --speculative-num-steps 3 \
  --speculative-eagle-topk 1 --speculative-num-draft-tokens 4 \
  --prefill-attention-backend fa4 --speculative-attention-mode decode
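A minimal pre-launch sanity check one could run before passing `--prefill-attention-backend fa4`, based only on the "only sm100" reply in this thread (the helper and the supported-set are assumptions, not SGLang API; sm100 corresponds to compute capability 10.0, sm120 to 12.0):

```python
# Hypothetical guard based on the "only sm100" reply above: FA4 is assumed
# to support compute capability 10.0 (sm100, datacenter Blackwell) but not
# 12.0 (sm120, e.g. RTX 6000 PRO / RTX 5090).

FA4_SUPPORTED_CAPABILITIES = {(10, 0)}  # sm100 only, per the reply (assumption)

def fa4_supported(capability: tuple) -> bool:
    """Return True if a GPU with this compute capability can use FA4."""
    return tuple(capability) in FA4_SUPPORTED_CAPABILITIES

if __name__ == "__main__":
    # On a CUDA machine you could query the real capability instead:
    #   import torch
    #   capability = torch.cuda.get_device_capability(0)
    for cap, name in [((10, 0), "sm100"), ((12, 0), "sm120")]:
        print(name, "->", "FA4 ok" if fa4_supported(cap) else "not supported")
```

If the check fails, dropping `--prefill-attention-backend fa4` and letting the server fall back to the default attention backend would be the safer choice.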