Pull requests: vllm-project/flash-attention (forked from Dao-AILab/flash-attention)

Add two-level accumulation for SM90 FP8 FWD to mitigate long-context accuracy drops
  #104 opened Oct 29, 2025 by jmkuebler

[Kernel] add attention sinks for flash attention2
  #103 opened Oct 19, 2025 by dudugong-gitch

feat: implement tree attention mask support for FlashAttention-2
  #81 opened Aug 15, 2025 by foolusion

Removed the assertion imposed on cu_seqlens_k and seqused_k
  #59 opened Mar 29, 2025 by chenyang78

Add back flash_attn_func api (and support FA3) [Don't Merge Yet]
  #40 opened Jan 26, 2025 by LucasWilkinson
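For context on PR #40, below is a minimal sketch of how flash_attn_func is typically called in the upstream flash-attn package that this repository forks; the tensor shapes shown match the upstream convention, but the exact signature exposed by this fork (and by the PR) may differ.

```python
# Hedged usage sketch of the upstream flash-attn API referenced by PR #40.
# Arguments beyond (q, k, v, causal) and the exact import path in this fork
# are assumptions and may not match what the PR actually restores.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 128

# flash-attn expects fp16/bf16 CUDA tensors of shape (batch, seqlen, nheads, headdim).
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Returns the attention output with the same (batch, seqlen, nheads, headdim) layout.
out = flash_attn_func(q, k, v, causal=True)
```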