Pull requests: NVIDIA/Megatron-LM
[Dev] Remove calculation of padding token in moe routing loss
    #2121 opened Nov 4, 2025 by HaochenYuan
    6 tasks

Fix runaway Etpt in straggler detector by resetting FLOPs accumulator
    Labels: bug (Something isn't working); Expert Review (Apply this label to indicate that your PR is ready for expert review.)
    #2119 opened Nov 4, 2025 by sbhavani

remove training dependency from megatron core for fsdp checkpoint with EP
    Labels: core_r0.15.0; Expert Review

Refactor model_provider to model_builder format for ModelOpt examples
    Labels: Expert Review; Run tests

Tensorize dynamic inference mixed sampling
    Labels: Expert Review; Run functional tests (Trains for 50-100 steps and tests against golden values); Run tests

multi thread read full parallel save ckpt
    #2104 opened Nov 3, 2025 by 861482002
    6 tasks done

Add router replay for MoE models
    Labels: module: moe
    #2101 opened Nov 3, 2025 by litianjian
    6 tasks

ci: Run functional tests
    Labels: Run functional tests

[Dev] [Draft] FP8 params support for megatron-fsdp
    #2086 opened Nov 2, 2025 by kunlunl
    6 tasks

Fix ambiguous tensor truth-value check in train_rl.loss_func (use .it…
    Labels: Expert Review
    #2085 opened Nov 2, 2025 by vignesh1507