Pull requests: NVIDIA/Megatron-LM

Pull requests list

Cherry-pick PR#1916 to core_dev_r0.15.0
#2120 opened Nov 4, 2025 by BestJuly
6 tasks
Fix runaway Etpt in straggler detector by resetting FLOPs accumulator (labels: bug, Expert Review)
#2119 opened Nov 4, 2025 by sbhavani
[Dev] Fixes for gpt-oss
#2116 opened Nov 3, 2025 by cuichenx Queued
6 tasks
Core 0.16
[Dev] Nemotron nano v2 vl
#2115 opened Nov 3, 2025 by cuichenx Queued
6 tasks
Core 0.16
remove training dependency from megatron core for fsdp checkpoint with EP (labels: core_r0.15.0, Expert Review)
#2113 opened Nov 3, 2025 by ananthsub
3 of 6 tasks
Core 0.15
Refactor Attention Metadata to Separate Classes
#2112 opened Nov 3, 2025 by kanz-nv
6 tasks
Core 0.16
Update README.md
#2111 opened Nov 3, 2025 by mvirts
6 tasks
Refactor model_provider to model_builder format for ModelOpt examples (labels: Expert Review, Run tests)
#2107 opened Nov 3, 2025 by AAnoosheh
2 of 6 tasks
Core 0.16
Tensorize dynamic inference mixed sampling (labels: Expert Review, Run functional tests, Run tests)
#2105 opened Nov 3, 2025 by tdene
6 tasks
Core 0.16
multi thread read full parallel save ckpt
#2104 opened Nov 3, 2025 by 861482002
6 tasks done
Add router replay for MoE models (labels: module: moe)
#2101 opened Nov 3, 2025 by litianjian
6 tasks
ci: Run functional tests (labels: Run functional tests)
#2100 opened Nov 3, 2025 by ko3n1g
6 tasks
Core 0.16
Ko3n1g/chore/update dev release settings
#2099 opened Nov 3, 2025 by ko3n1g
6 tasks
Core 0.16
Remove redundant reduce in aux_loss logging
#2095 opened Nov 3, 2025 by BestJuly
6 tasks
Core 0.16
[Dev] Remove redundant reduce in aux_loss logging
#2094 opened Nov 3, 2025 by BestJuly
6 tasks
Core 0.16
chore: Merge main into dev
#2093 opened Nov 3, 2025 by chtruong814
6 tasks
Core 0.16
[Dev] Changes to support multimodule pipelining
#2092 opened Nov 3, 2025 by yashaswikarnati
2 of 6 tasks
Core 0.16
Add repr to pg collection class
#2089 opened Nov 2, 2025 by yashaswikarnati
6 tasks
Core 0.16
fp8 param cuda graph support main
#2088 opened Nov 2, 2025 by kunlunl
6 tasks
[dev] fp8 param cuda graph support (labels: bug)
#2087 opened Nov 2, 2025 by kunlunl
6 tasks
Core 0.16
[Dev] [Draft] FP8 params support for megatron-fsdp
#2086 opened Nov 2, 2025 by kunlunl
6 tasks
Fix ambiguous tensor truth-value check in train_rl.loss_func (use .it… (labels: Expert Review)
#2085 opened Nov 2, 2025 by vignesh1507