Commit 6b08c3e
chore(beep boop 🤖): Bump uv.lock (r0.2.0) (2025-12-18)
Signed-off-by: Oliver Koenig <[email protected]>
1 parent 1c43b39
2 files changed, +808 -637 lines changed
Submodule Megatron-LM updated 36 files:
- .gitlab/stages/04.functional-tests.yml (+1 -1)
- docs/source/api-guide/fine_grained_activation_offloading.md (+31)
- docs/source/api-guide/index.rst (+1)
- docs/source/images/fine_grained_activation_offloading/offloading_and_recomputing.png
- megatron/core/extensions/transformer_engine.py (+35 -1)
- megatron/core/models/common/model_chunk_schedule_plan.py (+8 -1)
- megatron/core/models/gpt/fine_grained_callables.py (+18 -5)
- megatron/core/models/gpt/gpt_model.py (+28 -1)
- megatron/core/optimizer/qk_clip.py (+39)
- megatron/core/pipeline_parallel/fine_grained_activation_offload.py (+609)
- megatron/core/pipeline_parallel/schedules.py (+13 -1)
- megatron/core/tensor_parallel/random.py (+10 -3)
- megatron/core/transformer/attention.py (+160 -14)
- megatron/core/transformer/moe/README.md (+14)
- megatron/core/transformer/moe/experts.py (+53 -12)
- megatron/core/transformer/multi_latent_attention.py (+150 -10)
- megatron/core/transformer/multi_token_prediction.py (+6 -1)
- megatron/core/transformer/transformer_block.py (+9 -1)
- megatron/core/transformer/transformer_config.py (+63 -1)
- megatron/core/transformer/transformer_layer.py (+51 -9)
- megatron/training/arguments.py (+34)
- megatron/training/training.py (+18 -2)
- tests/functional_tests/test_cases/gpt/gpt3_weekly_dgx_h100_mcore_tp2_pp2_current_scaling_native_fp8_tp_pp_sp_tp_overlap/golden_values_dev_dgx_h100.json (+10.0k)
- tests/functional_tests/test_cases/gpt/gpt3_weekly_dgx_h100_mcore_tp2_pp2_current_scaling_native_fp8_tp_pp_sp_tp_overlap/model_config.yaml (+1 -1)
- tests/functional_tests/test_cases/mixtral/deepseekv3_proxy_flex_tp1pp4emp16etp1cp1_release_sm/golden_values_dev_dgx_h100.json (+11.5k)
- tests/functional_tests/test_cases/mixtral/deepseekv3_proxy_flex_tp1pp4emp16etp1cp1_release_sm/model_config.yaml (+166)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_fine_grained_offloading/golden_values_dev_dgxh100_coreweave.json (+344)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_fine_grained_offloading/golden_values_dev_dgxh100_eos.json (+344)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_fine_grained_offloading/model_config.yaml (+139)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_no_mtp_no_a2a_ovlp_fine_grained_offloading/golden_values_dev_dgxh100_coreweave.json (+287)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_no_mtp_no_a2a_ovlp_fine_grained_offloading/golden_values_dev_dgxh100_eos.json (+287)
- tests/functional_tests/test_cases/moe/gpt3_moe_mcore_te_tp2_pp2_ep4_etp1_no_mtp_no_a2a_ovlp_fine_grained_offloading/model_config.yaml (+134)
- tests/test_utils/recipes/moe.yaml (+10)
- tests/unit_tests/pipeline_parallel/test_fine_grained_activation_offloading.py (+187)
- tests/unit_tests/transformer/test_attention.py (+186)
- tests/unit_tests/transformer/test_multi_latent_attention.py (+228)