Phase 1: Add new files from main (main-to-dev migration) #3968
Open
ilml wants to merge 25 commits into NVIDIA:dev
…NVIDIA#3570) New files:
- megatron/training/config/__init__.py
- megatron/training/config/common_config.py
- megatron/training/config/resilience_config.py
- megatron/training/config/training_config.py
…my ep cuda-graphed forward passes (NVIDIA#3525) New files: - tests/unit_tests/inference/test_batch_dimension_utils.py
…A#3058) New files: - tests/unit_tests/transformer/test_mup.py
…tron Bridge (NVIDIA#3018) New files:
- examples/gptoss/01_convert_from_hf.py
- examples/gptoss/02_train.sh
- examples/gptoss/03_convert_to_hf.py
New files: - tests/unit_tests/inference/contexts/test_dynamic_prefix_caching.py
New files:
- megatron/core/transformer/moe/token_dispatcher_inference.py
- tests/unit_tests/inference/test_moe_inference.py
…VIDIA#3665) New files: - tests/unit_tests/inference/test_dynamic_prefix_caching_coordinator.py
New files: - megatron/core/resharding/nvshmem_copy_service/compat.py
New files: - tools/trigger_internal_ci.py
…NVIDIA#3648) New files: - megatron/core/inference/text_generation_server/dynamic_text_gen_server/text_generation_server.py
…n Encoder (NVIDIA#3293) New files: - tests/unit_tests/transformer/test_vision_cuda_graphs.py
…3384) New files: - tests/unit_tests/fusions/test_rmsnorm_residual_fusion.py
…NVIDIA#2135) New files:
- megatron/core/models/mimo/partition/utils.py
- tests/unit_tests/models/test_mimo_partition.py
New files:
- megatron/core/resharding/transforms.py
- tests/unit_tests/resharding/test_mxfp8_refit.py
… to ModelOpt examples (NVIDIA#3805) New files: - examples/post_training/modelopt/conf/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16.sh
…layers (NVIDIA#3594) New files:
- megatron/core/ssm/ops/__init__.py
- megatron/core/ssm/ops/causal_conv1d_triton.py
- megatron/core/ssm/ops/mamba_ssm.py
…VIDIA#3817) New files: - tests/unit_tests/ssm/test_causal_conv1d_triton.py
…rmer-impl inference_optimized (NVIDIA#3851) New files: - megatron/core/inference/symmetric_memory.py
…ort w/ cuda graphed + inference_optimized MoEs (NVIDIA#3858) New files:
- megatron/core/inference/moe/__init__.py
- megatron/core/inference/moe/activations.py
- megatron/core/inference/moe/fused_moe.py
- megatron/core/inference/moe/pad.py
- megatron/core/inference/moe/permute.py
- megatron/core/inference/quantization/mxfp8_quantize.py
- tests/unit_tests/inference/test_moe_permute.py
- tests/unit_tests/inference/test_mxfp8_utils.py
New files: - tests/unit_tests/test_lion_optimizer.py
…edule (NVIDIA#3129) New files: - tests/unit_tests/pipeline_parallel/test_multimodule_schedules.py
…#3225) New files:
- megatron/core/inference/contexts/kv_block_allocator.py
- megatron/core/inference/contexts/mamba_slot_allocator.py
- megatron/core/ssm/ops/causal_conv1d_varlen.py
- megatron/core/ssm/ops/determinism.py
- megatron/core/ssm/ops/ssd_bmm.py
- megatron/core/ssm/ops/ssd_chunk_scan.py
- megatron/core/ssm/ops/ssd_chunk_state.py
- megatron/core/ssm/ops/ssd_combined.py
- megatron/core/ssm/ops/ssd_state_passing.py
- tests/unit_tests/inference/engines/test_mamba_prefix_caching_e2e.py
- tests/unit_tests/ssm/ops/test_causal_conv1d_varlen.py
- tests/unit_tests/ssm/ops/test_ops_init.py
- tests/unit_tests/ssm/ops/test_ssd_bmm.py
- tests/unit_tests/ssm/ops/test_ssd_chunk_scan.py
- tests/unit_tests/ssm/ops/test_ssd_chunk_state.py
- tests/unit_tests/ssm/ops/test_ssd_combined.py
- tests/unit_tests/ssm/ops/test_ssd_state_passing.py
- tests/unit_tests/ssm/ops/test_ssm_kernel.py
New files: - tests/unit_tests/rl/test_grouped_rollouts.py
Contributor, Author
/ok to test b18c7a6
These test files import from existing modules that are modified in Phase 2:
- test_rmsnorm_residual_fusion.py: imports TEFusedResidualRMSNorm (added in NVIDIA#3384)
- test_mup.py: imports get_mup_config_overrides (added in NVIDIA#3058)
- test_multimodule_schedules.py: imports MultiModuleProcessGroupCollection (added in NVIDIA#3129)

They will be re-added in Phase 2 when the corresponding code changes land.

Made-with: Cursor
These test files import symbols from existing modules that are only added in Phase 2 commits:
- test_dynamic_prefix_caching.py: PrefixCachingEvictionPolicy, HASH_PRIME
- test_mamba_prefix_caching_e2e.py: PrefixCachingEvictionPolicy
- test_dynamic_prefix_caching_coordinator.py: PrefixCachingCoordinatorPolicy
- test_moe_inference.py: are_tensors_nvls_eligible, InferenceTopKRouter
- test_grouped_rollouts.py: RolloutGroup, ReturnsRaw
- test_lion_optimizer.py: HAVE_LION
- test_vision_cuda_graphs.py: VisionTECudaGraphHelper, HAVE_TE_GRAPHS

They will be re-added in Phase 2 with their corresponding code changes.

Made-with: Cursor
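Both commits above drop tests for the same reason: a file added in Phase 1 can import a symbol that only exists once Phase 2 modifies an existing module. One way such files could be detected statically, without importing Megatron, is sketched below. This is not the tooling used for this PR; the helper names (`imported_names`, `missing_symbols`) are hypothetical, and the check only looks at absolute `from module import name` statements.

```python
import ast


def imported_names(source: str) -> dict[str, set[str]]:
    """Map each module in absolute `from module import a, b` statements to its imported names."""
    result: dict[str, set[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = {alias.name for alias in node.names if alias.name != "*"}
            result.setdefault(node.module, set()).update(names)
    return result


def _bound_names(statements) -> set[str]:
    """Collect names bound by module-level statements, descending into try/if blocks."""
    names: set[str] = set()
    for node in statements:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            names.update((a.asname or a.name).split(".")[0] for a in node.names)
        elif isinstance(node, ast.Try):
            # Flags such as HAVE_X are often set inside try/except ImportError.
            names |= _bound_names(node.body + node.orelse + node.finalbody)
            for handler in node.handlers:
                names |= _bound_names(handler.body)
        elif isinstance(node, ast.If):
            names |= _bound_names(node.body + node.orelse)
    return names


def missing_symbols(test_source: str, module_sources: dict[str, str]) -> dict[str, set[str]]:
    """Return {module: names} that the test imports but the module source does not define.

    Modules absent from module_sources (e.g. third-party packages) are skipped.
    """
    missing: dict[str, set[str]] = {}
    for module, names in imported_names(test_source).items():
        if module not in module_sources:
            continue
        absent = names - _bound_names(ast.parse(module_sources[module]).body)
        if absent:
            missing[module] = absent
    return missing
```

Here `module_sources` would hold the text of each existing module as it stands on `dev`; any test file for which `missing_symbols` returns a non-empty result is a candidate for deferral to Phase 2. The check is conservative in one direction only: names created dynamically (for example via `globals()` or `__getattr__`) would be reported as missing.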
Contributor, Author
/ok to test ab305c3
Summary
Phase 1 of the main-to-dev code migration. This PR adds 57 new files (+15,771 lines) that exist on main but not on dev. These are pure file additions with zero modifications to existing files, so there should be no merge conflicts.

New files include:
- megatron/training/config/ (4 files) -- config src refactor from Move config src files into a dedicated dir #3570
- megatron/core/inference/moe/ (5 files) -- inference-optimized MoEs from Add torch grouped gemm bf16 and mxfp8 support w/ cuda graphed + inference_optimized MoEs #3858
- megatron/core/ssm/ops/ (9 files) -- SSM/Mamba triton ops from Add speculative decoding support with MTP layers #3594, Inference | Hybrid prefix caching. #3225
- megatron/core/inference/contexts/ (2 files) -- KV/Mamba block allocators from Inference | Hybrid prefix caching. #3225
- megatron/core/inference/ misc (3 files) -- symmetric memory, mxfp8, text gen server
- megatron/core/resharding/ (2 files) -- MXFP8 refit transforms, nvshmem compat
- megatron/core/transformer/moe/token_dispatcher_inference.py -- inference dispatcher
- megatron/core/models/mimo/partition/utils.py -- Mimo partition utils
- tests/unit_tests/
- tools/trigger_internal_ci.py

Source commits on main: Each new file is extracted at the state it was first introduced by its original commit on main. The 57 files come from 23 distinct main-branch commits.
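The selection rule described in the summary (files present on main but absent from dev, each taken at the commit that first added it) can be sketched roughly as follows. This is a minimal sketch and not the script used to build this PR; the helper names are hypothetical, and it assumes a local checkout with `main` and `dev` refs available.

```python
import subprocess


def git(*args: str) -> str:
    """Run a git command in the current repository and return its stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout


def files_only_on(source_listing: str, target_listing: str) -> list[str]:
    """Return paths present in source_listing but absent from target_listing.

    Each listing is the newline-separated output of
    `git ls-tree -r --name-only <ref>`.
    """
    target = set(target_listing.splitlines())
    return sorted(p for p in source_listing.splitlines() if p and p not in target)


def introducing_commit(path: str, ref: str = "main") -> str:
    """Return the hash of the oldest commit reachable from `ref` that added `path`."""
    hashes = git("log", "--diff-filter=A", "--format=%H", ref, "--", path).split()
    if not hashes:
        raise ValueError(f"no commit on {ref} adds {path}")
    # git log prints newest first, so the last hash is the earliest addition.
    return hashes[-1]


def content_at_introduction(path: str, ref: str = "main") -> str:
    """Return the file content as of the commit that first added it."""
    return git("show", f"{introducing_commit(path, ref)}:{path}")
```

Under these assumptions, `files_only_on(git("ls-tree", "-r", "--name-only", "main"), git("ls-tree", "-r", "--name-only", "dev"))` would produce the candidate list, and `content_at_introduction` would give each file's Phase 1 content. Taking the file at its introducing commit rather than at the tip of main keeps later modifications out of Phase 1, so they can be replayed in later phases.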
Context
This is part of a larger main-to-dev migration (206 commits). The strategy is:
Test plan
Made with Cursor