Phase 1: Add new files from main (main-to-dev migration)#3968

Open
ilml wants to merge 25 commits into NVIDIA:dev from ilml:main2dev

ilml commented Mar 20, 2026

Summary

Phase 1 of the main-to-dev code migration. This PR adds 57 new files (+15,771 lines) that exist on main but not on dev. These are pure file additions with zero modifications to existing files, so there should be no merge conflicts.
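
As an illustration of how the "exists on main but not on dev" set could be computed, here is a minimal sketch. It is not the tooling used for this PR. It assumes both branches are available locally under the names `main` and `dev`, and the helper names `list_tracked_files` and `files_only_on` are hypothetical.

```python
import subprocess


def list_tracked_files(ref: str, repo: str = ".") -> set[str]:
    """Return the paths tracked at `ref`, as reported by `git ls-tree`."""
    out = subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "--name-only", ref],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return set(out.splitlines())


def files_only_on(source: set[str], target: set[str]) -> list[str]:
    """Paths present in `source` but absent from `target`, sorted."""
    return sorted(source - target)


if __name__ == "__main__":
    # Assumption: run from inside a clone that has both branches.
    for path in files_only_on(list_tracked_files("main"), list_tracked_files("dev")):
        print(path)
```

Because the result contains only paths that dev does not track, adding those files cannot touch any file that already exists on dev.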

The new files are listed per source commit in the commit history below.

Source commits on main: Each new file is extracted at the state it was first introduced by its original commit on main. The 57 files come from 23 distinct main-branch commits.
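
Extracting a file "at the state it was first introduced" can be sketched roughly as follows. This is an assumption about one workable approach, not a record of what was actually run: it finds the oldest commit on `main` that added the path (`git log --diff-filter=A`) and reads the blob at that commit (`git show <sha>:<path>`). The helper names are hypothetical, and renames are not followed.

```python
import subprocess


def introducing_commit_args(path: str, branch: str = "main") -> list[str]:
    """Arguments for `git log` that list only the commits which added `path`."""
    return ["log", "--diff-filter=A", "--format=%H", branch, "--", path]


def oldest_commit(log_output: str) -> str:
    """`git log` prints newest first, so the first introduction is the last hash."""
    hashes = log_output.split()
    if not hashes:
        raise ValueError("no commit adds this path on the given branch")
    return hashes[-1]


def show_args(sha: str, path: str) -> list[str]:
    """Arguments for `git show` that print the file content at commit `sha`."""
    return ["show", f"{sha}:{path}"]


def extract_at_introduction(path: str, branch: str = "main", repo: str = ".") -> bytes:
    """Return the raw bytes of `path` as of the commit that first added it."""
    log = subprocess.run(
        ["git", "-C", repo, *introducing_commit_args(path, branch)],
        check=True, capture_output=True, text=True,
    ).stdout
    sha = oldest_commit(log)
    return subprocess.run(
        ["git", "-C", repo, *show_args(sha, path)],
        check=True, capture_output=True,
    ).stdout
```

Taking the last hash matters when a file was deleted and later re-added on main, since `--diff-filter=A` then reports more than one commit.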

Context

This is part of a larger main-to-dev migration (206 commits). The strategy is:

  1. Phase 1 (this PR): Add new files -- conflict-free, pure additions
  2. Phase 2 (follow-up): Cherry-pick all 131 code commits -- new-file portions will auto-merge since they are already in place, reducing conflict surface to existing-file modifications only

Test plan

  • No existing tests should break (pure additions, no existing file modifications)
  • CI passes
  • New test files are syntactically valid (will be functionally tested in Phase 2 when corresponding code changes land)
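
The "syntactically valid" check in the last item could be done with Python's standard `ast` module, as in the sketch below. The function name `syntax_error_in` is hypothetical, and this is only one possible way to run such a check.

```python
import ast
from typing import Optional


def syntax_error_in(source: str, filename: str = "<string>") -> Optional[str]:
    """Return a short error description if `source` does not parse, else None."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


if __name__ == "__main__":
    import sys

    # Usage: python check_syntax.py path/to/test_a.py path/to/test_b.py
    failed = False
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            error = syntax_error_in(handle.read(), filename=path)
        if error:
            print(error)
            failed = True
    sys.exit(1 if failed else 0)
```

A parse-only check does not execute imports, so a test file can pass it and still fail at collection time if it imports a symbol that is not yet defined.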

Made with Cursor

ilml added 23 commits March 20, 2026 18:29
…NVIDIA#3570)

New files:
  - megatron/training/config/__init__.py
  - megatron/training/config/common_config.py
  - megatron/training/config/resilience_config.py
  - megatron/training/config/training_config.py
…my ep cuda-graphed forward passes (NVIDIA#3525)

New files:
  - tests/unit_tests/inference/test_batch_dimension_utils.py
…A#3058)

New files:
  - tests/unit_tests/transformer/test_mup.py
…tron Bridge (NVIDIA#3018)

New files:
  - examples/gptoss/01_convert_from_hf.py
  - examples/gptoss/02_train.sh
  - examples/gptoss/03_convert_to_hf.py
New files:
  - tests/unit_tests/inference/contexts/test_dynamic_prefix_caching.py
New files:
  - megatron/core/transformer/moe/token_dispatcher_inference.py
  - tests/unit_tests/inference/test_moe_inference.py
…VIDIA#3665)

New files:
  - tests/unit_tests/inference/test_dynamic_prefix_caching_coordinator.py
New files:
  - megatron/core/resharding/nvshmem_copy_service/compat.py
…NVIDIA#3648)

New files:
  - megatron/core/inference/text_generation_server/dynamic_text_gen_server/text_generation_server.py
…n Encoder (NVIDIA#3293)

New files:
  - tests/unit_tests/transformer/test_vision_cuda_graphs.py
…3384)

New files:
  - tests/unit_tests/fusions/test_rmsnorm_residual_fusion.py
…NVIDIA#2135)

New files:
  - megatron/core/models/mimo/partition/utils.py
  - tests/unit_tests/models/test_mimo_partition.py
New files:
  - megatron/core/resharding/transforms.py
  - tests/unit_tests/resharding/test_mxfp8_refit.py
… to ModelOpt examples (NVIDIA#3805)

New files:
  - examples/post_training/modelopt/conf/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16.sh
…layers (NVIDIA#3594)

New files:
  - megatron/core/ssm/ops/__init__.py
  - megatron/core/ssm/ops/causal_conv1d_triton.py
  - megatron/core/ssm/ops/mamba_ssm.py
…VIDIA#3817)

New files:
  - tests/unit_tests/ssm/test_causal_conv1d_triton.py
…rmer-impl inference_optimized (NVIDIA#3851)

New files:
  - megatron/core/inference/symmetric_memory.py
…ort w/ cuda graphed + inference_optimized MoEs (NVIDIA#3858)

New files:
  - megatron/core/inference/moe/__init__.py
  - megatron/core/inference/moe/activations.py
  - megatron/core/inference/moe/fused_moe.py
  - megatron/core/inference/moe/pad.py
  - megatron/core/inference/moe/permute.py
  - megatron/core/inference/quantization/mxfp8_quantize.py
  - tests/unit_tests/inference/test_moe_permute.py
  - tests/unit_tests/inference/test_mxfp8_utils.py
New files:
  - tests/unit_tests/test_lion_optimizer.py
…edule (NVIDIA#3129)

New files:
  - tests/unit_tests/pipeline_parallel/test_multimodule_schedules.py
…#3225)

New files:
  - megatron/core/inference/contexts/kv_block_allocator.py
  - megatron/core/inference/contexts/mamba_slot_allocator.py
  - megatron/core/ssm/ops/causal_conv1d_varlen.py
  - megatron/core/ssm/ops/determinism.py
  - megatron/core/ssm/ops/ssd_bmm.py
  - megatron/core/ssm/ops/ssd_chunk_scan.py
  - megatron/core/ssm/ops/ssd_chunk_state.py
  - megatron/core/ssm/ops/ssd_combined.py
  - megatron/core/ssm/ops/ssd_state_passing.py
  - tests/unit_tests/inference/engines/test_mamba_prefix_caching_e2e.py
  - tests/unit_tests/ssm/ops/test_causal_conv1d_varlen.py
  - tests/unit_tests/ssm/ops/test_ops_init.py
  - tests/unit_tests/ssm/ops/test_ssd_bmm.py
  - tests/unit_tests/ssm/ops/test_ssd_chunk_scan.py
  - tests/unit_tests/ssm/ops/test_ssd_chunk_state.py
  - tests/unit_tests/ssm/ops/test_ssd_combined.py
  - tests/unit_tests/ssm/ops/test_ssd_state_passing.py
  - tests/unit_tests/ssm/ops/test_ssm_kernel.py
New files:
  - tests/unit_tests/rl/test_grouped_rollouts.py
ilml requested review from a team as code owners March 20, 2026 18:31

copy-pr-bot bot commented Mar 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

ilml commented Mar 20, 2026

/ok to test b18c7a6

These test files are removed from this PR because they import symbols that existing modules only gain in Phase 2:
- test_rmsnorm_residual_fusion.py: imports TEFusedResidualRMSNorm (added in NVIDIA#3384)
- test_mup.py: imports get_mup_config_overrides (added in NVIDIA#3058)
- test_multimodule_schedules.py: imports MultiModuleProcessGroupCollection (added in NVIDIA#3129)

They will be re-added in Phase 2 when the corresponding code changes land.

Made-with: Cursor

These test files import symbols from existing modules that are only
added in Phase 2 commits:

- test_dynamic_prefix_caching.py: PrefixCachingEvictionPolicy, HASH_PRIME
- test_mamba_prefix_caching_e2e.py: PrefixCachingEvictionPolicy
- test_dynamic_prefix_caching_coordinator.py: PrefixCachingCoordinatorPolicy
- test_moe_inference.py: are_tensors_nvls_eligible, InferenceTopKRouter
- test_grouped_rollouts.py: RolloutGroup, ReturnsRaw
- test_lion_optimizer.py: HAVE_LION
- test_vision_cuda_graphs.py: VisionTECudaGraphHelper, HAVE_TE_GRAPHS

They will be re-added in Phase 2 with their corresponding code changes.
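
A static check along the following lines could flag this kind of dependency before CI runs. It is a sketch under stated assumptions, not the method used here: it compares the names a test file pulls in via `from module import ...` against the names bound at the top level of that module's source. The function names and the module path `pkg.caching` in the usage note are hypothetical, and names defined inside `try`/`except` or `if` blocks are not detected.

```python
import ast


def imported_names(test_source: str) -> dict[str, set[str]]:
    """Map each module in `from module import a, b` to the names imported from it."""
    result: dict[str, set[str]] = {}
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            result.setdefault(node.module, set()).update(a.name for a in node.names)
    return result


def top_level_names(module_source: str) -> set[str]:
    """Names bound directly at module level: defs, classes, assignments, imports."""
    names: set[str] = set()
    for node in ast.parse(module_source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            names.add(node.target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            names.update((a.asname or a.name).split(".")[0] for a in node.names)
    return names


def missing_names(test_source: str, module_name: str, module_source: str) -> list[str]:
    """Names the test imports from `module_name` that the module does not define."""
    wanted = imported_names(test_source).get(module_name, set()) - {"*"}
    return sorted(wanted - top_level_names(module_source))
```

For example, with a test source of `from pkg.caching import PrefixCachingEvictionPolicy, HASH_PRIME` and a module source that only assigns `HASH_PRIME`, `missing_names` reports `PrefixCachingEvictionPolicy`.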

Made-with: Cursor
ilml commented Mar 20, 2026

/ok to test ab305c3
