fix cg access issue by using dict instead of list to iteratively access it by ilml · Pull Request #3867 · NVIDIA/Megatron-LM

ilml · 2026-03-13T21:28:15Z

Fix nested CUDA-graph attribute access for Flex MoE token dispatchers.

Summary

- Use dotted-path helpers for `token_dispatcher.cudagraph_attrs`, so that entries such as `_comm_manager.token_probs`, `_comm_manager.token_indices`, and `_comm_manager.routing_map` are handled correctly.
- Replace flat `getattr` / `setattr` usage in the MoE CUDA-graph capture and replay paths with shared nested-attribute helpers.
- Add unit tests covering nested cudagraph attribute reads and writes.

This keeps existing flat attribute handling unchanged while fixing Flex dispatcher backends such as deepep and hybridep.
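For reviewers who have not opened the diff, here is a minimal sketch of what dotted-path helpers of this kind might look like. The names `get_nested_attr` and `set_nested_attr`, the `SimpleNamespace` stand-in for a dispatcher, and the dict-based save/restore loop are all illustrative assumptions, not the code in this PR.

```python
from functools import reduce
from types import SimpleNamespace


def get_nested_attr(obj, path):
    """Resolve a dotted path such as "_comm_manager.token_probs" on obj."""
    return reduce(getattr, path.split("."), obj)


def set_nested_attr(obj, path, value):
    """Set the attribute named by a dotted path, walking to its parent first."""
    parent_path, _, leaf = path.rpartition(".")
    parent = get_nested_attr(obj, parent_path) if parent_path else obj
    setattr(parent, leaf, value)


# Hypothetical stand-in for a Flex token dispatcher (the real one holds tensors).
dispatcher = SimpleNamespace(
    hidden_shape=(4, 8),
    _comm_manager=SimpleNamespace(token_probs=None, routing_map=None),
)
dispatcher.cudagraph_attrs = ["hidden_shape", "_comm_manager.token_probs"]

# "Capture": snapshot each listed attribute into a dict keyed by its path.
set_nested_attr(dispatcher, "_comm_manager.token_probs", [0.7, 0.3])
saved = {name: get_nested_attr(dispatcher, name) for name in dispatcher.cudagraph_attrs}

# "Replay": write the saved values back through the same dotted paths.
dispatcher._comm_manager.token_probs = None
for name, value in saved.items():
    set_nested_attr(dispatcher, name, value)
```

A flat name like `hidden_shape` goes through the same helpers unchanged, because a path with no dot resolves directly on the object. A plain `getattr(dispatcher, "_comm_manager.token_probs")` would raise `AttributeError`, which is the failure mode the nested helpers avoid.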



copy-pr-bot · 2026-03-13T21:28:19Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


jiemingz · 2026-03-16T14:36:06Z

https://github.com/NVIDIA/Megatron-LM/pull/3625/changes

There is an equivalent change on main, why not reflect this PR?

Update: main now has the same code, except that it is missing the test and doesn't refactor the setter and getter into their own functions. When we merge dev into main, I can take an action item to make sure these get into main as well.

yaox12 · 2026-03-16T18:10:16Z

/ok to test bddcdb3

svcnvidia-nemo-ci · 2026-03-16T19:16:48Z