
Conversation

@asolergi-nv (Contributor)

  • Set no_dist=True during checkpoint load to remove cross-rank synchronization.
  • Use torch.distributed.checkpoint.state_dict_loader.load instead of the deprecated torch.distributed.checkpoint.state_dict_loader.load_state_dict.

@asolergi-nv asolergi-nv requested review from a team as code owners January 8, 2026 09:40

copy-pr-bot bot commented Jan 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@github-actions github-actions bot requested a review from Phlip79 January 8, 2026 09:40
