ci: install nvidia-resiliency-ext from source #3861
Add git source for nvidia-resiliency-ext in [tool.uv.sources], mirroring the approach used in NeMo-LM, and update the lockfile accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
c7f8534 to 9a196e1
🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/23064018404
🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/23065690323
🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/23068149121
🔄 Merge queue validation started! You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/23073569002
Summary
- Add nvidia-resiliency-ext to [tool.uv.sources] in pyproject.toml to install from the GitHub source at v0.5.0, mirroring the approach used in NeMo-LM
- Update uv.lock accordingly
Why?
The NVRX team requested this change as part of its migration of checkpointing logic from MLM to NVRX.
Test plan
- uv sync resolves nvidia-resiliency-ext from git source

🤖 Generated with Claude Code