Skip to content

Conversation

@ji-huazhong
Copy link
Collaborator

@ji-huazhong ji-huazhong commented Dec 31, 2025

What does this PR do?

As per title.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the import of the optional mindspeed patch to a central location, which is a good improvement for handling optional dependencies. My review points out that this refactoring has made the try...except block in verl/workers/megatron_workers.py redundant. I've included a suggestion to remove this block to simplify the code and improve clarity.

Comment on lines 30 to 33
try:
from mindspeed.megatron_adaptor import repatch
from verl.workers.engine.mindspeed.transformer_impl import repatch
except ImportError:
repatch = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The try...except ImportError block is now redundant. The verl.workers.engine.mindspeed.transformer_impl module handles the case where mindspeed is not installed by setting repatch to None. Therefore, this import will no longer raise an ImportError under normal circumstances, and this try...except block can be removed to improve code clarity.

from verl.workers.engine.mindspeed.transformer_impl import repatch

@ji-huazhong ji-huazhong changed the title fixed the issue of repeatedly importing the mindspeed patch [trainer,megatron] fix: super tiny fix the issue of repeatedly importing the mindspeed patch Dec 31, 2025
@ji-huazhong
Copy link
Collaborator Author

ji-huazhong commented Dec 31, 2025

The failed test cases are caused by network issues and are irrelevant to this PR.

I will fix these flaky CI issues in a subsequent PR.

@vermouth1992 vermouth1992 merged commit 4017d63 into volcengine:main Dec 31, 2025
72 of 76 checks passed
@vermouth1992
Copy link
Collaborator

The failed test cases are caused by network issues and are irrelevant to this PR.

I will fix these flaky CI issues in a subsequent PR.

Those tests have been fixed on main.

@ji-huazhong ji-huazhong deleted the npu branch December 31, 2025 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants