@vx120 vx120 commented Nov 19, 2025

PR type

  • New Feature

PR information

Add an NPU_ENV check and a repatch configuration for Megatron on NPU.

Add NPU_ENV check and repatch configuration for Megatron.
@gemini-code-assist
Contributor

Summary of Changes

Hello @vx120, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces support for NPU (Neural Processing Unit) environments within the Megatron framework. It achieves this by adding NPU environment detection and conditionally applying a specialized repatch configuration to optimize transformer recompute settings when an NPU is present. This enhancement aims to improve Megatron's performance and compatibility on NPU hardware.

Highlights

  • NPU Environment Detection: Implemented a mechanism to detect the presence of an NPU environment by attempting to import mindspeed.megatron_adaptor.repatch and setting a global NPU_ENV flag accordingly.
  • Conditional Megatron Patching: Modified the _patch_megatron function to conditionally apply a specific repatch configuration when running in an NPU environment, overriding transformer recompute settings for granularity, modules, and method.
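The two highlights above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `apply_npu_patch` and its parameters are hypothetical, the import path and the `repatch(config)` call shape are taken from the review comments in this thread, and the dictionary values are copied from the diff under review.

```python
import logging

logger = logging.getLogger(__name__)

# Recompute overrides, copied from the diff discussed in this PR.
NPU_OVERRIDE_TRANSFORMER_CONFIG = {
    "recompute_granularity": None,
    "recompute_modules": ["core_attn"],
    "recompute_method": None,
    "recompute_num_layers": None,
}

# Detection step: treat a successful import as "running on NPU".
try:
    from mindspeed.megatron_adaptor import repatch
    NPU_ENV = True
except ImportError:
    repatch = None
    NPU_ENV = False


def apply_npu_patch(repatch_fn=repatch, npu_env=NPU_ENV):
    """Hypothetical helper: apply the NPU overrides, return True if applied."""
    if not npu_env:
        # Not an NPU environment, nothing to patch.
        return False
    if repatch_fn is None:
        logger.warning("`repatch` is unavailable, skipping NPU patch.")
        return False
    repatch_fn(NPU_OVERRIDE_TRANSFORMER_CONFIG)
    return True
```

Passing the patch function and the flag as parameters is only a convenience for this sketch; it lets the branch logic be exercised on a machine without MindSpeed installed.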
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for NPU environments by conditionally importing mindspeed and applying necessary patches. The logic is sound, but I have a couple of suggestions to enhance code quality and maintainability. Specifically, I recommend using the logger for warning messages instead of print, and defining a static configuration dictionary as a module-level constant.

Comment on lines +682 to +687
override_transformer_config = {
    "recompute_granularity": None,
    "recompute_modules": ["core_attn"],
    "recompute_method": None,
    "recompute_num_layers": None
}
Contributor

medium

The override_transformer_config dictionary is static. It's more efficient and readable to define it as a constant at the module level, outside of the _patch_megatron function.

For example:

# At the top of the file, e.g., after the logger is defined.
_NPU_OVERRIDE_TRANSFORMER_CONFIG = {
    "recompute_granularity": None,
    "recompute_modules": ["core_attn"],
    "recompute_method": None,
    "recompute_num_layers": None,
}

# ...

def _patch_megatron():
    if NPU_ENV:
        if repatch is not None:
            repatch(_NPU_OVERRIDE_TRANSFORMER_CONFIG)
        # ...

Comment on lines +690 to +691
else:
    print('repatch没有正确导入')
Contributor

medium

It's better to use the logger for warnings or errors instead of print(). This ensures consistent logging behavior. Also, using an English message would improve maintainability and consistency with the rest of the codebase.

Suggested change
- else:
-     print('repatch没有正确导入')
+ else:
+     logger.warning('`repatch` from `mindspeed.megatron_adaptor` was not imported correctly, skipping NPU patch.')

if repatch is not None:
    repatch(override_transformer_config)
else:
    print('repatch没有正确导入')
Collaborator

Hello. Please use English comments. 😊

from swift.llm import git_clone_github
from swift.utils import (get_logger, is_flash_attn_3_available, is_megatron_available, safe_ddp_context, split_list,
                         subprocess_run)
try:
Collaborator

Let's add a check here to be on the safe side:

from transformers.utils import is_torch_npu_available
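One way the suggested guard might look is sketched below. It assumes the intent is to attempt the MindSpeed import only when `is_torch_npu_available()` reports an NPU; the helper names `detect_npu` and `load_repatch` are hypothetical and do not come from the PR, and the fallback for a missing `transformers` install is added only so the sketch runs anywhere.

```python
import importlib


def detect_npu():
    """Hypothetical helper: report whether transformers sees an Ascend NPU."""
    try:
        from transformers.utils import is_torch_npu_available
    except ImportError:
        # transformers is not installed, so assume no NPU (sketch-only fallback).
        return False
    return bool(is_torch_npu_available())


def load_repatch(npu_available):
    """Hypothetical helper: import `repatch` only when an NPU was detected."""
    if not npu_available:
        return None
    try:
        module = importlib.import_module("mindspeed.megatron_adaptor")
    except ImportError:
        # NPU present but MindSpeed missing: leave Megatron unpatched.
        return None
    return getattr(module, "repatch", None)


repatch = load_repatch(detect_npu())
NPU_ENV = repatch is not None
```

With this ordering, a machine without an NPU never tries to import MindSpeed at all, which seems to be the safety concern raised in the comment.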

@vx120
Author

vx120 commented Nov 19, 2025 via email

@Jintao-Huang
Collaborator

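Please run the following commands (Python < 3.12) so that the lint checks pass:

pip install pre-commit
pre-commit run --all-files

I also suggest briefly adding the NPU information to the documentation, specifically the Megatron-SWIFT quick start guide.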
