[CI] Disable auto CUDA arch injection to avoid duplicate gencode flags #7513
Open: EmmonsCurse wants to merge 2 commits into PaddlePaddle:develop from EmmonsCurse:fix_build_error_in_90or100
+40
−1
@@ -22,9 +22,48 @@
 from pathlib import Path
 
 import paddle
-from paddle.utils.cpp_extension import CppExtension, CUDAExtension, setup
+from paddle.utils.cpp_extension import (
+    CppExtension,
+    CUDAExtension,
+    extension_utils,
+    setup,
+)
 from setuptools import find_namespace_packages, find_packages
 
+# Workaround for Paddle PR #78704:
+# Paddle 3.5.0.dev20260418+ changed CUDAExtension behavior to auto-add gencode flags
+# based on PADDLE_CUDA_ARCH_LIST even when user provides arch flags in cflags.
+# This causes relocation overflow in large CUDA files (e.g., append_attention.cu).
+#
+# This patch suppresses Paddle's auto-gencode addition when user-provided gencode
+# flags are detected, preventing duplicate architecture specifications.
+_original_get_cuda_arch_flags = extension_utils._get_cuda_arch_flags
+
+
+def _patched_get_cuda_arch_flags(cflags=None):
+    """
+    Patched version that returns empty list when user-provided gencode flags are detected.
+
+    This prevents Paddle from auto-adding duplicate gencode flags based on
+    PADDLE_CUDA_ARCH_LIST, which would cause relocation overflow errors.
+    """
+    if cflags:
+        for flag in cflags:
+            if isinstance(flag, str) and (flag.startswith("-gencode") or "compute_" in flag or "sm_" in flag):
+                return []
+    return _original_get_cuda_arch_flags(cflags)
+
+
+extension_utils._get_cuda_arch_flags = _patched_get_cuda_arch_flags
+
+
+# Additional safeguard (important):
+# Some Paddle versions may have additional internal methods that add gencode flags.
+# This patch serves as a second line of defense by overriding such methods.
+if hasattr(extension_utils, "CUDAExtension"):
+    if hasattr(extension_utils.CUDAExtension, "_add_cuda_arch_flags"):
+        extension_utils.CUDAExtension._add_cuda_arch_flags = lambda self, flags: flags
+
 
 def load_module_from_path(module_name, path):
     """
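The wrap-and-delegate pattern in the hunk above can be exercised without a Paddle install. The sketch below is illustrative only: `fake_extension_utils` and `_fake_get_cuda_arch_flags` are hypothetical stand-ins (the real function's return value is not shown in this diff), and the detection predicate is copied from the patch.

```python
from types import SimpleNamespace


def _fake_get_cuda_arch_flags(cflags=None):
    # Hypothetical stand-in: pretends Paddle derived one gencode pair on its own.
    return ["-gencode", "arch=compute_90,code=sm_90"]


# Stand-in namespace playing the role of extension_utils (assumption, not Paddle).
fake_extension_utils = SimpleNamespace(_get_cuda_arch_flags=_fake_get_cuda_arch_flags)

# Keep a reference to the original so the wrapper can delegate to it.
_original = fake_extension_utils._get_cuda_arch_flags


def _patched(cflags=None):
    # Same predicate as the patch: skip auto-injection if any flag looks arch-related.
    if cflags:
        for flag in cflags:
            if isinstance(flag, str) and (
                flag.startswith("-gencode") or "compute_" in flag or "sm_" in flag
            ):
                return []
    return _original(cflags)


# Rebind the attribute so lookups through the namespace reach the wrapper.
fake_extension_utils._get_cuda_arch_flags = _patched

with_user_arch = fake_extension_utils._get_cuda_arch_flags(
    ["-O3", "-gencode", "arch=compute_80,code=sm_80"]
)
without_user_arch = fake_extension_utils._get_cuda_arch_flags(["-O3"])
no_flags = fake_extension_utils._get_cuda_arch_flags()
```

Rebinding a module attribute only affects callers that look the name up through the module at call time; code that imported the function by name before the patch ran keeps its reference to the original. Whether that applies to Paddle's own call sites is not visible in this diff.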
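🟡 Suggestion: the flag detection logic risks false matches.
`"compute_" in flag` and `"sm_" in flag` use substring matching, so they can match flags that are not gencode flags (for example, a `-I` include path that contains `sm_` or `compute_`). The probability is low in the current scenario, but a general-purpose patch function can be more precise. Consider tightening the match condition, for example by checking only the `-gencode` and `-arch` prefixes: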
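The reviewer's concrete suggestion was not captured on this page. As a hedged sketch of what prefix-only matching could look like, the hypothetical helper `has_user_arch_flags` below (not part of the PR or of Paddle) accepts only flags that begin with a known architecture option. The `-gencode` and `-arch` prefixes come from the comment; the long nvcc spellings `--generate-code` and `--gpu-architecture` are an added assumption.

```python
# Prefixes treated as explicit architecture selection. "-gencode" and "-arch"
# come from the review comment; the long nvcc spellings are an added assumption.
_ARCH_FLAG_PREFIXES = ("-gencode", "-arch", "--generate-code", "--gpu-architecture")


def has_user_arch_flags(cflags):
    """Return True if any flag starts with a known architecture option."""
    if not cflags:
        return False
    return any(
        isinstance(flag, str) and flag.startswith(_ARCH_FLAG_PREFIXES)
        for flag in cflags
    )


def substring_match(cflags):
    """The looser check from the patch, kept here only for comparison."""
    return any(
        isinstance(flag, str)
        and (flag.startswith("-gencode") or "compute_" in flag or "sm_" in flag)
        for flag in (cflags or [])
    )


# An include path whose directory name happens to contain "sm_" (made-up path).
include_only = ["-O3", "-I/opt/sm_kernels/include"]
# A genuine user-supplied architecture specification.
explicit_arch = ["-gencode", "arch=compute_90,code=sm_90"]

loose_on_include = substring_match(include_only)      # substring check fires
strict_on_include = has_user_arch_flags(include_only)  # prefix check does not
strict_on_arch = has_user_arch_flags(explicit_arch)
```

With this predicate, `_patched_get_cuda_arch_flags` would return `[]` only when `has_user_arch_flags(cflags)` is true. The trade-off is that a bare `arch=compute_80,code=sm_80` token is no longer detected by itself, so the check relies on the preceding `-gencode` token being present in the same list.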