
[CI] Disable auto CUDA arch injection to avoid duplicate gencode flags#7513

Open
EmmonsCurse wants to merge 2 commits into PaddlePaddle:develop from EmmonsCurse:fix_build_error_in_90or100

Conversation

@EmmonsCurse
Collaborator

@EmmonsCurse EmmonsCurse commented Apr 20, 2026

Motivation

Recent changes in Paddle #78704 modified the behavior of CUDAExtension, introducing automatic CUDA architecture flag injection via PADDLE_CUDA_ARCH_LIST even when custom -gencode flags are already specified.

This results in duplicated CUDA arch flags during compilation, increasing binary size and potentially causing linker errors such as:

  • relocation truncated to fit

To maintain stable builds and avoid unnecessary code generation, a workaround is required.

Modifications

  • Patched extension_utils._get_cuda_arch_flags to return an empty list when user-defined -gencode flags are detected, preventing Paddle from auto-injecting CUDA arch flags.
  • Added a secondary safeguard by overriding CUDAExtension._add_cuda_arch_flags to ensure no additional arch flags are appended internally.
  • Explicitly controlled CUDA architecture flags via get_gencode_flags, avoiding reliance on PADDLE_CUDA_ARCH_LIST.
  • Effectively disabled Paddle’s automatic CUDA arch injection mechanism to prevent duplicated -gencode entries.
  • Ensured correct generation of arch=compute_xxa,code=sm_xxa pairs (e.g., 90a, 100a) and avoided incomplete flags like arch=compute_90a.
  • Reduced the risk of compilation and linking issues (e.g., relocation overflow) caused by conflicting or duplicated CUDA arch flags.
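
The patching steps above can be sketched as follows. This is a hedged illustration, not the PR's actual diff: the real code lives in custom_ops/setup_ops.py and patches paddle.utils.cpp_extension.extension_utils; here a stand-in namespace replaces that module and _original_get_cuda_arch_flags is a placeholder for Paddle's own implementation, so the sketch runs without Paddle installed.

```python
# Hedged sketch of the patch described in this PR. A stand-in namespace
# replaces paddle.utils.cpp_extension.extension_utils so the sketch runs
# without Paddle installed; the real code is in custom_ops/setup_ops.py.
import types

# Stand-in for paddle.utils.cpp_extension.extension_utils (assumption).
extension_utils = types.SimpleNamespace()

def _original_get_cuda_arch_flags(cflags):
    # Placeholder for Paddle's implementation, which would inject flags
    # derived from PADDLE_CUDA_ARCH_LIST (illustrative output only).
    return ["-gencode", "arch=compute_90,code=sm_90"]

extension_utils._get_cuda_arch_flags = _original_get_cuda_arch_flags

def get_gencode_flags(archs):
    """Build explicit arch/code pairs, e.g. 90a -> arch=compute_90a,code=sm_90a."""
    flags = []
    for arch in archs:
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags

def _patched_get_cuda_arch_flags(cflags):
    # If the caller already passed -gencode flags, inject nothing extra,
    # preventing duplicated arch flags on the final nvcc command line.
    if cflags:
        for flag in cflags:
            if isinstance(flag, str) and flag.startswith("-gencode"):
                return []
    return _original_get_cuda_arch_flags(cflags)

extension_utils._get_cuda_arch_flags = _patched_get_cuda_arch_flags
```

With the patch in place, passing the output of get_gencode_flags(["90a"]) as nvcc flags leaves exactly one arch=compute_90a,code=sm_90a pair in the build, since the patched hook then returns an empty list instead of re-injecting arch flags.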

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests, or state the reason in this PR if none are added.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 20, 2026

Thanks for your contribution!

@EmmonsCurse
Collaborator Author

EmmonsCurse commented Apr 20, 2026

/skip-ci ci_iluvatar
/skip-ci ci_hpu
/skip-ci stable_test
/skip-ci base_test
/skip-ci pre_ce_test

PaddlePaddle-bot

This comment was marked as outdated.


@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 AI Code Review | 2026-04-20 17:10:08

📋 Review Summary

PR overview: monkey-patches Paddle's _get_cuda_arch_flags function so that the automatic CUDA arch flag injection introduced by Paddle PR #78704 no longer produces duplicate gencode flags and linker errors.
Scope of change: custom_ops/setup_ops.py (build configuration)
Impact tags: CI, OP

📝 PR Convention Check

The Modifications section of the PR description does not fully match the actual implementation: it states "Overrode PADDLE_CUDA_ARCH_LIST by setting it to an empty string in get_gencode_flags", but the implementation monkey-patches extension_utils._get_cuda_arch_flags; it neither modifies the get_gencode_flags function itself nor sets the environment variable. Consider updating the description to reflect the actual approach.

Issues

Severity | File | Summary
🟡 Suggestion | custom_ops/setup_ops.py:63 | the "second line of defense" safeguard is most likely dead code
🟡 Suggestion | custom_ops/setup_ops.py:52 | substring matching of flags risks false matches

Overall Assessment

This PR resolves the duplicate gencode flag problem caused by the upstream Paddle change via a monkey patch. The core patch logic (_patched_get_cuda_arch_flags) is sound and thoroughly commented. Two minor suggestions: 1) the "second line of defense" block may never take effect and should be confirmed or removed; 2) the flag-matching condition could be tightened to avoid potential false matches.

Comment thread custom_ops/setup_ops.py
# Additional safeguard (important):
# Some Paddle versions may have additional internal methods that add gencode flags.
# This patch serves as a second line of defense by overriding such methods.
if hasattr(extension_utils, "CUDAExtension"):

🟡 Suggestion: the extension_utils module is unlikely to have a CUDAExtension attribute.

CUDAExtension is a function/class imported from paddle.utils.cpp_extension, not an attribute of the extension_utils module. hasattr(extension_utils, "CUDAExtension") is therefore most likely False, so this "second line of defense" is effectively dead code and provides no protection.

Suggestions:

  1. Confirm whether the extension_utils module in the relevant Paddle version actually exposes a CUDAExtension attribute; if not, remove this block to avoid confusion;
  2. If extra protection is genuinely needed, consider patching paddle.utils.cpp_extension.CUDAExtension itself.
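
The second suggestion could look roughly like the following. This is a hedged sketch, not code from the PR: a stand-in CUDAExtension is defined so it runs without Paddle, and deduplicating -gencode pairs inside a wrapper is one plausible way to guard the real paddle.utils.cpp_extension.CUDAExtension.

```python
# Hedged sketch: wrap the CUDAExtension entry point itself instead of
# probing extension_utils. A stand-in CUDAExtension is defined here so
# the sketch runs without Paddle installed (assumption).

def CUDAExtension(sources, extra_compile_args=None):
    # Stand-in for paddle.utils.cpp_extension.CUDAExtension; the real one
    # returns a setuptools Extension-like object.
    return {"sources": sources, "extra_compile_args": extra_compile_args or {}}

_original_cuda_extension = CUDAExtension

def _wrapped_cuda_extension(sources, extra_compile_args=None):
    ext = _original_cuda_extension(sources, extra_compile_args)
    nvcc = ext["extra_compile_args"].get("nvcc", [])
    # Drop duplicated "-gencode arch=...,code=..." pairs, keeping the
    # first occurrence of each and preserving the order of other flags.
    deduped, seen, i = [], set(), 0
    while i < len(nvcc):
        if nvcc[i] == "-gencode" and i + 1 < len(nvcc):
            pair = (nvcc[i], nvcc[i + 1])
            if pair not in seen:
                seen.add(pair)
                deduped += [nvcc[i], nvcc[i + 1]]
            i += 2
        else:
            deduped.append(nvcc[i])
            i += 1
    ext["extra_compile_args"]["nvcc"] = deduped
    return ext

CUDAExtension = _wrapped_cuda_extension
```

This variant guards at the point the build script actually calls, so it cannot silently become dead code the way a hasattr probe on the wrong module can.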

Comment thread custom_ops/setup_ops.py
"""
if cflags:
for flag in cflags:
if isinstance(flag, str) and (flag.startswith("-gencode") or "compute_" in flag or "sm_" in flag):

🟡 Suggestion: the flag-detection logic risks false matches.

"compute_" in flag and "sm_" in flag are substring matches and can match flags that are not gencode flags (for example an -I include path containing sm_ or compute_). The probability is low in the current scenario, but a general-purpose patch function could be more precise.

Consider tightening the condition to match only the -gencode and -arch prefixes:

if isinstance(flag, str) and (flag.startswith("-gencode") or flag.startswith("-arch")):
    return []
