[PG][EP] Fix engine.so runtime dependency#2255
Conversation
Signed-off-by: Mohammad Miadh Angkad <176301910+mmangkad@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request updates the linker arguments in mooncake-ep/setup.py and mooncake-pg/setup.py to use --push-state,--no-as-needed and --pop-state when linking engine.so. The reviewer pointed out that --push-state and --pop-state are not supported by the gold linker, which could lead to build failures. They recommended a more portable approach of explicitly toggling --no-as-needed and --as-needed instead.
| "-Wl,--push-state,--no-as-needed", | ||
| "-l:engine.so", | ||
| "-Wl,--pop-state", |
There was a problem hiding this comment.
Using --push-state and --pop-state can cause build failures if the build environment is configured to use the gold linker (e.g., via -fuse-ld=gold), as gold does not support these state-saving options.
A more portable and widely compatible approach to temporarily disable --as-needed for a specific library is to explicitly toggle it off and on using "--no-as-needed" and "--as-needed". Since --as-needed is the default behavior for modern toolchains, restoring it explicitly is safe and highly compatible across GNU ld, gold, and lld.
| "-Wl,--push-state,--no-as-needed", | |
| "-l:engine.so", | |
| "-Wl,--pop-state", | |
| "-Wl,--no-as-needed", | |
| "-l:engine.so", | |
| "-Wl,--as-needed", |
There was a problem hiding this comment.
I prefer keeping this because push-state/pop-state restores the prior linker state exactly, and Mooncake’s CI/release path does not appear to use old gold, but I can switch to --no-as-needed/--as-needed if maintainers prefer that compatibility tradeoff.
| "-Wl,--push-state,--no-as-needed", | ||
| "-l:engine.so", | ||
| "-Wl,--pop-state", |
There was a problem hiding this comment.
Using --push-state and --pop-state can cause build failures if the build environment is configured to use the gold linker (e.g., via -fuse-ld=gold), as gold does not support these state-saving options.
A more portable and widely compatible approach to temporarily disable --as-needed for a specific library is to explicitly toggle it off and on using "--no-as-needed" and "--as-needed". Since --as-needed is the default behavior for modern toolchains, restoring it explicitly is safe and highly compatible across GNU ld, gold, and lld.
| "-Wl,--push-state,--no-as-needed", | |
| "-l:engine.so", | |
| "-Wl,--pop-state", | |
| "-Wl,--no-as-needed", | |
| "-l:engine.so", | |
| "-Wl,--as-needed", |
Description
This fixes the torch 2.12 Mooncake EP import failure where the built
ep_2_12_0extension does not recordengine.soas a direct runtime dependency.The failure mode is an import-time unresolved symbol from glog:
I checked the installed Mooncake
0.3.11.post1wheel and found that the torch 2.12 EP artifact is the only EP artifact missing the directengine.sodependency:ep_2_12_0still has unresolved glog references:Directly importing the exact torch 2.12 EP module fails:
Preloading
mooncake.enginewithRTLD_GLOBALmakes the same import pass, which confirms the missing dependency is the issue:Mooncake already links EP/PG with
-l:engine.so, but in torch 2.12 builds that dependency can be dropped by linker--as-neededbehavior. This PR wraps only-l:engine.sowith:This preserves the existing linker state for all other libraries while forcing
engine.soto remain a directDT_NEEDEDdependency, matching the working EP wheel contract from torch 2.9/2.10/2.11.Module
mooncake-transfer-engine)mooncake-store)mooncake-ep)mooncake-integration)mooncake-p2p-store)mooncake-wheel)mooncake-pg)mooncake-rl)Type of Change
How Has This Been Tested?
Verified the current installed wheel behavior before the fix:
Observed that
ep_2_12_0is missingNEEDED engine.so, whileep_2_9_1,ep_2_10_0, andep_2_11_0include it.Verified the import failure and workaround:
fails with:
and:
passes.
Expected post-build verification:
should include:
Checklist
./scripts/code_format.shbefore submitting.