Commit 635e120

[PGO][HIP] Stop pulling ROCm.o into every PGO host link (llvm#200101)
PR llvm#177665 added an unconditional `extern` reference to `__llvm_profile_hip_collect_device_data` from `InstrProfilingFile.c`, which forces `InstrProfilingPlatformROCm.o` (and its sanitizer_common / interception dependencies) out of `libclang_rt.profile.a` in every PGO binary. That breaks bots without `-lpthread` and races dlsym/PLT state in non-HIP programs via the interceptor constructor.

Fix:
- Declare the hook `COMPILER_RT_WEAK` and gate the call on its address. No `COMPILER_RT_VISIBILITY`: a hidden weak-undef function would be non-preemptible and the address test would fold to true.
- Gate `installHipModuleInterceptors` on `dlsym(hipModuleLoad)` so the constructor is a no-op if `ROCm.o` is still pulled in.

Fixes:
- https://lab.llvm.org/buildbot/#/builders/66/builds/31311
- https://lab.llvm.org/buildbot/#/builders/174/builds/36180

Verified:
- `check-profile` 134/134 pass.
- `nm` on a non-HIP `clang -fprofile-generate` binary: zero `installHip`/`ROCm`/`sanitizer`/`hip_collect` symbols.
- HIP offload PGO end-to-end on gfx1101 (compile → run → `llvm-profdata merge` → `llvm-cov`) still works; interceptor installs, device profile collected via shared API.
1 parent 2d5dac5 commit 635e120

2 files changed

Lines changed: 29 additions & 5 deletions

compiler-rt/lib/profile/InstrProfilingFile.c

Lines changed: 22 additions & 5 deletions
@@ -41,9 +41,20 @@
 #include "InstrProfilingPort.h"
 #include "InstrProfilingUtil.h"
 
-/* HIP / offload collection hook implemented in InstrProfilingPlatformROCm.c.
- * It is a no-op when no offload profile data was registered. */
+/* Weak so non-HIP programs do not force InstrProfilingPlatformROCm.o (and its
+ * transitive sanitizer_common / interception dependencies) into the host link
+ * out of libclang_rt.profile.a. HIP programs emit strong references to other
+ * ROCm-runtime symbols (e.g. __llvm_profile_offload_register_shadow_variable)
+ * that pull in the strong definition.
+ * No COMPILER_RT_VISIBILITY: a hidden weak-undefined symbol is non-preemptible
+ * and the address test at the call site would fold to true.
+ * Windows: __declspec(selectany) is data-only, and the ROCm interceptor path
+ * is not used there, so keep the original strong extern. */
+#if defined(_WIN32)
 extern int __llvm_profile_hip_collect_device_data(void);
+#else
+__attribute__((weak)) int __llvm_profile_hip_collect_device_data(void);
+#endif
 
 /* From where is profile name specified.
  * The order the enumerators define their
@@ -1202,10 +1213,16 @@ int __llvm_profile_write_file(void) {
   if (rc)
     PROF_ERR("Failed to write file \"%s\": %s\n", Filename, strerror(errno));
 
-  /* No-op when no HIP shadow variables or dynamic modules are registered,
-   * or when the HIP runtime is not loaded. Warning on failure is handled
-   * inside the callee so non-HIP programs do not see spurious noise. */
+  /* On non-Windows the declaration is weak: only invoked when
+   * InstrProfilingPlatformROCm.o is in the link, which happens when the program
+   * references other ROCm-runtime symbols (HIP-with-PGO). Warning on failure is
+   * handled inside the callee. */
+#if defined(_WIN32)
   (void)__llvm_profile_hip_collect_device_data();
+#else
+  if (&__llvm_profile_hip_collect_device_data)
+    (void)__llvm_profile_hip_collect_device_data();
+#endif
 
   // Restore SIGKILL.
   if (PDeathSig == 1)
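
The weak-declaration-plus-address-test pattern used in this file can be shown in isolation. The following is a minimal sketch, assuming an ELF target and a GCC- or Clang-compatible compiler; `optional_collect_hook` and `call_hook_if_linked` are hypothetical names for illustration and are not part of compiler-rt.

```c
/* Hypothetical optional hook. It is declared weak and deliberately left
 * undefined here, so the linker resolves its address to null instead of
 * failing the link or pulling an archive member in to satisfy it. */
__attribute__((weak)) int optional_collect_hook(void);

/* Calls the hook only when some object in the link defines it.
 * Returns the hook's result, or -1 when no definition is present. */
int call_hook_if_linked(void) {
  if (&optional_collect_hook)
    return optional_collect_hook();
  return -1;
}
```

When another object in the link provides a strong definition of the hook, the same call site invokes it with no source change. Note that the declaration carries default visibility, matching the reasoning in the comment above about hidden weak-undefined symbols.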

compiler-rt/lib/profile/InstrProfilingPlatformROCm.cpp

Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,7 @@ extern "C" {
 #define WIN32_LEAN_AND_MEAN
 #include <windows.h>
 #else
+#include <dlfcn.h>
 #include <pthread.h>
 #endif
 
@@ -878,6 +879,12 @@ INTERCEPTOR(int, hipModuleUnload, void *module) {
 }
 
 __attribute__((constructor)) static void installHipModuleInterceptors() {
+  /* Skip when the HIP runtime is not loaded. INTERCEPT_FUNCTION uses the
+   * sanitizer interception framework, which can perturb dlsym/PLT state for
+   * the rest of the process even when the target symbol is absent; non-HIP
+   * programs linked with libclang_rt.profile.a must see zero side effects. */
+  if (!dlsym(RTLD_DEFAULT, "hipModuleLoad"))
+    return;
   if (!INTERCEPT_FUNCTION(hipModuleLoad))
     return;
   if (isVerboseMode())
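
The runtime-presence gate in the constructor can be sketched on its own. This is an illustrative variation, not the code from the commit: it obtains a handle with `dlopen(NULL, ...)` instead of using `RTLD_DEFAULT`, so that it compiles without `_GNU_SOURCE`. `symbol_is_loaded` and `install_hooks_if_runtime_present` are hypothetical helpers, and on older glibc the program may need to link with `-ldl`.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if `name` resolves through the main program's global symbol
 * scope, 0 otherwise. */
int symbol_is_loaded(const char *name) {
  void *self = dlopen(NULL, RTLD_LAZY); /* handle for the main program */
  if (!self)
    return 0;
  int found = dlsym(self, name) != NULL;
  dlclose(self);
  return found;
}

/* Hypothetical installer: returns 1 when hooks would be installed and 0 when
 * the runtime is absent and the process is left untouched. */
int install_hooks_if_runtime_present(void) {
  if (!symbol_is_loaded("hipModuleLoad"))
    return 0;
  /* A real implementation would install its interceptors here. */
  return 1;
}
```

The early return keeps every side effect behind the presence check, which is the property the commit wants for non-HIP programs.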
