Make MPI.jl compatible with profilers that use LD_PRELOAD #450
explicitly give library in ccall's
Thanks for the contribution! Looks like we just need to figure out why this is not working on Windows. |
@staticfloat is it possible to |
@vtjnash correct me if I'm wrong, but I do believe that Windows doesn't support the concept of I think the best way to support both of these is to have a macro that generates either |
It can be emulated (with Libdl.dllist—LLVMSupport even has an implementation), but not recommended and not fast. Darwin has RTLD_GLOBAL (though not default), and can do a |
What is the state of this? Stacktrace:
|
#451 got merged so this PR has been superseded. What ABI are you setting? |
I'm using MPICH ABI.
But I actually progressed something. Apparently I had an old MPI.jl release because I'm using MPIClusterManagers. After forcing MPI.jl to the
|
Can you run |
@simonbyrne The problem persists.
Furthermore, if I force Extrae to intercept MPI calls (for recording callstack), the old segmentation fault reappears.
|
The error looks like it can't write to a specific file. Are you able to run it on a single process? |
Nope, I receive the same error. |
See #444. Use of `ccall((func, lib), ...)` is not compatible with applications that use `LD_PRELOAD` to inject their own code for profiling and tracing. Darshan is an example of an application that does this for MPI and HDF5 I/O profiling.

This PR has changes to explicitly load the `libmpi` shared object with `Libdl.dlopen` in `MPI.__init__`. Most `ccall` statements are changed to not specify the MPI library. There are some `ccall` statements in `implementations.jl` that I left alone because they get called before `MPI.__init__` and all they do is determine MPI version and implementation details.

I tested on NERSC Cori Haswell nodes. MPI tests pass except for `test_spawn.jl`, but this fails with the released MPI.jl too. I was able to get Darshan to work with this PR.

Happy New Year too!
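For readers unfamiliar with the two `ccall` forms, here is a minimal sketch of the difference, using `strlen` from the C library as a stand-in for an MPI function so that it runs without MPI installed. It is not the code from this PR. The library name `libc.so.6` (Linux only), the helper names `explicit_len`, `global_len` and `load_globally`, and the `RTLD_LAZY | RTLD_GLOBAL` flags are all assumptions made for illustration.

```julia
using Libdl

# Assumption: a Linux system where the C library is available as "libc.so.6".
const libc = "libc.so.6"

# Form 1: the library is named in the ccall. The symbol is looked up through
# that library's own handle, which is why (as described above) a wrapper
# injected with LD_PRELOAD is not picked up.
explicit_len(s) = ccall((:strlen, libc), Csize_t, (Cstring,), s)

# Form 2: load the library into the global namespace once, as an __init__
# function might, then name only the symbol in the ccall.
# The flags here are an assumption, not taken from the PR.
function load_globally()
    return Libdl.dlopen(libc, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)
end

global_len(s) = ccall(:strlen, Csize_t, (Cstring,), s)

load_globally()
@assert explicit_len("darshan") == global_len("darshan")
```

Without a preloaded wrapper both forms call the same function and return the same result. They would only differ when a tool such as Darshan preloads its own definition of the symbol.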