Describe the bug
We have seen a segfault during runtime finalization when using UCX + Open MPI 5.
To Reproduce
Steps to reproduce the behavior:
- Create a conda env with Open MPI 5.0.8
conda create -n ompi5
conda activate ompi5
conda install conda-forge::openmpi
- Compile Realm with UCX enabled
cmake ../ -DREALM_ENABLE_UCX=ON -DREALM_ENABLE_CUDA=OFF -DREALM_ENABLE_HIP=OFF -DREALM_ENABLE_OPENMP=OFF -DREALM_ENABLE_PYTHON=ON -DREALM_ENABLE_HDF5=OFF -DREALM_BUILD_TESTS=ON
- Run any of the Realm programs. I picked memspeed and removed everything except runtime init and shutdown (a minimal sketch follows the steps below).
memspeed
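For reference, here is a minimal sketch of the reduced test, assuming only runtime init and shutdown are kept (this is not the exact stripped-down memspeed code):

```cpp
// Minimal reproducer sketch -- assumes the stripped-down memspeed keeps only
// runtime init and shutdown; not the exact code from the test.
#include <realm.h>

int main(int argc, char **argv)
{
  Realm::Runtime rt;
  rt.init(&argc, &argv);          // brings up the network layer / MPI bootstrap
  rt.shutdown();                  // request shutdown
  return rt.wait_for_shutdown();  // the segfault is observed on this finalization path
}
```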
Expected behavior
Here is the backtrace:
Thread 6 (Thread 0x7fffe0fddc00 (LWP 1607363) "memspeed" (Exiting)):
#0 0x00007ffff73af5b0 in ?? ()
#1 0x00007ffff7894be1 in advise_stack_range (guardsize=<optimized out>, pd=140736968121344, size=<optimized out>, mem=0x7fffe0edb000) at ./nptl/allocatestack.c:195
#2 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:551
#3 0x00007ffff79268c0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 5 (Thread 0x7ffff01a2c00 (LWP 1607362) "memspeed" (Exiting)):
#0 0x00007ffff73af5b0 in ?? ()
#1 0x00007ffff7894be1 in advise_stack_range (guardsize=<optimized out>, pd=140737221635072, size=<optimized out>, mem=0x7ffff00a0000) at ./nptl/allocatestack.c:195
#2 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:551
#3 0x00007ffff79268c0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 3 (Thread 0x7ffff13ff640 (LWP 1607357) "cuda0000380000f"):
#0 0x00007ffff7918c3f in __GI___poll (fds=0x555555f15e50, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1 0x00007ffff284164f in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff290f18f in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff283c233 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff7894ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5 0x00007ffff79268c0 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 1 (Thread 0x7ffff7ab3c00 (LWP 1607353) "memspeed"):
#0 __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=1607363, futex_word=0x7fffe0fdded0) at ./nptl/futex-internal.c:57
#1 __futex_abstimed_wait_common (cancel=true, private=128, abstime=0x0, clockid=0, expected=1607363, futex_word=0x7fffe0fdded0) at ./nptl/futex-internal.c:87
#2 __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7fffe0fdded0, expected=1607363, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=128) at ./nptl/futex-internal.c:139
#3 0x00007ffff7896624 in __pthread_clockjoin_ex (threadid=140736968121344, thread_return=0x0, clockid=0, abstime=0x0, block=<optimized out>) at ./nptl/pthread_join_common.c:105
#4 0x000055555577b2cb in Realm::KernelThread::join (this=0x555555fadf60) at /home/weiwu/realm/src/realm/threads.cc:1062
#5 0x0000555555aaf17d in Realm::BackgroundWorkThread::join (this=0x555555d8ced0) at /home/weiwu/realm/src/realm/bgwork.cc:154
#6 0x0000555555ab0318 in Realm::BackgroundWorkManager::stop_dedicated_workers (this=0x555555d8d300) at /home/weiwu/realm/src/realm/bgwork.cc:335
#7 0x0000555555707b4d in Realm::RuntimeImpl::wait_for_shutdown (this=0x555555d8d020) at /home/weiwu/realm/src/realm/runtime_impl.cc:2826
#8 0x00005555556fc998 in Realm::Runtime::wait_for_shutdown (this=0x7fffffffde40) at /home/weiwu/realm/src/realm/runtime_impl.cc:734
#9 0x000055555559782a in main (argc=5, argv=0x7fffffffdf78) at /home/weiwu/realm/tests/memspeed.cc:584
If we remove the MPI_Finalize call, the segfault is gone.
Here is a branch that can reproduce the segfault even without UCX: https://github.com/StanfordLegion/realm/commits/debug-mpi
d4d4718
In this branch, we explicitly initialize the MPI bootstrap during runtime initialization and close it during finalization; we can then reproduce the segfault using
memspeed -ll:networks none
The MPI bootstrap dlopens the MPI wrapper, which calls MPI_Init_thread and MPI_Finalize. If we replace the dlopen with direct calls to MPI, the error goes away. A sketch of the two paths is shown below.
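To make the difference concrete, here is an illustrative sketch of the two paths. The library name realm_bootstrap_mpi.so and the symbol realm_bootstrap_init are placeholders, not the actual names used in Realm:

```cpp
// Illustrative sketch only; the library and symbol names below are placeholders,
// not the actual Realm bootstrap plugin names.
#include <dlfcn.h>
#include <mpi.h>
#include <cstdio>

typedef int (*bootstrap_init_fn)(int *, char ***);

// Path A: the bootstrap dlopens an MPI wrapper, and MPI_Init_thread /
// MPI_Finalize are invoked from inside the dlopen'ed shared object.
// This is the path that ends in the segfault at finalization.
int init_bootstrap_via_dlopen(int *argc, char ***argv)
{
  void *handle = dlopen("realm_bootstrap_mpi.so", RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return -1;
  }
  bootstrap_init_fn init =
      reinterpret_cast<bootstrap_init_fn>(dlsym(handle, "realm_bootstrap_init"));
  return init ? init(argc, argv) : -1;
}

// Path B: call MPI directly from the main binary; with this variant the
// segfault does not occur.
int init_bootstrap_direct(int *argc, char ***argv)
{
  int provided = 0;
  return MPI_Init_thread(argc, argv, MPI_THREAD_MULTIPLE, &provided);
}
```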