ADIOS1 interfering with WarpX MPI I/O #1195

Open
@RTSandberg

Description

When I run WarpX locally with MPI enabled, it segfaults at the end of the run, during amrex::Finalize. The crash traces back to the ADIOS1 backend that openPMD-api builds when it finds an ADIOS1 installation in the environment.

To Reproduce

# fails with:
cmake -S . -B build -DWarpX_DIMS=RZ -DCMAKE_BUILD_TYPE=Debug
cmake --build build
cd build/bin
./warpx ../../Examples/Physics_applications/laser_acceleration/inputs_rz amrex.throw_exception = 1 amrex.signal_handling = 0

Configuration output:

(warpx-dev) ryansand@m-krasny05 WarpX % cmake -S . -B build -DWarpX_DIMS=RZ -DCMAKE_BUILD_TYPE=Debug
-- Found CCache: /opt/anaconda3/envs/warpx-dev/bin/ccache
-- Downloading AMReX ...
-- AMReX repository: https://github.com/AMReX-Codes/amrex.git (22.02)
-- CMake version: 3.21.3
-- AMReX installation directory: /usr/local
-- Build type set by user to 'Debug'.
-- Building AMReX with AMReX_SPACEDIM = 2
-- Configuring AMReX with the following options enabled: 
--    AMReX_PRECISION = DOUBLE
--    AMReX_MPI
--    AMReX_MPI_THREAD_MULTIPLE
--    AMReX_OMP
--    AMReX_LINEAR_SOLVERS
--    AMReX_PARTICLES
--    AMReX_PARTICLES_PRECISION = DOUBLE
--    AMReX_TINY_PROFILE
-- Found MPI: TRUE (found version "3.1") found components: C CXX 
-- AMReX configuration summary: 
--    Build type               = Debug
--    Install directory        = /usr/local
--    C++ compiler             = /opt/anaconda3/envs/warpx-dev/bin/x86_64-apple-darwin13.4.0-clang++
--    C++ defines              = 
--    C++ flags                = -g -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -std=c++14 -fmessage-length=0 -isystem /opt/anaconda3/envs/warpx-dev/include -fopenmp=libomp
--    C++ include paths        = -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Base/Parser -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Boundary -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/AmrCore -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/LinearSolvers/MLMG -I/Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedamrex-src/Src/Particle -I/opt/anaconda3/envs/warpx-dev/include
--    Link line                = /opt/anaconda3/envs/warpx-dev/lib/libmpi.dylib /opt/anaconda3/envs/warpx-dev/lib/libomp.dylib
-- AMReX: Using version '22.02' (22.02)
-- Downloading PICSAR ...
-- PICSAR repository: https://github.com/ECP-WarpX/picsar.git (15651b072cd9a45a5a5061d8cf7b928d136e39f3)
-- Downloading openPMD-api ...
-- openPMD-api repository: https://github.com/openPMD/openPMD-api.git (0.14.3)
-- Found MPI: TRUE (found version "3.1") found components: CXX 
-- Using the single-header code from /Users/ryansand/Documents/plasma_codes/WarpX/WarpX/build/_deps/fetchedopenpmd-src/share/openPMD/thirdParty/json/single_include/
-- nlohmann-json: Using INTERNAL version '3.9.1'
-- HDF5 C compiler wrapper is unable to compile a minimal HDF5 program.
CMake Warning at /opt/anaconda3/envs/warpx-dev/share/cmake-3.21/Modules/FindHDF5.cmake:742 (message):
  HDF5 found for language C is not parallel but previously found language is
  parallel.
Call Stack (most recent call first):
  build/_deps/fetchedopenpmd-src/CMakeLists.txt:192 (find_package)
-- Found 'adios_config': /opt/anaconda3/envs/warpx-dev/bin/adios_config
-- ADIOS linker flags (unparsed): -L/opt/anaconda3/envs/warpx-dev/lib -ladios -L/opt/anaconda3/envs/warpx-dev/lib64 -L/opt/anaconda3/envs/warpx-dev/lib64 -L/opt/anaconda3/envs/warpx-dev/lib -lz -lbz2 -lblosc -Wl,-pie -Wl,-headerpad_max_install_names -Wl,-dead_strip_dylibs -Wl,-rpath,/opt/anaconda3/envs/warpx-dev/lib -L/opt/anaconda3/envs/warpx-dev/lib -Wl,-rpath,/opt/anaconda3/envs/warpx-dev/lib
-- ADIOS compiler flags (unparsed): -I/opt/anaconda3/envs/warpx-dev/include -DZLIB -I/opt/anaconda3/envs/warpx-dev/include -DBZIP2 -I/opt/anaconda3/envs/warpx-dev/include -DBLOSC -I/opt/anaconda3/envs/warpx-dev/include -I/opt/anaconda3/envs/warpx-dev/include
-- ADIOS DIRS to look for libs: /opt/anaconda3/envs/warpx-dev/lib;/opt/anaconda3/envs/warpx-dev/lib64;/opt/anaconda3/envs/warpx-dev/lib64;/opt/anaconda3/envs/warpx-dev/lib;/opt/anaconda3/envs/warpx-dev/lib
-- Found adios in /opt/anaconda3/envs/warpx-dev/lib/libadios.a
-- Found z in /opt/anaconda3/envs/warpx-dev/lib/libz.dylib
-- Found bz2 in /opt/anaconda3/envs/warpx-dev/lib/libbz2.dylib
-- Found blosc in /opt/anaconda3/envs/warpx-dev/lib/libblosc.dylib
-- ADIOS compile definitions: -DZLIB -DBZIP2 -DBLOSC
-- Found MPI: TRUE (found version "3.1")  
-- <variant> supported (C++17 or newer): TRUE
openPMD build configuration:
  library Version: 0.14.3
  openPMD Standard: 1.1.0
  C++ Compiler: Clang 11.1.0 
    /opt/anaconda3/envs/warpx-dev/bin/x86_64-apple-darwin13.4.0-clang++

  Installation: OFF

  Build Type: Debug
  Library: static
  CLI Tools: OFF
  Examples: OFF
  Testing: OFF
  Invasive Tests: OFF
  Internal VERIFY: ON
  Build Options:
    MPI: ON
    HDF5: ON
    ADIOS1: ON
    ADIOS2: ON
    PYTHON: OFF


WarpX build configuration:
  Version: 22.02 (22.02-3-ge7c7d3f2bb85)
  C++ Compiler: Clang 11.1.0 
    /opt/anaconda3/envs/warpx-dev/bin/x86_64-apple-darwin13.4.0-clang++

  Installation prefix: /usr/local
        bin: bin
        lib: lib
    include: include
      cmake: lib/cmake/WarpX

  Build type: Debug
  Build options:
    APP: ON
    ASCENT: OFF
    COMPUTE: OMP
    DIMS: RZ
    Embedded Boundary: OFF
    GPU clock timers: OFF
    IPO/LTO: OFF
    LIB: OFF
    MPI: ON
    PSATD: OFF
    PRECISION: DOUBLE
    OPENPMD: ON
    QED: ON
    QED table generation: OFF
    SENSEI: OFF

-- Configuring done
-- Generating done
-- Build files have been written to: <WarpX root>/WarpX/build

Output:

MPI initialized with 1 MPI processes
MPI initialized with thread support level 3
OMP initialized with 16 OMP threads
AMReX (22.02) initialized
WarpX (22.02-3-ge7c7d3f2bb85)
PICSAR (15651b072cd9)
Level 0: dt = 4.112304655e-16 ; dx = 4.6875e-07 ; dz = 1.328125e-07

Grids Summary:
  Level 0   8 grids  32768 cells  100 % of domain
            smallest grid: 64 x 64  biggest grid: 64 x 64

  Writing plotfile diags/diag100000

STEP 1 starts ...
...

STEP 10 starts ...
  Writing plotfile diags/diag100010
STEP 10 ends. TIME = 4.112304655e-15 DT = 4.112304655e-16
Evolve time = 0.562361577 s; This step = 0.160967616 s; Avg. per step = 0.0562361577 s

**** WARNINGS ******************************************************************
* GLOBAL warning list  after  [ THE END ]
*
* No recorded warnings.
********************************************************************************

Total Time                     : 0.75111822
[m-krasny05:49202] *** Process received signal ***
[m-krasny05:49202] Signal: Segmentation fault: 11 (11)
[m-krasny05:49202] Signal code: Address not mapped (1)
[m-krasny05:49202] Failing at address: 0x1
[m-krasny05:49202] [ 0] 0   libsystem_platform.dylib            0x00007ff80d1d2e2d _sigtramp + 29
[m-krasny05:49202] [ 1] 0   ???                                 0x0000000000000002 0x0 + 2
[m-krasny05:49202] [ 2] 0   libopenPMD.ADIOS1.Serial.dylib      0x000000011030e062 MPI_Allreduce + 114
[m-krasny05:49202] [ 3] 0   warpx.RZ.MPI.OMP.DP.OPMD.QED.DEBUG  0x000000010f661bdc _ZN5amrex18ParallelDescriptor13ReduceBoolAndERb + 76
[m-krasny05:49202] [ 4] 0   warpx.RZ.MPI.OMP.DP.OPMD.QED.DEBUG  0x000000010f71b430 _ZN5amrex12TinyProfiler8FinalizeEb + 160
[m-krasny05:49202] [ 5] 0   warpx.RZ.MPI.OMP.DP.OPMD.QED.DEBUG  0x000000010f625b18 _ZN5amrex8FinalizeEPNS_5AMReXE + 40
[m-krasny05:49202] [ 6] 0   warpx.RZ.MPI.OMP.DP.OPMD.QED.DEBUG  0x000000010f2def42 main + 498
[m-krasny05:49202] [ 7] 0   dyld                                0x000000011dfbc4fe start + 462
[m-krasny05:49202] *** End of error message ***
zsh: segmentation fault  ./warpx ../../Examples/Physics_applications/laser_acceleration/inputs_rz  = 1

Expected behavior
If ADIOS1 is explicitly disabled at configure time,

cmake -S . -B build -DWarpX_DIMS=RZ -DCMAKE_BUILD_TYPE=Debug -DopenPMD_USE_ADIOS1=OFF

then the same run finishes without the segfault.
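
To confirm that the option actually took effect in an existing build directory, the value can be read back from CMakeCache.txt. This is a minimal sketch, assuming the usual NAME:TYPE=VALUE cache format; the helper name cmake_cache_value is made up for illustration and is not part of WarpX or openPMD-api.

```shell
# Hypothetical helper: print the cached value of a CMake option.
# Usage: cmake_cache_value <build-dir> <option-name>
cmake_cache_value() {
    # CMakeCache.txt stores entries as NAME:TYPE=VALUE; strip the
    # "NAME:TYPE=" prefix and print only the value.
    sed -n "s/^$2:[A-Z]*=//p" "$1/CMakeCache.txt"
}

# Example, run from the WarpX source directory after configuring:
# cmake_cache_value build openPMD_USE_ADIOS1
```

If the printed value is not OFF, the build directory is probably still using an older configuration and may need to be reconfigured or removed.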

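Note that HDF5 and ADIOS2 are also found, and this run only writes plotfiles, so this looks like an MPI shutdown issue triggered by the ADIOS1 backend even though ADIOS1 is never used. In the backtrace, the MPI_Allreduce call made from amrex::TinyProfiler::Finalize resolves into libopenPMD.ADIOS1.Serial.dylib rather than into the MPI library.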
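
A quick way to spot this kind of symbol mix-up in a saved backtrace is to list every frame whose symbol looks like an MPI call but whose image is not the MPI library. The sketch below assumes the frame layout shown in the backtrace above (image name, address, symbol) and assumes the real MPI library's file name starts with libmpi; the function name foreign_mpi_frames is hypothetical.

```shell
# Hypothetical helper: read a backtrace on stdin and print
# "<image> <symbol>" for MPI_* / PMPI_* frames that do not come
# from an image whose name starts with "libmpi".
foreign_mpi_frames() {
    awk '
    {
        # The first hex address separates the image name (field before)
        # from the symbol name (field after).
        for (i = 2; i < NF; i++) {
            if ($i ~ /^0x[0-9a-fA-F]+$/) {
                image = $(i - 1)
                symbol = $(i + 1)
                if (symbol ~ /^P?MPI_/ && image !~ /^libmpi/) {
                    print image, symbol
                }
                break
            }
        }
    }'
}

# Demo with two frames copied from the backtrace in this issue.
foreign_mpi_frames <<'EOF'
[m-krasny05:49202] [ 2] 0   libopenPMD.ADIOS1.Serial.dylib      0x000000011030e062 MPI_Allreduce + 114
[m-krasny05:49202] [ 3] 0   warpx.RZ.MPI.OMP.DP.OPMD.QED.DEBUG  0x000000010f661bdc _ZN5amrex18ParallelDescriptor13ReduceBoolAndERb + 76
EOF
```

For the frames above, only the MPI_Allreduce frame in libopenPMD.ADIOS1.Serial.dylib is reported. The full crash output can be piped in the same way from a log file.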

Software Environment

  • version of openPMD-api: 0.14.3
  • installed openPMD-api via: conda-forge
  • operating system: macOS Monterey 12.1
  • machine: personal MacBook Pro
  • version of HDF5: not specified
  • version of ADIOS1: not specified
  • version of ADIOS2: not specified
  • name and version of MPI: not specified
