Hi,
I get crashes with large 3D simulations on LUMI. The crash occurs in an allgather routine during MPI initialization (PMI_Allgather, called from PMPI_Init_thread):
MPICH ERROR [Rank 0] [job id 4292261.0] [Thu Aug 3 00:01:58 2023] [nid006593] - Abort(1616271) (rank 0 in comm 0): Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(170).......:
MPID_Init(501)..............:
MPIDI_OFI_mpi_init_hook(805):
MPIDU_bc_table_create(204)..: PMI_Allgather failed: -1
The crash happens before WarpX starts and does not produce any traces.
Here are the error output of the simulation and the submit file: warpx-4292261.txt batch.txt
Here are the modules used for the compilation: Recipe_warpx.txt