When running 3D refinement (auto-refine) on several different sets of extracted particles, I get the error below: "corrupted size vs. prev_size". I searched the RELION GitHub issues for this error, but it appears to be unresolved. I suspected a memory problem, so I increased the memory allocation, but that did not resolve it. Has anyone seen this issue before?
Dataset details:
~17k particles
box size 288 pixels
pixel size 1.382 Å
particles have no symmetry
Submission script:
#!/bin/bash
#SBATCH --ntasks=5
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=239:00:00
#SBATCH --mem=200G
#SBATCH --partition=gpu
#SBATCH --gres=gpu:4
#SBATCH --export=NONE
cd $SLURM_SUBMIT_DIR
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
module load relion/5.0.1
unset SLURM_EXPORT_ENV
mkdir Refine3D/Oct2025_job003_bin1_tightermask
mpirun -n 5 relion_refine_mpi --o Refine3D/Oct2025_job003_bin1_tightermask/run --auto_refine --split_random_halves --gpu "" --ios Extract/Oct2025_02_bin1/optimisation_set.star --ref inputmodels/initialmap_bin1_box288.mrc --firstiter_cc --trust_ref_size --ini_high 15 --dont_combine_weights_via_disc --pool 3 --pad 2 --ctf --particle_diameter 260 --flatten_solvent --zero_mask --solvent_mask masks/Oct2025_bin2_tightest_bin1_box288_final.mrc --solvent_correct_fsc --oversampling 1 --healpix_order 3 --auto_local_healpix_order 3 --offset_range 3 --offset_step 2 --sym C1 --low_resol_join_halves 40 --norm --scale --j 1 --pipeline_control Refine3D/job015/
Error:
Usage: model_angelo -h
Lmod has detected the following error: The following module(s) are unknown:
"module"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "module"
Also make sure that all modulefiles written in TCL start with the string
#%Module
mkdir: cannot create directory 'Refine3D/Oct2025_job003_bin1_tightermask': File exists
[1760899660.085819] [gpu123:458887:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.207126] [gpu123:458888:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.265701] [gpu123:458889:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.278145] [gpu123:458891:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.295947] [gpu123:458890:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.439772] [gpu123:458890:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.439956] [gpu123:458887:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.439985] [gpu123:458888:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.439904] [gpu123:458891:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760899660.439956] [gpu123:458889:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
RELION version: 5.0-beta-3-commit-12cf15
Precision: BASE=double, CUDA-ACC=single
+ Follower 3 runs on host = gpu123
+ Follower 2 runs on host = gpu123
+ Follower 1 runs on host = gpu123
+ Follower 4 runs on host = gpu123
=== RELION MPI setup ===
+ Number of MPI processes = 5
+ Leader (0) runs on host = gpu123
==========================
uniqueHost gpu123 has 4 ranks.
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 1 mapped to device 0
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 2 mapped to device 1
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 3 mapped to device 2
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 4 mapped to device 3
Running CPU instructions in double precision.
Estimating initial noise spectra from at most 2860 particles
3.12/3.12 min ............................................................~~(,_,">
Auto-refine: Iteration= 1
Auto-refine: Resolution= 14.7413 (no gain for 0 iter)
Auto-refine: Changes in angles= 999 degrees; and in offsets= 999 Angstroms (no gain for 0 iter)
CurrentResolution= 14.7413 Angstroms, which requires orientationSampling of at least 6.42857 degrees for a particle of diameter 260 Angstroms
Oversampling= 0 NrHiddenVariableSamplingPoints= 2755
OrientationalSampling= 7.5 NrOrientations= 145
TranslationalSampling= 2.764 NrTranslations= 19
=============================
Oversampling= 1 NrHiddenVariableSamplingPoints= 176320
OrientationalSampling= 3.75 NrOrientations= 1160
TranslationalSampling= 1.382 NrTranslations= 152
=============================
Expectation iteration 1
52.70/52.70 min ............................................................~~(,_,">
Averaging half-reconstructions up to 40 Angstrom resolution to prevent diverging orientations ...
Note that only for higher resolutions the FSC-values are according to the gold-standard!
Calculating solvent-corrected gold-standard FSC ...
+ randomize phases beyond: 30.6166 Angstroms
Maximization...
000/??? sec ~~(,_,"> [oo]
corrupted size vs. prev_size
[gpu123:458889] *** Process received signal ***
[gpu123:458889] Signal: Aborted (6)
[gpu123:458889] Signal code: (-6)
[gpu123:458889] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x3c050)[0x152843a5a050]
[gpu123:458889] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x8aeec)[0x152843aa8eec]
[gpu123:458889] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x12)[0x152843a59fb2]
[gpu123:458889] [ 3] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x152843a44472]
[gpu123:458889] [ 4] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x7f42f)[0x152843a9d42f]
[gpu123:458889] [ 5] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x9486a)[0x152843ab286a]
[gpu123:458889] [ 6] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x9512e)[0x152843ab312e]
[gpu123:458889] [ 7] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x952b0)[0x152843ab32b0]
[gpu123:458889] [ 8] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x978d8)[0x152843ab58d8]
[gpu123:458889] [ 9] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x985ef)[0x152843ab65ef]
[gpu123:458889] [10] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x98ca5)[0x152843ab6ca5]
[gpu123:458889] [11] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_malloc_plain+0x15)[0x1528440250a5]
[gpu123:458889] [12] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x268c7)[0x1528440268c7]
[gpu123:458889] [13] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_solvtab_exec+0x30)[0x1528440296f0]
[gpu123:458889] [14] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_rdft_conf_standard+0x22)[0x15284406d7f2]
[gpu123:458889] [15] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_configure_planner+0x11)[0x152844116fd1]
[gpu123:458889] [16] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_the_planner+0x28)[0x15284411e338]
[gpu123:458889] [17] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkapiplan+0x2e)[0x152844116abe]
[gpu123:458889] [18] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_many_dft_r2c+0x143)[0x15284411dde3]
[gpu123:458889] [19] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_dft_r2c+0x25)[0x15284411d385]
[gpu123:458889] [20] relion_refine_mpi(_ZN18FourierTransformer7setRealER13MultidimArrayIdEb+0xc5)[0x55884e08a985]
[gpu123:458889] [21] relion_refine_mpi(_ZN13BackProjector11reconstructER13MultidimArrayIdEibRKS1_ddibP5ImageIdE+0x11c)[0x55884e04ce7c]
[gpu123:458889] [22] relion_refine_mpi(_ZN14MlOptimiserMpi12maximizationEv+0x1260)[0x55884dfac390]
[gpu123:458889] [23] relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x3a1)[0x55884dfadf71]
[gpu123:458889] [24] relion_refine_mpi(main+0x52)[0x55884df5af42]
[gpu123:458889] [25] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x2724a)[0x152843a4524a]
[gpu123:458889] [26] /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x152843a45305]
[gpu123:458889] [27] relion_refine_mpi(_start+0x21)[0x55884df5e951]
[gpu123:458889] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 458889 on node gpu123 exited on signal 6 (Aborted).
I then reran the job on a subset of just 400 particles. This time I got the following error:
Usage: model_angelo -h
Lmod has detected the following error: The following module(s) are unknown:
"module"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore_cache load "module"
Also make sure that all modulefiles written in TCL start with the string
#%Module
[1760904992.810881] [gpu288:1653125:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.858955] [gpu288:1653129:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.877990] [gpu288:1653126:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.908439] [gpu288:1653127:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.923536] [gpu288:1653128:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.949372] [gpu288:1653128:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.950052] [gpu288:1653126:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.949871] [gpu288:1653127:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.949968] [gpu288:1653125:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
[1760904992.951524] [gpu288:1653129:0] ucp_context.c:1849 UCX WARN UCP API version is incompatible: required >= 1.17, actual 1.13.1 (loaded from /usr/lib/x86_64-linux-gnu/libucp.so.0)
RELION version: 5.0-beta-3-commit-12cf15
Precision: BASE=double, CUDA-ACC=single
=== RELION MPI setup ===
+ Number of MPI processes = 5
+ Leader (0) runs on host = gpu288
+ Follower 1 runs on host = gpu288
+ Follower 2 runs on host = gpu288
+ Follower 3 runs on host = gpu288
+ Follower 4 runs on host = gpu288
==========================
uniqueHost gpu288 has 4 ranks.
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 1 mapped to device 0
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 2 mapped to device 1
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 3 mapped to device 2
GPU-ids not specified for this rank, threads will automatically be mapped to available devices.
Thread 0 on follower 4 mapped to device 3
Running CPU instructions in double precision.
Estimating initial noise spectra from at most 2860 particles
4/ 4 sec ............................................................~~(,_,">
Auto-refine: Iteration= 1
Auto-refine: Resolution= 14.7413 (no gain for 0 iter)
Auto-refine: Changes in angles= 999 degrees; and in offsets= 999 Angstroms (no gain for 0 iter)
CurrentResolution= 14.7413 Angstroms, which requires orientationSampling of at least 6.42857 degrees for a particle of diameter 260 Angstroms
Oversampling= 0 NrHiddenVariableSamplingPoints= 2945
OrientationalSampling= 7.5 NrOrientations= 155
TranslationalSampling= 2.764 NrTranslations= 19
=============================
Oversampling= 1 NrHiddenVariableSamplingPoints= 188480
OrientationalSampling= 3.75 NrOrientations= 1240
TranslationalSampling= 1.382 NrTranslations= 152
=============================
Expectation iteration 1
1.03/1.03 min ............................................................~~(,_,">
Averaging half-reconstructions up to 40 Angstrom resolution to prevent diverging orientations ...
Note that only for higher resolutions the FSC-values are according to the gold-standard!
[gpu288:1653127:0:1653127] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x157048e0cb73)
==== backtrace (tid:1653127) ====
0 /usr/lib/x86_64-linux-gnu/libucs.so.0(ucs_handle_error+0x2dc) [0x15004404fe9c]
1 /usr/lib/x86_64-linux-gnu/libucs.so.0(+0x2908c) [0x15004405008c]
2 /usr/lib/x86_64-linux-gnu/libucs.so.0(+0x2923a) [0x15004405023a]
3 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785b) [0x150048c2785b]
4 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27adc) [0x150048c27adc]
5 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkplan_d+0xf) [0x150048c281bf]
6 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x79269) [0x150048c79269]
7 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e) [0x150048c2785e]
8 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19) [0x150048c27f19]
9 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkplan_d+0xf) [0x150048c281bf]
10 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x791f2) [0x150048c791f2]
11 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e) [0x150048c2785e]
12 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19) [0x150048c27f19]
13 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkapiplan+0xfd) [0x150048d16b8d]
14 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_many_dft_r2c+0x143) [0x150048d1dde3]
15 /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_dft_r2c+0x25) [0x150048d1d385]
16 relion_refine_mpi(_ZN18FourierTransformer7setRealER13MultidimArrayIdEb+0xc5) [0x562d60fba985]
17 relion_refine_mpi(_ZN13BackProjector11reconstructER13MultidimArrayIdEibRKS1_ddibP5ImageIdE+0x11c) [0x562d60f7ce7c]
18 relion_refine_mpi(_ZN14MlOptimiserMpi58reconstructUnregularisedMapAndCalculateSolventCorrectedFSCEv+0x2866) [0x562d60ed9166]
19 relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x28d) [0x562d60edde5d]
20 relion_refine_mpi(main+0x52) [0x562d60e8af42]
21 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x15004864524a]
22 /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x150048645305]
23 relion_refine_mpi(_start+0x21) [0x562d60e8e951]
=================================
[gpu288:1653127] *** Process received signal ***
[gpu288:1653127] Signal: Segmentation fault (11)
[gpu288:1653127] Signal code: (-6)
[gpu288:1653127] Failing at address: 0xf57ae00193987
[gpu288:1653127] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x3c050)[0x15004865a050]
[gpu288:1653127] [ 1] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785b)[0x150048c2785b]
[gpu288:1653127] [ 2] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27adc)[0x150048c27adc]
[gpu288:1653127] [ 3] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkplan_d+0xf)[0x150048c281bf]
[gpu288:1653127] [ 4] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x79269)[0x150048c79269]
[gpu288:1653127] [ 5] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e)[0x150048c2785e]
[gpu288:1653127] [ 6] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19)[0x150048c27f19]
[gpu288:1653127] [ 7] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkplan_d+0xf)[0x150048c281bf]
[gpu288:1653127] [ 8] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x791f2)[0x150048c791f2]
[gpu288:1653127] [ 9] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e)[0x150048c2785e]
[gpu288:1653127] [10] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19)[0x150048c27f19]
[gpu288:1653127] [11] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkapiplan+0xfd)[0x150048d16b8d]
[gpu288:1653127] [12] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_many_dft_r2c+0x143)[0x150048d1dde3]
[gpu288:1653127] [13] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_dft_r2c+0x25)[0x150048d1d385]
[gpu288:1653127] [14] relion_refine_mpi(_ZN18FourierTransformer7setRealER13MultidimArrayIdEb+0xc5)[0x562d60fba985]
[gpu288:1653127] [15] relion_refine_mpi(_ZN13BackProjector11reconstructER13MultidimArrayIdEibRKS1_ddibP5ImageIdE+0x11c)[0x562d60f7ce7c]
[gpu288:1653127] [16] relion_refine_mpi(_ZN14MlOptimiserMpi58reconstructUnregularisedMapAndCalculateSolventCorrectedFSCEv+0x2866)[0x562d60ed9166]
[gpu288:1653127] [17] relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x28d)[0x562d60edde5d]
[gpu288:1653127] [18] relion_refine_mpi(main+0x52)[0x562d60e8af42]
[gpu288:1653127] [19] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x2724a)[0x15004864524a]
[gpu288:1653127] [20] /usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x150048645305]
[gpu288:1653127] [21] relion_refine_mpi(_start+0x21)[0x562d60e8e951]
[gpu288:1653127] *** End of error message ***
corrupted double-linked list (not small)
[gpu288:1653126] *** Process received signal ***
[gpu288:1653126] Signal: Aborted (6)
[gpu288:1653126] Signal code: (-6)
[gpu288:1653126] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x3c050)[0x14a4cb05a050]
[gpu288:1653126] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x8aeec)[0x14a4cb0a8eec]
[gpu288:1653126] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x12)[0x14a4cb059fb2]
[gpu288:1653126] [ 3] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x14a4cb044472]
[gpu288:1653126] [ 4] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x7f42f)[0x14a4cb09d42f]
[gpu288:1653126] [ 5] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x9486a)[0x14a4cb0b286a]
[gpu288:1653126] [ 6] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x9513a)[0x14a4cb0b313a]
[gpu288:1653126] [ 7] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x97e8d)[0x14a4cb0b5e8d]
[gpu288:1653126] [ 8] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x985ef)[0x14a4cb0b65ef]
[gpu288:1653126] [ 9] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x98ca5)[0x14a4cb0b6ca5]
[gpu288:1653126] [10] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_malloc_plain+0x15)[0x14a4cb6250a5]
[gpu288:1653126] [11] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mktensor+0x3a)[0x14a4cb6297aa]
[gpu288:1653126] [12] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_tensor_compress+0x49)[0x14a4cb62a389]
[gpu288:1653126] [13] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkproblem_rdft2+0x128)[0x14a4cb67a498]
[gpu288:1653126] [14] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkproblem_rdft2_d+0x18)[0x14a4cb67a4f8]
[gpu288:1653126] [15] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x791e7)[0x14a4cb6791e7]
[gpu288:1653126] [16] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e)[0x14a4cb62785e]
[gpu288:1653126] [17] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19)[0x14a4cb627f19]
[gpu288:1653126] [18] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkplan_d+0xf)[0x14a4cb6281bf]
[gpu288:1653126] [19] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x791f2)[0x14a4cb6791f2]
[gpu288:1653126] [20] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x2785e)[0x14a4cb62785e]
[gpu288:1653126] [21] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(+0x27f19)[0x14a4cb627f19]
[gpu288:1653126] [22] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_mkapiplan+0xfd)[0x14a4cb716b8d]
[gpu288:1653126] [23] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_many_dft_r2c+0x143)[0x14a4cb71dde3]
[gpu288:1653126] [24] /usr/lib/x86_64-linux-gnu/libfftw3.so.3(fftw_plan_dft_r2c+0x25)[0x14a4cb71d385]
[gpu288:1653126] [25] relion_refine_mpi(_ZN18FourierTransformer7setRealER13MultidimArrayIdEb+0xc5)[0x56200101b985]
[gpu288:1653126] [26] relion_refine_mpi(_ZN13BackProjector11reconstructER13MultidimArrayIdEibRKS1_ddibP5ImageIdE+0x11c)[0x562000fdde7c]
[gpu288:1653126] [27] relion_refine_mpi(_ZN14MlOptimiserMpi58reconstructUnregularisedMapAndCalculateSolventCorrectedFSCEv+0x2866)[0x562000f3a166]
[gpu288:1653126] [28] relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0x28d)[0x562000f3ee5d]
[gpu288:1653126] [29] relion_refine_mpi(main+0x52)[0x562000eebf42]
[gpu288:1653126] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 1653127 on node gpu288 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------