
ocean/planar/manufactured_solution/convergence_both/default failing on Chrysalis #303

@xylar

Description


Note: this may be a Polaris issue rather than something wrong in Omega.

The test is failing with:

Stack trace (most recent call first):
#0 0x00000000004521e2 at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#1 0x0000000000452dfa at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#2 0x000000000041d3c9 at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#3 0x00001555518cfd84 at /usr/lib64/libc.so.6
#4 0x000000000041d1dd at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
[the same five-frame stack trace is repeated by each of the other ranks]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 6 in communicator MPI_COMM_WORLD
with errorcode 3.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[identical MPI_ABORT messages from the remaining ranks omitted]
[chr-0249:1486658:0:1486658] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[the same segmentation-fault message from the other 10 processes, and the MPI_ABORT message from rank 0, omitted]
slurmstepd: error: *** STEP 993459.0 ON chr-0249 CANCELLED AT 2025-10-27T05:57:21 ***
$ cat omega.log 
[info] Omega using calendar type: No Leap
[error] [IO.cpp:343] PIO error while reading dimension TWO 
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable CellsOnCell
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable EdgesOnCell
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable VerticesOnCell
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable CellsOnEdge
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable EdgesOnEdge
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable VerticesOnEdge
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable CellsOnVertex
[error] [IO.cpp:787] IO::readArray: Error finding varid for variable EdgesOnVertex
[error] [IOStream.cpp:192] Cannot validate stream History: Field SshCellDefault has not been defined
[error] [IOStream.cpp:223] IOStream validateAll: stream History has invalid entries
[critical] [OceanInit.cpp:122] ocnInit: Error validating IO Streams
Stack trace (most recent call first):
#0 0x00000000004521e2 at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#1 0x0000000000452dfa at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#2 0x000000000041d3c9 at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
#3 0x00001555518cfd84 at /usr/lib64/libc.so.6
#4 0x000000000041d1dd at /gpfs/fs1/home/ac.xylar/e3sm_work/polaris/main/build_omega/build_chrysalis_intel/src/omega.exe
[critical] [Error.cpp:43] Omega aborting
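
The `Error finding varid` messages suggest the mesh file may not contain the connectivity variables under the names Omega is requesting. One possibility (an assumption, not confirmed here) is a naming-convention mismatch: MPAS-convention mesh files spell these variables in lowerCamelCase (`cellsOnCell`), while Omega is asking for UpperCamelCase (`CellsOnCell`). A minimal sketch of a check that distinguishes truly missing variables from case-only mismatches, given a list of variable names read from the mesh file:

```python
# Hypothetical diagnostic sketch: compare the variable names Omega reported
# missing (per the omega.log excerpt above) against the names actually in a
# mesh file, flagging names that differ only in case.

# Connectivity variables Omega failed to find, per the log above
EXPECTED = [
    "CellsOnCell", "EdgesOnCell", "VerticesOnCell",
    "CellsOnEdge", "EdgesOnEdge", "VerticesOnEdge",
    "CellsOnVertex", "EdgesOnVertex",
]

def check_mesh_variables(present):
    """Return (missing, case_mismatches) for the expected names.

    `present` is the list of variable names found in the mesh file.
    A case mismatch is an expected name absent as-is but present under a
    spelling that differs only in case.
    """
    present_set = set(present)
    lower_map = {name.lower(): name for name in present}
    missing, case_mismatches = [], []
    for name in EXPECTED:
        if name in present_set:
            continue
        if name.lower() in lower_map:
            case_mismatches.append((name, lower_map[name.lower()]))
        else:
            missing.append(name)
    return missing, case_mismatches

# Example input: the same variables as an MPAS-convention mesh would spell them
mpas_names = ["cellsOnCell", "edgesOnCell", "verticesOnCell", "cellsOnEdge",
              "edgesOnEdge", "verticesOnEdge", "cellsOnVertex", "edgesOnVertex"]
missing, mismatches = check_mesh_variables(mpas_names)
print(missing)     # []  -- nothing truly absent
print(mismatches)  # every expected name present, differing only in case
```

The actual variable names in the init file can be listed with `ncdump -h <file>` to see which convention it follows; the file path and the mismatch hypothesis above are both assumptions to be verified against the test directory below.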

See

/lcrc/group/e3sm/ac.xylar/polaris_0.9/chrysalis/test_20251027/omega_pr/case_outputs/ocean_planar_manufactured_solution_convergence_both_default.log

and test results in:

/lcrc/group/e3sm/ac.xylar/polaris_0.9/chrysalis/test_20251027/omega_pr/ocean/planar/manufactured_solution/default/forward/200km_300s

Labels: bug (Something isn't working)