Segfault in get_centerCoords_from_ESMFMesh_file with large mesh #485

@angus-g

Description

I have a mesh with 15863040 elements in 2 dimensions, and am creating a Mesh object from ESMPy:

    model_mesh = esmpy.Mesh(
        filename=mesh_filename,
        filetype=esmpy.FileFormat.ESMFMESH,
    )

This ends up causing a segfault in ESMF at line 1028 here:

    PIO_Offset cc_offsets[num_elems*coordDim];
    for (int i=0,pos=0; i<num_elems; i++) {
      int elem_start_ind=(elem_ids[i]-1)*coordDim+1;
      for (int j=0; j<coordDim; j++) {
        cc_offsets[pos] = (PIO_Offset) (elem_start_ind+j);
        pos++;
      }
    }

The issue is that cc_offsets is a variable-length array allocated on the stack, and for a mesh of this size it is far larger than a typical stack limit. I expect that reserving the array simply moves the stack pointer past the end of the usable stack, so that as soon as we try to access even element 0 of cc_offsets, we hit a segmentation violation.
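For a sense of scale, here is a rough sketch of the arithmetic. It assumes PIO_Offset is 8 bytes wide (not verified for every build) and that all elements are owned by a single process, as in a serial ESMPy run. The helper names offsets_bytes and exceeds_stack_limit are hypothetical and are not part of ESMF or PIO; the stack-limit check uses POSIX getrlimit.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/resource.h>

// Bytes needed for an offsets array with num_elems * coord_dim entries.
// elem_size stands in for sizeof(PIO_Offset), assumed here to be 8 bytes.
std::size_t offsets_bytes(std::size_t num_elems, std::size_t coord_dim,
                          std::size_t elem_size = sizeof(std::int64_t)) {
  return num_elems * coord_dim * elem_size;
}

// True if `bytes` is larger than the current soft stack limit (POSIX only).
// Returns false when the limit cannot be queried or is unlimited.
bool exceeds_stack_limit(std::size_t bytes) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_STACK, &lim) != 0) return false;
  if (lim.rlim_cur == RLIM_INFINITY) return false;
  return bytes > static_cast<std::size_t>(lim.rlim_cur);
}
```

With 15863040 elements and coordDim = 2, offsets_bytes gives 253808640 bytes, or about 242 MiB. That is well above the 8 MiB default stack limit commonly found on Linux systems.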

I can confirm that with the following small patch to allocate this array on the heap, I'm able to create the Mesh and proceed with a regridding operation as expected.

diff --git a/src/Infrastructure/Mesh/src/ESMCI_ESMFMesh_Util.C b/src/Infrastructure/Mesh/src/ESMCI_ESMFMesh_Util.C
index 07102466ab..63271efe6f 100644
--- a/src/Infrastructure/Mesh/src/ESMCI_ESMFMesh_Util.C
+++ b/src/Infrastructure/Mesh/src/ESMCI_ESMFMesh_Util.C
@@ -1021,7 +1021,7 @@ void get_centerCoords_from_ESMFMesh_file(int pioSystemDesc, int pioFileDesc, cha
   if (piorc == PIO_NOERR) {

     // Define offsets for centerCoords decomp
-    PIO_Offset cc_offsets[num_elems*coordDim];
+    PIO_Offset *cc_offsets = new PIO_Offset[num_elems * coordDim];
     for (int i=0,pos=0; i<num_elems; i++) {
       int elem_start_ind=(elem_ids[i]-1)*coordDim+1;
       for (int j=0; j<coordDim; j++) {
@@ -1035,6 +1035,7 @@ void get_centerCoords_from_ESMFMesh_file(int pioSystemDesc, int pioFileDesc, cha
     int cc_gdimlen2D[2]={(int)elementCount,(int)coordDim};
     piorc = PIOc_InitDecomp_ReadOnly(pioSystemDesc, PIO_DOUBLE, 2, cc_gdimlen2D, num_elems*coordDim, cc_offsets, &cc_iodesc,
                             &rearr, NULL, NULL);
+    delete [] cc_offsets;
     if (!CHECKPIOERROR(piorc, std::string("Error initializing PIO decomp for centerCoords ") + filename,
                        ESMF_RC_FILE_OPEN, localrc)) throw localrc;;
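A possible alternative to the new/delete pair is a std::vector, which frees the buffer automatically on every exit path, including a throw. This is only a sketch: Offset is a stand-in for PIO_Offset (assumed to be a 64-bit integer), and build_cc_offsets is a hypothetical helper, not an existing ESMF function. It also does the index arithmetic in 64 bits rather than int, which is a change from the original loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Offset = std::int64_t;  // stand-in for PIO_Offset (assumption)

// Build the 1-based centerCoords offsets on the heap.
// For each element id, emits coordDim consecutive offsets starting at
// (id - 1) * coordDim + 1, matching the loop in the original code.
std::vector<Offset> build_cc_offsets(const std::vector<int>& elem_ids,
                                     int coordDim) {
  std::vector<Offset> cc_offsets;
  cc_offsets.reserve(elem_ids.size() * static_cast<std::size_t>(coordDim));
  for (int id : elem_ids) {
    // 64-bit arithmetic so the product cannot overflow int
    Offset elem_start_ind = (static_cast<Offset>(id) - 1) * coordDim + 1;
    for (int j = 0; j < coordDim; j++) {
      cc_offsets.push_back(elem_start_ind + j);
    }
  }
  return cc_offsets;
}
```

In the real function, the vector's data() pointer and size() would presumably be passed to PIOc_InitDecomp_ReadOnly in place of the raw array and num_elems*coordDim; I have not tested that variant against ESMF.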
