[Feature]: Investigate regridding implementation to ensure compatibility with MPI #988

@tomvothecoder

Is your feature request related to a problem?

We've run into issues in the past where the MPI build of esmf on conda-forge is incompatible with the MPI installed on HPC systems (e.g., LCRC), resulting in MPI initialization errors.

The workaround in E3SM Unified is to constrain esmf to the nompi build. The drawback is that users cannot use ESMF with MPI for faster regridding, and every other package in the environment is limited to its non-MPI build as well.
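
To check which variant an environment actually has, one option is to look at the build string reported by `conda list esmf`. The sketch below assumes conda-forge build strings start with `nompi_` or `mpi_<flavor>_`; that naming pattern, the helper name `esmf_mpi_flavor`, and the example build strings are assumptions for illustration, not something this issue confirms.

```python
def esmf_mpi_flavor(build_string):
    """Classify an esmf build string as 'nompi', an MPI flavor, or 'unknown'.

    Assumes build strings of the form 'nompi_<hash>_<n>' or
    'mpi_<flavor>_<hash>_<n>' (an assumption about conda-forge naming).
    """
    parts = build_string.split("_")
    if parts[0] == "nompi":
        return "nompi"
    if parts[0] == "mpi" and len(parts) > 1:
        return parts[1]  # e.g. 'mpich' or 'openmpi'
    return "unknown"


# Made-up build strings, shaped like the "Build" column of `conda list esmf`.
print(esmf_mpi_flavor("nompi_h0000000_0"))
print(esmf_mpi_flavor("mpi_mpich_h0000000_0"))
```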

Related issues/PRs

Describe the solution you'd like

Here's a possible solution:

  1. Use subprocess to generate regridding weights with ESMF (e.g., ESMF_RegridWeightGen) -- This might avoid the MPI init issue
import subprocess

subprocess.run([
    "ESMF_RegridWeightGen",
    "--source", "source_grid.nc",
    "--destination", "target_grid.nc",
    "--weight", "weights_file.nc",
    "--method", "bilinear",
    "--ignore_unmapped"
])
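  2. Use xESMF to apply the pre-computed weights.
# method is a required argument; it should match the method used to generate the weights
regridder = xesmf.Regridder(ds, ds_out, "bilinear", weights="weights_file.nc", reuse_weights=True)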
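
Step 1 could be wrapped with basic error handling along these lines. This is a minimal sketch, assuming `ESMF_RegridWeightGen` is on `PATH`; the helper names `build_weightgen_cmd` and `generate_weights` are hypothetical, and the flags are the ones used in the call above.

```python
import shutil
import subprocess


def build_weightgen_cmd(source, destination, weights, method="bilinear",
                        ignore_unmapped=True):
    """Return the ESMF_RegridWeightGen argument list (hypothetical helper)."""
    cmd = [
        "ESMF_RegridWeightGen",
        "--source", str(source),
        "--destination", str(destination),
        "--weight", str(weights),
        "--method", method,
    ]
    if ignore_unmapped:
        cmd.append("--ignore_unmapped")
    return cmd


def generate_weights(source, destination, weights, **kwargs):
    """Run ESMF_RegridWeightGen in a child process and fail loudly on errors."""
    cmd = build_weightgen_cmd(source, destination, weights, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status,
    # so a failed run is not silently followed by a stale weights file.
    subprocess.run(cmd, check=True)
    return weights
```

Calling `generate_weights("source_grid.nc", "target_grid.nc", "weights_file.nc")` would then produce the file that step 2 passes to xESMF.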

Describe alternatives you've considered

The short-term workaround is to constrain esmf to nompi.

Additional context

Should I use subprocess instead of xESMF (ChatGPT)?

Whether you should use a subprocess call to ESMF directly instead of xESMF depends on your use case. Here's a breakdown of the trade-offs and guidance:


✅ When to Use xESMF

xESMF is a high-level, Pythonic interface built on top of ESMF through its Python bindings (ESMPy), and it is the preferred choice in most workflows, especially when:

  • You're working with NetCDF, xarray, or climate/scientific data in Python.

  • You want to integrate regridding into a Python pipeline (e.g., with Dask, Pandas, SciPy).

  • You don’t require very fine-grained control over parallelism (or you’re okay with serial/nompi execution).

  • Your deployment environment is compatible with Python+MPI, or you’re fine using esmf-nompi.

Advantages:

  • Simple API (xesmf.Regridder)

  • Automatically generates weights and applies them.

  • Good for small- to medium-scale workflows.


🛠 When to Use ESMF Directly via subprocess

You might want to invoke ESMF_RegridWeightGen or ESMF_Regrid directly via subprocess when:

  1. You want full control over the ESMF command-line interface or output formats.

  2. You’re working in an environment where Python MPI support is broken, but native mpirun + ESMF binaries work.

  3. You need to process very large grids in parallel, and xESMF + esmf-nompi is too slow or runs out of memory.

  4. You want to precompute and store regridding weights for reuse across pipelines or applications.

  5. You want to avoid Conda or Python environment issues altogether.

Example subprocess call:

import subprocess

subprocess.run([
    "ESMF_RegridWeightGen",
    "--source", "source_grid.nc",
    "--destination", "target_grid.nc",
    "--weight", "weights_file.nc",
    "--method", "bilinear",
    "--ignore_unmapped"
])

Then apply the weights manually in Python, NCO, or another tool.
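
Applying the weights manually amounts to a sparse matrix-vector product. The sketch below is in pure Python and assumes the weight file's `row`, `col`, and `S` variables (1-based destination index, 1-based source index, and weight) have already been read into sequences, and that the source field is flattened in the same cell order ESMF used. `apply_weights` is a hypothetical helper; a real workflow would more likely use a sparse-matrix library.

```python
def apply_weights(row, col, S, src_values, n_dst):
    """Apply ESMF-style sparse weights to a flattened source field.

    row and col hold 1-based destination and source indices (Fortran
    convention), so both are shifted by one before indexing.
    """
    dst = [0.0] * n_dst
    for r, c, w in zip(row, col, S):
        dst[r - 1] += w * src_values[c - 1]
    return dst


# Toy example: 3 source cells mapped onto 2 destination cells.
row = [1, 1, 2, 2]
col = [1, 2, 2, 3]
S = [0.5, 0.5, 0.25, 0.75]
src = [10.0, 20.0, 40.0]

print(apply_weights(row, col, S, src, n_dst=2))  # → [15.0, 35.0]
```

Destination cells with no entries in `row` stay at 0.0 in this sketch, so unmapped points would need separate masking.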


🔄 Hybrid Strategy (Best of Both Worlds)

A great approach can be:

  • Use subprocess to generate regridding weights with MPI-enabled ESMF_RegridWeightGen, using mpirun or srun as appropriate.

  • Use xESMF in Python to apply those precomputed weights using:

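# method is a required argument; it should match the method used to generate the weights
regridder = xesmf.Regridder(ds, ds_out, "bilinear", weights="weights_file.nc", reuse_weights=True)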

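This may avoid the MPI_Init issues while still letting you stay in Python for the rest of the workflow. Because xESMF still imports ESMPy in the Python process, the approach should be verified on the target HPC system.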
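
The launcher part of the hybrid strategy might look like the following. This is a sketch only: `with_launcher` is a hypothetical helper, the task counts are arbitrary, and the right launcher options depend on the site's MPI and scheduler setup.

```python
import subprocess


def with_launcher(cmd, launcher="mpirun", ntasks=4):
    """Prefix a command with an MPI launcher (hypothetical helper)."""
    if launcher == "mpirun":
        return ["mpirun", "-np", str(ntasks), *cmd]
    if launcher == "srun":
        return ["srun", "-n", str(ntasks), *cmd]
    raise ValueError(f"unsupported launcher: {launcher!r}")


weightgen = [
    "ESMF_RegridWeightGen",
    "--source", "source_grid.nc",
    "--destination", "target_grid.nc",
    "--weight", "weights_file.nc",
    "--method", "bilinear",
    "--ignore_unmapped",
]

cmd = with_launcher(weightgen, launcher="srun", ntasks=8)
print(" ".join(cmd))

# On a system with an MPI-enabled ESMF and Slurm available:
# subprocess.run(cmd, check=True)
```

Keeping the launcher separate from the ESMF arguments lets the same weight-generation command run serially, under `mpirun`, or under `srun` without other changes.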


✅ Summary Recommendation

| Use Case | Recommended Tool |
| --- | --- |
| Small/medium workflows, simple Python use | xESMF with esmf-nompi |
| Broken MPI in Python or large jobs on HPC | subprocess with ESMF_RegridWeightGen |
| Precomputed weights for speed or reproducibility | Hybrid (generate via subprocess, apply with xESMF) |
| End-to-end Python workflow without Slurm | xESMF is easier and faster |
