Skip to content

Conversation

@tomvothecoder
Copy link
Collaborator

@tomvothecoder tomvothecoder commented May 13, 2025

Description

This PR includes debugging scripts for validation xESMF regridding results against CDAT's regrid2. It constrains xcdat >=0.10.0.

Opened this issue in cf-xarray to get it fixed: xarray-contrib/cf-xarray#576

Just noting that we found a bug with xESMF, which is rooted in cf-xarray: pangeo-data/xESMF#433

We need to wait for a fix in cf-xarray to propagate up to xESMF. I just pinged a developer to see if they need help with a PR so we can get this fix into e3sm_diags.

Both of these GitHub issues are now resolved. Now we just need to confirm the fix propagates to e3sm_diags with cf_xarray=0.10.7 (through xcdat=0.10.0), done here.

EDIT: The maintainer dropped support for Python 3.10 in cf_xarray=0.10.7. This means only Python >=3.11 will get this fix.
We need to keep Python 3.10 support for E3SM Unified until we phase it out.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have added tests that prove my fix is effective or that my feature works
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@tomvothecoder tomvothecoder marked this pull request as ready for review August 13, 2025 16:53
@tomvothecoder tomvothecoder marked this pull request as draft August 13, 2025 16:53
@tomvothecoder tomvothecoder self-assigned this Aug 27, 2025
@tomvothecoder tomvothecoder changed the title Investigate xESMF diff with bounds Validate xESMF regridding results with non-monotonic bounds Aug 27, 2025
@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Aug 27, 2025

Test 1 -- Test script (xCDAT + xESMF with cf_xarray v0.10.7)

Issue - Statistics were different for non-monotonic bounds (first two rows)

Result - Issue is now resolved ✅

Statistical Comparison:
                              Min        Max      Mean           Sum       Std
asc lat, desc lat_bnds   0.164364  10.046874  1.398573  20139.449219  1.039437
desc lat, asc lat_bnds   0.164364  10.046874  1.398573  20139.449219  1.039437
asc lat, asc lat_bnds    0.164364  10.046874  1.398573  20139.449219  1.039437
desc lat, desc lat_bnds  0.164364  10.046874  1.398573  20139.449219  1.039437
desc lat, no lat_bnds    0.164364  10.046874  1.398573  20139.449219  1.039437
asc lat, no lat_bnds     0.164364  10.046874  1.398573  20139.449219  1.039437
Click to expand Python script
import numpy as np
import pandas as pd
import xarray as xr
import xcdat as xc
import xesmf as xe


def print_stats(*arrays, labels=None):
    if labels is None:
        labels = [f"Array {i + 1}" for i in range(len(arrays))]
    elif len(labels) != len(arrays):
        raise ValueError("Number of labels must match the number of arrays.")

    stats = {
        "Min": [np.min(arr) for arr in arrays],
        "Max": [np.max(arr) for arr in arrays],
        "Mean": [np.mean(arr) for arr in arrays],
        "Sum": [np.sum(arr) for arr in arrays],
        "Std": [np.std(arr) for arr in arrays],
    }

    df = pd.DataFrame(stats, index=labels)
    print("\nStatistical Comparison:")
    print(df)


# Load datasets
ds_a = xr.open_dataset(
    "/lcrc/group/e3sm/public_html/cdat-migration-fy24/25-02-14-branch-930-polar-after/polar/GPCP_v3.2/test.nc"
)
ds_b = xr.open_dataset(
    "/lcrc/group/e3sm/public_html/cdat-migration-fy24/25-02-14-branch-930-polar-after/polar/GPCP_v3.2/ref.nc"
)

# Regridding tests
output_grid_xesmf = ds_a.regridder.grid

# 1. Asc lat, Desc lat_bnds
# unaligned1 = regrid_with_xesmf(ds_b, output_grid_xesmf)
ds_b1 = ds_b.copy(deep=True)
unaligned1 = ds_b1.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

# 2. Desc lat, Asc lat_bnds
ds_b2 = ds_b.copy(deep=True).sortby("lat", ascending=False)
ds_b2["lat_bnds"].values = np.sort(ds_b2["lat_bnds"], axis=-1)
unaligned2 = ds_b2.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

# 3. Asc lat, Asc lat_bnds
ds_b3 = ds_b.copy(deep=True)
ds_b3["lat_bnds"].values = np.sort(ds_b3["lat_bnds"], axis=-1)
# aligned1 = regrid_with_xesmf(ds_b3, output_grid_xesmf)
aligned1 = ds_b3.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

# 4. Desc lat, Desc lat_bnds
ds_b4 = ds_b.copy(deep=True).sortby("lat", ascending=False)
ds_b4["lat_bnds"].values = np.sort(ds_b4["lat_bnds"], axis=-1)[:, ::-1]
# aligned2 = regrid_with_xesmf(ds_b4, output_grid_xesmf)
aligned2 = ds_b4.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

# 5. Desc lat, no lat_bnds
ds_b5 = ds_b.copy(deep=True).sortby("lat", ascending=False)
ds_b5 = ds_b5.drop("lat_bnds")
nobounds1 = ds_b5.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

# 6. Asc lat, no lat_bnds
ds_b6 = ds_b.copy(deep=True).sortby("lat", ascending=True)
ds_b6 = ds_b6.drop("lat_bnds")
nobounds2 = ds_b6.regridder.horizontal(
    "PRECT", output_grid_xesmf, tool="xesmf", method="conservative_normed"
)

print_stats(
    unaligned1.PRECT.values,
    unaligned2.PRECT.values,
    aligned1.PRECT.values,
    aligned2.PRECT.values,
    nobounds1.PRECT.values,
    nobounds2.PRECT.values,
    labels=[
        "asc lat, desc lat_bnds",
        "desc lat, asc lat_bnds",
        "asc lat, asc lat_bnds",
        "desc lat, desc lat_bnds",
        "desc lat, no lat_bnds",
        "asc lat, no lat_bnds",
    ],
)

Test 2 -- Polar plot comparison with e3sm_diags (PRECT)

Issue - Abnormal data around the latitude edges of plot and different statics with e3sm_diags v3.0.0 + xESMF regridding

Result - Issue is now resolved ✅

With the latest dev environment on this branch, my comparison notebook shows that the model-obs plot for PRECT ANN polar_N is now fixed.

For convenience, I've pasted the plots below. Notice middle column v3.0.0 (before fix) has a different model-observation plot (third plot). There is no more abnormal data around the latitude edges for the first and third columns.

v2.12.1 (baseline) v3.0.0 (before fix) v3.0.0 (post-fix, this branch)
v2.12.1 (baseline) v3.0.0 (before fix) v3.0.0 (post-fix, this branch)

@tomvothecoder tomvothecoder marked this pull request as ready for review August 27, 2025 22:16
Copy link
Collaborator Author

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @chengzhuzhang, this PR is now ready for review. You'll probably want to do additional testing on top of my tests to ensure all polar plots are now fixed with the latest version of xcdat + xesmf + cf-xarray. Please refer to my comment above for more info.

@chengzhuzhang
Copy link
Contributor

@tomvothecoder the PR looks good to me. I'm pleased so see your contribution to cf_axarray that eventually resolved this problem. PS, this should also fix the issue #1004 with update xcdat to 0.10.0

@chengzhuzhang
Copy link
Contributor

The integration test failed due to an IndexError stems from cf_xarray/helpers.py.

@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Sep 8, 2025

The issue from the integration test (below) is related to a bug the cf-xarray 0.10.7 and was addressed here xarray-contrib/cf-xarray#592 (me and the PR reviewer forgot the singleton case in my cf-xarray PR). cf-xarray 0.10.8 was released 4 days ago, and we'll need to wait for another cf-xarray release to make sure it is fixed. It should happen in the next week. If not, I will ping Deepak to get a timeline.

ERROR    e3sm_diags.parameter.core_parameter:core_parameter.py:353 Error in e3sm_diags.driver.meridional_mean_2d_driver
Traceback (most recent call last):
  File "/__w/e3sm_diags/e3sm_diags/e3sm_diags/parameter/core_parameter.py", line 350, in _run_diag
    single_result = module.run_diag(self)
                    ^^^^^^^^^^^^^^^^^^^^^
  File "/__w/e3sm_diags/e3sm_diags/e3sm_diags/driver/meridional_mean_2d_driver.py", line 87, in run_diag
    _run_diags_3d(parameter, ds_test, ds_ref, season, var_key, ref_name)
  File "/__w/e3sm_diags/e3sm_diags/e3sm_diags/driver/meridional_mean_2d_driver.py", line 113, in _run_diags_3d
    ds_t_plevs_rg_avg, ds_r_plevs_rg_avg = align_grids_to_lower_res(
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/e3sm_diags/e3sm_diags/e3sm_diags/driver/utils/regrid.py", line 395, in align_grids_to_lower_res
    ds_b_regrid = ds_b_new.regridder.horizontal(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/xcdat/regridder/accessor.py", line 232, in horizontal
    output_ds = regridder.horizontal(data_var, self._ds)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/xcdat/regridder/xesmf.py", line 180, in horizontal
    regridder = xe.Regridder(
                ^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/xesmf/frontend.py", line 917, in __init__
    grid_in, shape_in, input_dims = ds_to_ESMFgrid(
                                    ^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/xesmf/frontend.py", line 167, in ds_to_ESMFgrid
    lon_b, lat_b = _get_lon_lat_bounds(ds)
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/xesmf/frontend.py", line 110, in _get_lon_lat_bounds
    lat_b = cfxr.bounds_to_vertices(lat_bnds, ds.cf.get_bounds_dim_name('latitude'), order=None)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/cf_xarray/helpers.py", line 182, in bounds_to_vertices
    core_dim_orders = _get_core_dim_orders(core_dim_coords)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/github/home/miniconda3/envs/e3sm_diags_ci/lib/python3.12/site-packages/cf_xarray/helpers.py", line 234, in _get_core_dim_orders
    elif isinstance(diffs[0], datetime.timedelta):
                    ~~~~~^^^
IndexError: index 0 is out of bounds for axis 0 with size 0

Current Paths Forward

  1. Pin cf_xarray <0.10.7 on main temporarily to ensure CI/CD passes
  2. Wait for cf_xarray=0.10.9 release with the fix -- probably in a week or so.

@chengzhuzhang
Copy link
Contributor

Thanks. Glad to know there is a recent fix. I think we can wait for the new cf_xarray release.

@tomvothecoder
Copy link
Collaborator Author

Hi @chengzhuzhang, cf_xarray v0.10.9 was released yesterday with the necessary fix to get the build passing.

I will merge this PR.

Things to note:

  • cf_xarray >=0.10.7 drops support for Python 3.10 -- this means the xESMF regridding fixes (this PR) will only apply to Python 3.11+

@tomvothecoder tomvothecoder merged commit 4306a4c into main Sep 10, 2025
10 of 18 checks passed
@tomvothecoder tomvothecoder deleted the bug/945-xesmf-bnds branch September 10, 2025 16:47
@chengzhuzhang
Copy link
Contributor

Things to note:

* cf_xarray >=0.10.7 drops support for Python 3.10 -- this means the xESMF regridding fixes (this PR) will only apply to Python 3.11+

Thanks! Another good reason to drop Python 3.10 and move to 3.11+. Tagging @xylar ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment