
Commit 6681e89

CDAT Migration Phase 2: Refactor core utilities and lat_lon set (#677)
Refer to the PR for more information because the changelog is massive.

- Update build workflow to run on the `cdat-migration-fy24` branch
- CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743)
  - Add Makefile for quick access to multiple Python-based commands such as linting, testing, and cleaning up cache and build files
  - Fix some lingering unit test failures
  - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml`, and `dev-nompi.yml`
  - Add `xskillscore` to `ci.yml`
  - Fix `pre-commit` issues
- CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744)
  - Add Makefile that simplifies common development commands (building and installing, testing, etc.)
  - Write unit tests to cover all new code for utility functions: `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py`
  - Metrics comparison of `lat_lon` between the `cdat-migration-fy24` and `main` branches -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs
  - Test run with 3D variables (`_run_3d_diags()`)
  - Fix Python 3.9 bug where the pipe operator (`X | Y`) for `Union` types fails -- `from __future__ import annotations` still does not cover it
  - Fix subsetting syntax bug using `ilev`
  - Fix regridding bug where a single `plev` is passed and xCDAT does not allow generating bounds for coordinates of length <= 1 -- add a conditional that skips adding new bounds for regridded output datasets, and fix related tests
  - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()`
  - Make failing integration tests pass in CI/CD
  - Refactor `test_diags.py` and `test_all_sets.py` -- replace unittest with pytest
  - Test climatology datasets -- tested with 3D variables using `test_all_sets.py`
- CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746)
  - Move driver type annotations to `type_annotations.py`
  - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py`
  - Update `_save_data_metrics_and_plots` args to accept a `plot_func` callable
  - Update `metrics.spatial_avg` to optionally return an `xr.DataArray` with `as_list=False`
  - Move the `parameter` arg to the top in `lat_lon_plot.plot`
  - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to the `CoreParameter` class
- Regression testing for lat_lon variables `NET_FLUX_SRF` and `RESTOM` (#754)
  - Update regression test notebook to show validation of all vars
- Add `subset_and_align_datasets()` to `regrid.py` (#776)
- Add template run scripts
- CDAT Migration Phase: Refactor `cosp_histogram` set (#748)
  - Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py`
  - `formulas_cosp.py` (new file): includes refactored, Xarray-based `cosp_histogram_standardize()` and `cosp_bin_sum()` functions; a lot of new code to clean up `derivations.py` and the old equivalent functions in `utils.py`
  - `derivations.py`: clean up portions of the `DERIVED_VARIABLES` dictionary; remove unnecessary `OrderedDict` usage for `cosp_histogram`-related variables (we should do this for the rest of the variables in #716); remove unnecessary `convert_units()` calls; move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP`
  - `utils.py`: delete deprecated, CDAT-based `cosp_histogram` functions
  - `dataset_xr.py`: add `Dataset._open_climo_dataset()` with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length -- drop this variable and replace it with the correct coordinate; update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`)
  - `io.py`: update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF files (required for easy comparison to CDAT-based code that does the same thing)
  - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files
- CDAT Migration Phase 2: Refactor `zonal_mean_2d()` and `zonal_mean_2d_stratosphere()` sets (#774)
- Refactor 654 zonal mean xy (#752)
- CDAT Migration: Update run script output directory to NERSC public webserver (#793)
- CDAT Migration: Refactor `aerosol_aeronet` set (#788)
- CDAT Migration: Test `lat_lon` set with run script and debug any issues (#794)
- CDAT Migration: Refactor `polar` set (#749); align order of calls to `_set_param_output_attrs`
- CDAT Migration: Refactor `meridional_mean_2d` set (#795)
- CDAT Migration: Refactor `aerosol_budget` (#800)
- Add `acme.py` changes from PR #712 (#814); replace unnecessary lambda call
- Refactor `area_mean_time_series` and add ccb slice flag feature (#750)
- [Refactor]: Validate fix in PR #750 for #759 (#815)
- CDAT Migration Phase 2: Refactor `diurnal_cycle` set (#819)
- CDAT Migration: Refactor `annual_cycle_zonal_mean` set (#798)
  - Address PR review comments; add lat lon regression testing; add debugging scripts
  - Update `_open_climo_dataset()` to decode times as a workaround for misaligned time coords; update `annual_cycle_zonal_mean_plot.py` to convert time coordinates to month integers
  - Fix unit tests; remove old plotter; add script to debug `decode_times=True` and the ncclimo file; update plotter time values to month integers
  - Fix slow `.load()` and multiprocessing issue due to incorrectly updating `keep_bnds` logic; add `_encode_time_coords()` to work around the cftime issue `ValueError: "months since" units only allowed for "360_day" calendar`; update the `_encode_time_coords()` docstring
  - Add AODVIS debug script; update AODVIS obs datasets; regression test results
- CDAT Migration Phase 2: Refactor `qbo` set (#826)
- CDAT Migration Phase 2: Refactor `tc_analysis` set (#829)
  - Update driver and plotting; clean up plotter (remove unused variables; make `plot_info` a constant called `PLOT_INFO`, which is now a dict of dicts; reorder functions for top-down readability); remove unused notebook
- CDAT Migration Phase 2: Refactor `enso_diags` set (#832)
- CDAT Migration Phase 2: Refactor `streamflow` set (#837)
- [Bug]: CDAT Migration Phase 2: `enso_diags` plot fixes (#841)
- [Refactor]: CDAT Migration Phase 3: testing and documentation update (#846)
- CDAT Migration Phase 3: Port QBO Wavelet feature to Xarray/xCDAT codebase (#860)
- CDAT Migration Phase 2: Refactor `arm_diags` set (#842)
- Add performance benchmark material (#864)
- Add function to add CF axis attr to Z axis if missing for downstream xCDAT operations (#865)
- CDAT Migration Phase 3: Add Convective Precipitation Fraction in lat-lon (#875)
- CDAT Migration Phase 3: Fix LHFLX name and add catch for non-existent or empty TE stitch file (#876)
- Add support for time series datasets via glob and fix `enso_diags` set (#866)
- Add fix for checking `is_time_series()` property based on `data_type` attr (#881)
- CDAT Migration: Fix African easterly wave density plots in TC analysis and convert H20LNZ units to ppm/volume (#882)
- CDAT Migration: Update `mp_partition_driver.py` to use Dataset from `dataset_xr.py` (#883)
- CDAT Migration: Port JJB tropical subseasonal diags to Xarray/xCDAT (#887)
- CDAT Migration: Prepare branch for merge to `main` (#885)
- [Refactor]: CDAT Migration: Update dependencies and remove `Dataset._add_cf_attrs_to_z_axes()` (#891)

Co-authored-by: Tom Vo <[email protected]>
Co-authored-by: tomvothecoder <[email protected]>
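The Python 3.9 `Union` pipe bug mentioned in the changelog is worth a note: `from __future__ import annotations` makes annotations lazy strings, so `X | Y` is fine inside annotations even on 3.9, but evaluating the pipe at runtime (for example when building a type alias) still raises `TypeError`. A minimal illustrative sketch (not the project's actual fix; names here are hypothetical):

```python
# Sketch of the Python 3.9 pitfall: `int | None` works in annotations
# (they are never evaluated with the future import), but a runtime use
# of `|` between types raises TypeError before Python 3.10.
from __future__ import annotations

import sys
from typing import Optional


def scale(x: int | None) -> int | None:  # OK on 3.9: annotation stays a string
    return None if x is None else x * 2


# A runtime alias needs a version guard (or typing.Optional) instead.
if sys.version_info >= (3, 10):
    MaybeInt = int | None
else:
    MaybeInt = Optional[int]

print(scale(21))
```

This is why the codebase kept `Optional`/`Union` spellings in runtime positions while targeting Python 3.9.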
1 parent 3f5b036 commit 6681e89

File tree

298 files changed: +100024 −13006 lines changed


.coveragerc

Lines changed: 3 additions & 0 deletions
```diff
@@ -0,0 +1,3 @@
+[report]
+exclude_also =
+    if TYPE_CHECKING:
```
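For context, `exclude_also` tells coverage.py to stop counting matching lines: an `if TYPE_CHECKING:` body only runs under static type checkers, never at runtime, so it would otherwise show up as permanently uncovered. A minimal sketch with a hypothetical module (not from the repo):

```python
# Hypothetical module showing why `if TYPE_CHECKING:` is excluded from
# coverage: TYPE_CHECKING is False at runtime, so the guarded import
# never executes and would be reported as a missed line.
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from collections.abc import Sequence  # seen only by type checkers


def first(xs: "Sequence[int]") -> int:
    # Runtime behavior does not depend on the guarded import.
    return xs[0]


print(first([7, 8, 9]))
```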

.github/workflows/build_workflow.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -5,7 +5,7 @@ on:
     branches: [main]

   pull_request:
-    branches: [main]
+    branches: [main, cdat-migration-fy24]

   workflow_dispatch:
```

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -110,6 +110,7 @@ ENV/

 # NetCDF files needed
 !e3sm_diags/driver/acme_ne30_ocean_land_mask.nc
+!auxiliary_tools/cdat_regression_testing/759-slice-flag/debug/*.nc

 # Folder for storing quality assurance files and notes
 qa/
```

.pre-commit-config.yaml

Lines changed: 2 additions & 1 deletion
```diff
@@ -34,4 +34,5 @@ repos:
       hooks:
         - id: mypy
           args: [--config=pyproject.toml]
-          additional_dependencies: [dask, numpy>=1.23.0, types-PyYAML]
+          additional_dependencies:
+            [dask, numpy>=1.23.0, xarray>=2023.3.0, types-PyYAML]
```

.vscode/e3sm_diags.code-workspace

Lines changed: 1 addition & 1 deletion
```diff
@@ -58,7 +58,7 @@
       "configurations": [
         {
           "name": "Python: Current File",
-          "type": "python",
+          "type": "debugpy",
           "request": "launch",
           "program": "${file}",
           "console": "integratedTerminal",
```
File renamed without changes.

auxiliary_tools/aerosol_budget.py

Lines changed: 67 additions & 64 deletions
```diff
@@ -1,3 +1,5 @@
+# NOTE: This module uses the deprecated e3sm_diags.driver.utils.dataset.Dataset
+# class, which was replaced by e3sm_diags.driver.utils.dataset_xr.Dataset.
 import e3sm_diags
 from e3sm_diags.driver import utils
 import cdms2
@@ -12,11 +14,12 @@


 def global_integral(var, area_m2):
-    """ Compute global integral of 2 dimentional properties"""
-    return numpy.sum(numpy.sum(abs(var)*area_m2,axis = 0), axis=0)
+    """Compute global integral of 2 dimentional properties"""
+    return numpy.sum(numpy.sum(abs(var) * area_m2, axis=0), axis=0)
+

 def calc_column_integral(data, aerosol, season):
-    """ Calculate column integrated mass """
+    """Calculate column integrated mass"""

     # take aerosol and change it to the appropriate string
     # ncl -> SEASALT, dst -> DUST, rest1 -> REST1
@@ -32,129 +35,129 @@ def calc_column_integral(data, aerosol, season):
         burden = data.get_climo_variable(f"ABURDEN{aerosol_name}", season)
     except RuntimeError:
         # if not, use the Mass_ terms and integrate over the column
-        mass = data.get_climo_variable(f'Mass_{aerosol}', season)
+        mass = data.get_climo_variable(f"Mass_{aerosol}", season)
         hyai, hybi, ps = data.get_extra_variables_only(
-            f'Mass_{aerosol}', season, extra_vars=["hyai", "hybi", "PS"]
+            f"Mass_{aerosol}", season, extra_vars=["hyai", "hybi", "PS"]
         )

         p0 = 100000.0  # Pa
-        ps = ps # Pa
-        pressure_levs = cdutil.vertical.reconstructPressureFromHybrid(ps, hyai, hybi, p0)
+        ps = ps  # Pa
+        pressure_levs = cdutil.vertical.reconstructPressureFromHybrid(
+            ps, hyai, hybi, p0
+        )

-        #(72,lat,lon)
-        delta_p = numpy.diff(pressure_levs,axis = 0)
-        mass_3d = mass*delta_p/9.8 #mass density * mass air kg/m2
-        burden = numpy.nansum(mass_3d,axis = 0) #kg/m2
+        # (72,lat,lon)
+        delta_p = numpy.diff(pressure_levs, axis=0)
+        mass_3d = mass * delta_p / 9.8  # mass density * mass air kg/m2
+        burden = numpy.nansum(mass_3d, axis=0)  # kg/m2
     return burden
-
+
+
 def generate_metrics_dic(data, aerosol, season):
     metrics_dict = {}
-    wetdep = data.get_climo_variable(f'{aerosol}_SFWET', season)
-    drydep = data.get_climo_variable(f'{aerosol}_DDF', season)
-    srfemis = data.get_climo_variable(f'SF{aerosol}', season)
-    area = data.get_extra_variables_only(
-        f'{aerosol}_DDF', season, extra_vars=["area"]
-    )
+    wetdep = data.get_climo_variable(f"{aerosol}_SFWET", season)
+    drydep = data.get_climo_variable(f"{aerosol}_DDF", season)
+    srfemis = data.get_climo_variable(f"SF{aerosol}", season)
+    area = data.get_extra_variables_only(f"{aerosol}_DDF", season, extra_vars=["area"])
     area_m2 = area * REARTH**2

     burden = calc_column_integral(data, aerosol, season)
-    burden_total= global_integral(burden, area_m2)*1e-9 # kg to Tg
-    print(f'{aerosol} Burden (Tg): ',f'{burden_total:.3f}')
-    sink = global_integral((drydep-wetdep),area_m2)*UNITS_CONV
-    drydep = global_integral(drydep,area_m2)*UNITS_CONV
-    wetdep = global_integral(wetdep,area_m2)*UNITS_CONV
-    srfemis = global_integral(srfemis,area_m2)*UNITS_CONV
-    print(f'{aerosol} Sink (Tg/year): ',f'{sink:.3f}')
-    print(f'{aerosol} Lifetime (days): ',f'{burden_total/sink*365:.3f}')
+    burden_total = global_integral(burden, area_m2) * 1e-9  # kg to Tg
+    print(f"{aerosol} Burden (Tg): ", f"{burden_total:.3f}")
+    sink = global_integral((drydep - wetdep), area_m2) * UNITS_CONV
+    drydep = global_integral(drydep, area_m2) * UNITS_CONV
+    wetdep = global_integral(wetdep, area_m2) * UNITS_CONV
+    srfemis = global_integral(srfemis, area_m2) * UNITS_CONV
+    print(f"{aerosol} Sink (Tg/year): ", f"{sink:.3f}")
+    print(f"{aerosol} Lifetime (days): ", f"{burden_total/sink*365:.3f}")
     metrics_dict = {
-        "Surface Emission (Tg/yr)": f'{srfemis:.3f}',
-        "Sink (Tg/yr)": f'{sink:.3f}',
-        "Dry Deposition (Tg/yr)": f'{drydep:.3f}',
-        "Wet Deposition (Tg/yr)": f'{wetdep:.3f}',
-        "Burden (Tg)": f'{burden_total:.3f}',
-        "Lifetime (Days)": f'{burden_total/sink*365:.3f}',
+        "Surface Emission (Tg/yr)": f"{srfemis:.3f}",
+        "Sink (Tg/yr)": f"{sink:.3f}",
+        "Dry Deposition (Tg/yr)": f"{drydep:.3f}",
+        "Wet Deposition (Tg/yr)": f"{wetdep:.3f}",
+        "Burden (Tg)": f"{burden_total:.3f}",
+        "Lifetime (Days)": f"{burden_total/sink*365:.3f}",
    }
    return metrics_dict

+
 param = CoreParameter()
-param.test_name = 'v2.LR.historical_0101'
-param.test_name = 'F2010.PD.NGD_v3atm.0096484.compy'
-param.test_data_path = '/Users/zhang40/Documents/ACME_simulations/'
-param.test_data_path = '/compyfs/mahf708/E3SMv3_dev/F2010.PD.NGD_v3atm.0096484.compy/post/atm/180x360_aave/clim/10yr'
+param.test_name = "v2.LR.historical_0101"
+param.test_name = "F2010.PD.NGD_v3atm.0096484.compy"
+param.test_data_path = "/Users/zhang40/Documents/ACME_simulations/"
+param.test_data_path = "/compyfs/mahf708/E3SMv3_dev/F2010.PD.NGD_v3atm.0096484.compy/post/atm/180x360_aave/clim/10yr"
 test_data = utils.dataset.Dataset(param, test=True)

-#rearth = 6.37122e6 #km
-#UNITS_CONV = 86400.0*365.0*1e-9 # kg/s to Tg/yr
-REARTH = 6.37122e6 #km
-UNITS_CONV = 86400.0*365.0*1e-9 # kg/s to Tg/yr
+# rearth = 6.37122e6 #km
+# UNITS_CONV = 86400.0*365.0*1e-9 # kg/s to Tg/yr
+REARTH = 6.37122e6  # km
+UNITS_CONV = 86400.0 * 365.0 * 1e-9  # kg/s to Tg/yr
 # TODO:
 # Convert so4 unit to TgS
-#mwso4 = 115.0
-#mws = 32.066
-#UNITS_CONV_S = UNITS_CONV/mwso4*mws # kg/s to TgS/yr
+# mwso4 = 115.0
+# mws = 32.066
+# UNITS_CONV_S = UNITS_CONV/mwso4*mws # kg/s to TgS/yr


-species = ["bc", "dst", "mom", "ncl","pom","so4","soa"]
-SPECIES_NAMES = {"bc": "Black Carbon",
+species = ["bc", "dst", "mom", "ncl", "pom", "so4", "soa"]
+SPECIES_NAMES = {
+    "bc": "Black Carbon",
     "dst": "Dust",
     "mom": "Marine Organic Matter",
     "ncl": "Sea Salt",
     "pom": "Primary Organic Matter",
     "so4": "Sulfate",
-    "soa": "Secondary Organic Aerosol"}
+    "soa": "Secondary Organic Aerosol",
+}
 MISSING_VALUE = 999.999
 metrics_dict = {}
 metrics_dict_ref = {}

 seasons = ["ANN"]

 ref_data_path = os.path.join(
-        e3sm_diags.INSTALL_PATH,
-        "control_runs",
-        "aerosol_global_metrics_benchmarks.json",
-        )
+    e3sm_diags.INSTALL_PATH,
+    "control_runs",
+    "aerosol_global_metrics_benchmarks.json",
+)

-with open(ref_data_path, 'r') as myfile:
-    ref_file=myfile.read()
+with open(ref_data_path, "r") as myfile:
+    ref_file = myfile.read()

 metrics_ref = json.loads(ref_file)

 for season in seasons:
     for aerosol in species:
-        print(f'Aerosol species: {aerosol}')
+        print(f"Aerosol species: {aerosol}")
         metrics_dict[aerosol] = generate_metrics_dic(test_data, aerosol, season)
         metrics_dict_ref[aerosol] = metrics_ref[aerosol]
-        #metrics_dict_ref[aerosol] = {
+        # metrics_dict_ref[aerosol] = {
         #    "Surface Emission (Tg/yr)": f'{MISSING_VALUE:.3f}',
         #    "Sink (Tg/yr)": f'{MISSING_VALUE:.3f}',
         #    "Dry Deposition (Tg/yr)": f'{MISSING_VALUE:.3f}',
         #    "Wet Deposition (Tg/yr)": f'{MISSING_VALUE:.3f}',
         #    "Burden (Tg)": f'{MISSING_VALUE:.3f}',
         #    "Lifetime (Days)": f'{MISSING_VALUE:.3f}',
         #    }
-
-    with open(f'aerosol_table_{season}.csv', "w") as table_csv:
+
+    with open(f"aerosol_table_{season}.csv", "w") as table_csv:
         writer = csv.writer(
             table_csv,
             delimiter=",",
             quotechar="'",
             quoting=csv.QUOTE_MINIMAL,
-            lineterminator='\n',
+            lineterminator="\n",
         )
-        #writer.writerow([" ", "test","ref",])
+        # writer.writerow([" ", "test","ref",])
         for key, values in metrics_dict.items():
             writer.writerow([SPECIES_NAMES[key]])
-            print('key',key, values)
+            print("key", key, values)
             for value in values:
                 print(value)
                 line = []
                 line.append(value)
                 line.append(values[value])
                 line.append(metrics_dict_ref[key][value])
-                print(line, 'line')
+                print(line, "line")
                 writer.writerows([line])
             writer.writerows([""])
-
-
-
-
```
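To make the arithmetic in this script concrete, here is a hedged, self-contained sketch of the column-burden and global-integral steps using synthetic numpy arrays. The toy grid, values, and hand-built pressure profile are stand-ins; in the real script the interfaces come from `cdutil.vertical.reconstructPressureFromHybrid`:

```python
# Toy recreation of aerosol_budget.py's burden math (synthetic data only).
import numpy as np

REARTH = 6.37122e6  # Earth radius in m, as in the script
UNITS_CONV = 86400.0 * 365.0 * 1e-9  # kg/s -> Tg/yr


def global_integral(var, area_m2):
    """Sum |var| * cell area over both horizontal axes (as in the script)."""
    return np.sum(np.sum(np.abs(var) * area_m2, axis=0), axis=0)


# 4 pressure interfaces -> 3 layers on a 2x2 grid (stand-in for the
# hybrid-pressure reconstruction, which yields 73 interfaces for 72 layers).
pressure_levs = np.linspace(100000.0, 0.0, 4)[:, None, None] * np.ones((4, 2, 2))
delta_p = np.diff(pressure_levs, axis=0)   # Pa thickness per layer
mass = np.full((3, 2, 2), 1e-9)            # mixing ratio, kg/kg (toy value)
mass_3d = mass * delta_p / 9.8             # layer mass density, kg/m2
burden = np.nansum(mass_3d, axis=0)        # column burden, kg/m2

area_m2 = np.full((2, 2), 1.0)             # toy cell areas, m2
burden_total = global_integral(burden, area_m2) * 1e-9  # kg -> Tg
print(f"Burden (Tg): {burden_total:.3e}")
```

With a tracer sink (kg/s) in hand, the script's lifetime estimate is just `burden_total / sink * 365` days, i.e. burden divided by the annual removal rate.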
