Commit 5c5dfc3

Add ESGF links and more simulations to v1 data (#60)
* Add ESGF links for v1 data
* Begin adding large ensemble
* Further v1 data updates
* Refactor get_esgf
* Include large ensemble in docs
* Clean up code
1 parent e0190d3 commit 5c5dfc3

8 files changed: +388 -186 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -116,6 +116,7 @@ conda-build/
 
 *~
 utils/out.txt
+utils/out_*.txt
 utils/simulation_table.txt
 utils/reproduction_table.txt
 utils/test_reproduction_scripts.o*

docs/source/v1/WaterCycle/index.rst

Lines changed: 13 additions & 7 deletions
@@ -21,6 +21,7 @@ refer to `Coupled E3SM v1 Model Overview <https://e3sm.org/?p=5470>`_ or to the
 
 * *The DOE E3SM Coupled Model Version 1: Overview and Evaluation at Standard Resolution* `doi: 10.1029/2018MS001603 <https://doi.org/10.1029/2018MS001603>`_
 * *Description of historical and future projection simulations by the global coupled E3SMv1.0 model as used in CMIP6* `doi:10.5194/gmd-15-3941-2022 <https://doi.org/10.5194/gmd-15-3941-2022>`_
+* *Ensemble Spread Behavior in Coupled Climate Models: Insights From the Energy Exascale Earth System Model Version 1 Large Ensemble* `doi:10.1029/2023MS003653 <https://doi.org/10.1029/2023MS003653>`_
 * *The DOE E3SM Coupled Model Version 1: Description and Results at High Resolution* `doi:10.1029/2019MS001870 <https://doi.org/10.1029/2019MS001870>`_
 
 Experiments:
@@ -42,18 +43,23 @@ For low-resolution:
 
 * AMIP
 
-    * amip – atmosphere only AMIP simulation 1870-2014 (145 years), 3 ensemble members
-    * amip_1850_allF – atmosphere only AMIP with all forcings held at 1850 values, 1870-2014 (145 years), 3 ensemble members
-    * amip_1850_aeroF – atmosphere only AMIP with all aerosol forcings held at 1850 values, 1870-2014 (145 years), 3 ensemble members
+    * amip – Atmosphere only AMIP simulation 1870-2014 (145 years), 3 ensemble members
+    * amip_1850_allF – Atmosphere only AMIP with all forcings held at 1850 values, 1870-2014 (145 years), 3 ensemble members
+    * amip_1850_aeroF – Atmosphere only AMIP with all aerosol forcings held at 1850 values, 1870-2014 (145 years), 3 ensemble members
 
 * DAMIP
 
-    * damip_hist-GHG – greenhouse gases only, 3 ensemble members
+    * damip_hist-GHG – Greenhouse gases only, 3 ensemble members
+
+* LargeEnsemble
+
+    * historical - Historical simulations, 20 ensemble members
+    * ssp370 - Future projection, 20 ensemble members
 
 * Projection
 
-    * ssp5-8.5 – future projection, 5 ensemble members
-    * damip_ssp5-8.5-GHG – future projection with greenhouse gases only, 3 ensemble members
+    * ssp5-8.5 – Future projection, 5 ensemble members
+    * damip_ssp5-8.5-GHG – Future projection with greenhouse gases only, 3 ensemble members
 
 For high-resolution:
 
@@ -72,7 +78,7 @@ For high-resolution:
     * F2010-CMIP6-HR – 3 ensemble members
     * F2010C5-CMIP6 – 5 ensemble members. "nudgeUV" refers to U and V wind directions.
     * F2010LRtunedHR – 4 ensemble members
-    * A_WCYCLSSP585_CMIP6_HR – future projection, 1 ensemble member
+    * A_WCYCLSSP585_CMIP6_HR – Future projection, 1 ensemble member
 
 
 .. toctree::

docs/source/v1/WaterCycle/simulation_data/simulation_table.rst

Lines changed: 211 additions & 129 deletions
Large diffs are not rendered by default.

utils/generate_html.bash

Lines changed: 4 additions & 3 deletions
@@ -1,11 +1,12 @@
-pr_num=59
+pr_num=60
+try_num=11
 
 # Chrysalis
 #destination_dir=/lcrc/group/e3sm/public_html/diagnostic_output/$USER/data_docs_${pr_num}
 #web_page="https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/$USER/data_docs_${pr_num}/html/"
 # Perlmutter
-destination_dir=/global/cfs/cdirs/e3sm/www/$USER/data_docs_${pr_num}
-web_page="https://portal.nersc.gov/cfs/e3sm/$USER/data_docs_${pr_num}/html/"
+destination_dir=/global/cfs/cdirs/e3sm/www/$USER/data_docs_${pr_num}_try${try_num}
+web_page="https://portal.nersc.gov/cfs/e3sm/$USER/data_docs_${pr_num}_try${try_num}/html/"
 
 python generate_tables.py
 if [ $? != 0 ]; then

utils/generate_tables.py

Lines changed: 82 additions & 34 deletions
@@ -4,15 +4,24 @@
 import requests
 from collections import OrderedDict
 from typing import Dict, List, Tuple
+import urllib.parse
 
 # Functions to compute fields for simulations ###########################################
 def get_data_size_and_hpss(hpss_path: str) -> Tuple[str, str]:
     """Get the data size in TB"""
-    output = "out.txt"
+    is_symlink: bool = check_if_symlink(hpss_path)
+    output = "out_du.txt"
     if os.path.exists(output):
         os.remove(output)
     try:
-        os.system(f'(hsi "du {hpss_path}") 2>&1 | tee {output}')
+        if is_symlink:
+            # The `/*` expands symlinks on HSI!
+            # This will actually work fine even if it's not a symlink,
+            # but we needed to check for symlinks anyway to note "(symlink)" by the HPSS path,
+            # so we might as well handle the cases separately here.
+            os.system(f'(hsi "du {hpss_path}/*") 2>&1 | tee {output}')
+        else:
+            os.system(f'(hsi "du {hpss_path}") 2>&1 | tee {output}')
     except Exception as e:
         print(f"hsi failed: {e}")
         return ("", "")
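The symlink handling in this hunk depends on `check_if_symlink`, which (per its own comment) relies on HSI marking symlinks with a trailing `@` in `ls` output. A minimal sketch of just that matching step, separated from the `hsi` call so it can be run on plain text. `listing_marks_symlink` is a hypothetical helper that is not part of the commit, the sample listing is made up for illustration (real `hsi ls` output may look different), and the use of `re.escape` is an added assumption so that basenames containing `.` are matched literally.

```python
import os
import re
from typing import Iterable


def listing_marks_symlink(hpss_path: str, listing_lines: Iterable[str]) -> bool:
    """Return True if any listing line shows the path's basename followed by `@`."""
    basename = os.path.basename(hpss_path)
    # re.escape keeps characters such as `.` from acting as regex wildcards.
    pattern = re.compile(re.escape(basename) + "@")
    return any(pattern.search(line) for line in listing_lines)


# Illustrative listing text only; not captured from a real HPSS session.
sample_listing = [
    "/home/projects/e3sm/www/WaterCycle/E3SMv1/LR:",
    "LE_historical_ens1@",
]
print(
    listing_marks_symlink(
        "/home/projects/e3sm/www/WaterCycle/E3SMv1/LR/LE_historical_ens1",
        sample_listing,
    )
)
```

Keeping the text matching apart from the shell call makes it possible to check the pattern without HPSS access.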
@@ -31,37 +40,85 @@ def get_data_size_and_hpss(hpss_path: str) -> Tuple[str, str]:
     data_size = int(num_bytes)/1e12
     if data_size > 0:
         data_size = f"{data_size:.0f}"
-        hpss = hpss_path
+        if is_symlink:
+            hpss = f"(symlink) {hpss_path}"
+        else:
+            hpss = hpss_path
     else:
         data_size = ""
         hpss = ""
     return (data_size, hpss)
 
-def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, link_type: str, node: str) -> str:
+def check_if_symlink(hpss_path: str) -> bool:
+    output: str = "out_symlink_check.txt"
+    if os.path.exists(output):
+        os.remove(output)
+    try:
+        os.system(f'(hsi "ls {hpss_path}") 2>&1 | tee {output}')
+    except Exception as e:
+        print(f"hsi failed: {e}")
+        return False
+    with open(output, "r") as f:
+        for line in f:
+            # Symlinks on HSI/HPSS end in `@`
+            match_object = re.search(f"{os.path.basename(hpss_path)}@", line)
+            if match_object:
+                return True
+    return False
+
+
+def get_esgf(model_version: str, resolution: str, simulation_name: str, experiment: str, ensemble_num: str, link_type: str, node: str) -> str:
     esgf: str
     if link_type == "none":
         esgf = ""
-    elif node == "cels.anl":
-        esgf = f"`CMIP <https://esgf-node.{node}.gov/search/?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22{source_id}%22%2C%22experiment_id%22%3A%22{experiment}%22%2C%22variant_label%22%3A%22r{ensemble_num}i1p1f1%22%7D>`_"
-    elif experiment and ensemble_num:
-        # See https://github.com/E3SM-Project/CMIP6-Metadata/pull/9#issuecomment-1246086256 for the table of ensemble numbers
-        # Note that `[1:]`` removes `v` from `model_version`
-        esgf_native: str = f"`Native <https://esgf-node.{node}.gov/search/e3sm/?model_version={model_version[1:]}_0&experiment={experiment}&ensemble_member=ens{ensemble_num}>`_"
-        if experiment == 'hist-all-xGHG-xaer':
-            experiment_id = 'hist-nat'
+    elif model_version == "v1":
+        v1_institution_id: str
+        variant_suffix: str
+        if simulation_name.startswith("LE_"):
+            v1_institution_id = "UCSB"
+            variant_suffix = "i2p2f1"
         else:
-            experiment_id = experiment
-        esgf_cmip: str = f"`CMIP <https://esgf-node.{node}.gov/search/cmip6/?source_id={source_id}&experiment_id={experiment_id}&variant_label=r{ensemble_num}i1p1f1>`_"
-        if link_type == "cmip":
-            esgf = esgf_cmip
-        elif link_type == "native":
-            esgf = esgf_native
-        elif link_type == "both":
-            esgf = esgf_cmip + ', ' + esgf_native
-        else:
-            raise ValueError(f"Invalid link_type={link_type}")
+            v1_institution_id = "E3SM-Project"
+            variant_suffix = "i1p1f1"
+        human_readable_active_facets: str = f'{{"institution_id":"{v1_institution_id}","source_id":"E3SM-1-0","experiment_id":"{experiment}","variant_label":"r{ensemble_num}{variant_suffix}"}}'
+        url_active_facets: str = urllib.parse.quote(human_readable_active_facets)
+        esgf = f"`CMIP <https://esgf-node.{node}.gov/search?project=CMIP6&activeFacets={url_active_facets}>`_"
     else:
-        esgf = ""
+        # v2, v2.1
+        # Determine source_id
+        if (len(model_version) == 4) and (model_version[2] == "."):
+            source_id = f"E3SM-{model_version[1]}-{model_version[3]}"
+        elif (len (model_version) == 2):
+            if resolution == "NARRM":
+                source_id = f"E3SM-{model_version[1]}-0-{resolution}"
+            else:
+                source_id = f"E3SM-{model_version[1]}-0"
+        else:
+            raise RuntimeError(f"Invalid model-version={model_version}")
+        # Determine esgf
+        if node == "cels.anl": # v2.1 only
+            human_readable_active_facets = f'{{"source_id":"{source_id}","experiment_id":"{experiment}","variant_label":"r{ensemble_num}i1p1f1"}}'
+            url_active_facets: str = urllib.parse.quote(human_readable_active_facets)
+            esgf = f"`CMIP <https://esgf-node.{node}.gov/search/?project=CMIP6&activeFacets={url_active_facets}>`_"
+        elif experiment and ensemble_num:
+            # See https://github.com/E3SM-Project/CMIP6-Metadata/pull/9#issuecomment-1246086256 for the table of ensemble numbers
+            # Note that `[1:]`` removes `v` from `model_version`
+            esgf_native: str = f"`Native <https://esgf-node.{node}.gov/search/e3sm/?model_version={model_version[1:]}_0&experiment={experiment}&ensemble_member=ens{ensemble_num}>`_"
+            if experiment == 'hist-all-xGHG-xaer':
+                experiment_id = 'hist-nat'
+            else:
+                experiment_id = experiment
+            esgf_cmip: str = f"`CMIP <https://esgf-node.{node}.gov/search/cmip6/?source_id={source_id}&experiment_id={experiment_id}&variant_label=r{ensemble_num}i1p1f1>`_"
+            if link_type == "cmip":
+                esgf = esgf_cmip
+            elif link_type == "native":
+                esgf = esgf_native
+            elif link_type == "both":
+                esgf = esgf_cmip + ', ' + esgf_native
+            else:
+                raise ValueError(f"Invalid link_type={link_type}")
+        else:
+            esgf = ""
     return esgf
 
 def get_run_script_original(model_version: str, simulation_name: str) -> str:
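The refactored `get_esgf` replaces the hand-encoded `%7B%22...` facet string with a readable facet string passed through `urllib.parse.quote`. A sketch of the same idea, using `json.dumps` to build the facet string from a dictionary. `build_facet_url` is a hypothetical name that does not appear in the commit, and the node value in the usage lines is a placeholder, since the commit reads the node from the simulation table.

```python
import json
import urllib.parse


def build_facet_url(node: str, facets: dict) -> str:
    """Percent-encode a facet dictionary into an ESGF-style search URL (sketch)."""
    # Compact separators reproduce the space-free string used in get_esgf.
    facet_json = json.dumps(facets, separators=(",", ":"))
    encoded = urllib.parse.quote(facet_json)
    return f"https://esgf-node.{node}.gov/search?project=CMIP6&activeFacets={encoded}"


url = build_facet_url(
    "example",  # placeholder node name, not a real ESGF node
    {"source_id": "E3SM-1-0", "variant_label": "r1i1p1f1"},
)
print(url)
```

Building the string from a dictionary avoids the doubled braces that an f-string needs, at the cost of depending on dictionary insertion order for the facet order.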
@@ -114,16 +171,7 @@ def __init__(self, simulation_dict):
         hpss_path = f"/home/projects/e3sm/www/{self.group}/E3SM{self.model_version}/{self.resolution}/{self.simulation_name}"
         self.data_size, self.hpss = get_data_size_and_hpss(hpss_path)
 
-        if (len(self.model_version) == 4) and (self.model_version[2] == "."):
-            source_id = f"E3SM-{self.model_version[1]}-{self.model_version[3]}"
-        elif (len (self.model_version) == 2):
-            if self.resolution == "NARRM":
-                source_id = f"E3SM-{self.model_version[1]}-0-{self.resolution}"
-            else:
-                source_id = f"E3SM-{self.model_version[1]}-0"
-        else:
-            raise RuntimeError(f"Invalid model-version={self.model_version}")
-        self.esgf = get_esgf(source_id, self.model_version, self.experiment, self.ensemble_num, self.link_type, self.node)
+        self.esgf = get_esgf(self.model_version, self.resolution, self.simulation_name, self.experiment, self.ensemble_num, self.link_type, self.node)
 
         self.run_script_original = get_run_script_original(self.model_version, self.simulation_name)
         self.run_script_reproduction = get_run_script_reproduction(self.model_version, self.simulation_name)
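This hunk moves the `source_id` derivation out of `__init__` and into `get_esgf`. The mapping itself can be sketched as a standalone function, shown here only to make the version-to-identifier rule easy to check. `derive_source_id` is a hypothetical name; the commit keeps this logic inline rather than in a separate function.

```python
def derive_source_id(model_version: str, resolution: str) -> str:
    """Map a model version such as "v2" or "v2.1" to a CMIP6 source_id (sketch)."""
    if len(model_version) == 4 and model_version[2] == ".":
        # e.g. "v2.1" -> "E3SM-2-1"
        return f"E3SM-{model_version[1]}-{model_version[3]}"
    if len(model_version) == 2:
        base = f"E3SM-{model_version[1]}-0"
        # NARRM runs carry the resolution as a suffix; other resolutions do not.
        return f"{base}-{resolution}" if resolution == "NARRM" else base
    raise RuntimeError(f"Invalid model-version={model_version}")


for version, resolution in [("v2", "LR"), ("v2", "NARRM"), ("v2.1", "LR")]:
    print(version, resolution, derive_source_id(version, resolution))
```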
@@ -286,7 +334,7 @@ def construct_pages(csv_file: str, model_version: str, group_name: str, include_
         resolutions,
         ["Simulation", "Data Size (TB)", "ESGF Links", "HPSS Path"],
         f"../docs/source/{model_version}/{group_name}/simulation_data/simulation_table.rst",
-        [85, 15, 400, 130]
+        [85, 15, 400, 140]
     )
     if include_reproduction_scripts:
         generate_table(

utils/make_symlinks.bash

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# This will be a problem if these simulations are ever removed from the publication archives!
+for i in $(seq 1 20); do
+    hsi ln -s /home/projects/e3sm/www/publication-archives/pub_archive_E3SM_1_0_LE_historical_ens$i /home/projects/e3sm/www/WaterCycle/E3SMv1/LR/LE_historical_ens$i
+done
+
+for i in $(seq 1 20); do
+    hsi ln -s /home/projects/e3sm/www/publication-archives/pub_archive_E3SM_1_0_LE_ssp370_ens$i /home/projects/e3sm/www/WaterCycle/E3SMv1/LR/LE_ssp370_ens$i
+done
+
+# Symlink last remaining large simulation
+# This will be a problem if ndk ever deletes the source!
+hsi ln -s /home/n/ndk/2019/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG /home/projects/e3sm/www/WaterCycle/E3SMv1/LR/theta.20190910.branch_noCNT.n825def.unc06.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG
+
+# Note:
+# It seems impossible to do a recursive remove with HSI/on HPSS.
+# > rm -rf E3SM_1_0_LE_historical_ens1@ # Trying to remove mislabeled directory
+# Unknown option or missing argument: 'r' ignored
+# Unknown option or missing argument: 'f' ignored

utils/print_ensemble_rows.bash

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+for i in $(seq 1 20); do
+    echo "v1, WaterCycle, LR, LargeEnsemble, LE_historical_ens$i, , , historical-large-ensemble, $i, none, ,"
+done
+
+for i in $(seq 1 20); do
+    echo "v1, WaterCycle, LR, LargeEnsemble, LE_ssp370_ens$i, , , ssp370-large-ensemble, $i, none, ,"
+done
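For comparison, the same 40 table rows could be produced from Python, the language of `generate_tables.py`. This is a sketch only: `ensemble_rows` is a hypothetical helper that is not part of the commit, and the column layout is copied verbatim from the `echo` lines in the script above without any claim about what each empty column means.

```python
from typing import List


def ensemble_rows(simulation_prefix: str, experiment: str, members: int = 20) -> List[str]:
    """Build one CSV-style row per ensemble member, mirroring the echo lines."""
    return [
        f"v1, WaterCycle, LR, LargeEnsemble, {simulation_prefix}_ens{i}, , , {experiment}, {i}, none, ,"
        for i in range(1, members + 1)
    ]


rows = ensemble_rows("LE_historical", "historical-large-ensemble") + ensemble_rows(
    "LE_ssp370", "ssp370-large-ensemble"
)
print("\n".join(rows))
```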
