
Commit 0fa3946

Updates on Perlmutter
1 parent c4529a8 · commit 0fa3946

File tree: 8 files changed, +282 / -85 lines


docs/source/v1/WaterCycle/reproducing_simulations/index.rst

Lines changed: 0 additions & 9 deletions
This file was deleted.

docs/source/v1/WaterCycle/reproducing_simulations/reproduction_table.rst

Whitespace-only changes.

docs/source/v1/WaterCycle/simulation_data/index.rst

Lines changed: 53 additions & 0 deletions
@@ -2,6 +2,59 @@
 Simulation Data
 ***************
 
+The E3SMv1 simulation data is available on **ESGF** and **NERSC HPSS**.
+
+The preferred retrieval method is **ESGF**. Native output is available at `ESGF <https://esgf-node.llnl.gov/search/e3sm/?model_version=1_0>`_, and a subset of the data has been reformatted to conform to CMIP conventions and submitted to the CMIP6 ESGF archive (ESGF links are provided in the table below for published data).
+
+Additionally, all native model output has been archived on **NERSC HPSS** using `zstash <https://e3sm-project.github.io/zstash>`_.
+
+**If you have an account on NERSC**, you can retrieve the data locally or remotely using Globus.
+
+To download simulation data locally on a NERSC machine: ::
+
+    zstash extract --hpss=<HPSS path in table>
+
+To download simulation data remotely using the zstash Globus interface: ::
+
+    zstash extract --hpss=globus://nersc/<HPSS path in table>
+
+or ::
+
+    zstash extract --hpss=globus://9cd89cfd-6d04-11e5-ba46-22000b92c6ec/<HPSS path below>
+
+Note that the data management tool `zstash <https://github.com/E3SM-Project/zstash>`_ is available from the `E3SM-Unified <https://github.com/E3SM-Project/e3sm-unified>`_ conda environment. As an example, retrieving all **elm.h0** (monthly land output) files between **years 0030 and 0049** of the piControl simulation locally at NERSC takes two steps:
+
+1. Activate the E3SM-Unified environment:
+::
+
+    source /global/common/software/e3sm/anaconda_envs/load_latest_e3sm_unified_pm-cpu.sh
+
+2. Retrieve the files with zstash:
+::
+
+    zstash extract --hpss=/home/projects/e3sm/www/WaterCycle/E3SMv1/LR/20180129.DECKv1b_piControl.ne30_oEC.edison "*.elm.h0.00[3-4]?-??.nc"
+
+
+For more information, refer to `zstash usage <https://e3sm-project.github.io/zstash/_build/html/master/usage.html#extract>`_.
+
+
+**If you do not have access to NERSC**, you can download simulation data directly through the NERSC HPSS
+`web interface <https://portal.nersc.gov/archive/home/projects/e3sm/www/WaterCycle/E3SMv1>`_.
+Note that this will be slow and inefficient, since you will have to download entire tar files.
+
+**v1.LR** simulation data has been archived on NERSC HPSS under: ::
+
+    /home/projects/e3sm/www/WaterCycle/E3SMv1/LR
+
+and **v1.HR** simulation data under: ::
+
+    /home/projects/e3sm/www/WaterCycle/E3SMv1/HR
+
+
+Scripts are not available to reproduce v1 simulations.
+
+The original run scripts (the scripts that were used to create the simulations) have been archived `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v1/original/>`_. These scripts are provided for reference only.
+
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
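
Editor's note: for readers who script their retrievals, the two steps above can be wrapped in a small helper. The sketch below is illustrative only and not part of the repository; it assumes the E3SM-Unified environment is already active so that `zstash` is on the PATH, and it reuses the HPSS path and file pattern from the example above:

    import subprocess

    def zstash_extract(hpss_path: str, pattern: str) -> None:
        # Run `zstash extract` in the current working directory; the extracted
        # files (and zstash's local cache) land under that directory.
        subprocess.run(
            ["zstash", "extract", f"--hpss={hpss_path}", pattern],
            check=True,  # raise CalledProcessError if zstash exits with an error
        )

    if __name__ == "__main__":
        zstash_extract(
            "/home/projects/e3sm/www/WaterCycle/E3SMv1/LR/20180129.DECKv1b_piControl.ne30_oEC.edison",
            "*.elm.h0.00[3-4]?-??.nc",
        )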

docs/source/v1/WaterCycle/simulation_data/simulation_table.rst

Lines changed: 111 additions & 0 deletions
Large diffs are not rendered by default.

docs/source/v2/WaterCycle/simulation_data/index.rst

Lines changed: 3 additions & 3 deletions
@@ -12,11 +12,11 @@ Additionally, all native model output data has also been archived on **NERSC HPS
 
 To download simulation data locally on a NERSC machine: ::
 
-    zstash extract --hpss=<HPSS path below>
+    zstash extract --hpss=<HPSS path in table>
 
 To download simulation data remotely using the zstash Globus interface: ::
 
-    zstash extract --hpss=globus://nersc/<HPSS path below>
+    zstash extract --hpss=globus://nersc/<HPSS path in table>
 
 or ::
 
@@ -53,7 +53,7 @@ and **v2.NARRM** simulations data under: ::
 
 Scripts to reproduce v2 simulations are available `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v2/reproduce/>`_
 with specific instructions details in `Reproducing Simulations`.
-Original run scripts (the scripts that were originally used to create the simulations) have been archived here `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v2/original/>`_. These latter srcipts are provided for reference only.
+Original run scripts (the scripts that were originally used to create the simulations) have been archived here `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v2/original/>`_. These latter scripts are provided for reference only.
 
 .. toctree::
    :maxdepth: 2

utils/generate_html.bash

Lines changed: 4 additions & 1 deletion
@@ -1,4 +1,4 @@
-pr_num=56
+pr_num=59
 
 # Chrysalis
 #destination_dir=/lcrc/group/e3sm/public_html/diagnostic_output/$USER/data_docs_${pr_num}
@@ -8,6 +8,9 @@ destination_dir=/global/cfs/cdirs/e3sm/www/$USER/data_docs_${pr_num}
 web_page="https://portal.nersc.gov/cfs/e3sm/$USER/data_docs_${pr_num}/html/"
 
 python generate_tables.py
+if [ $? != 0 ]; then
+  exit 1
+fi
 cd ../docs/ && make html
 rm -rf ls ${destination_dir}
 mv _build ${destination_dir}
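
Editor's note: the new guard stops the HTML build as soon as table generation fails. A minimal, hypothetical Python equivalent of that fail-fast sequence (not part of the repository), assuming the same working-directory layout as the bash script:

    import subprocess
    import sys

    def build_docs() -> None:
        try:
            # Mirrors `python generate_tables.py` plus the new non-zero-exit guard.
            subprocess.run([sys.executable, "generate_tables.py"], check=True)
        except subprocess.CalledProcessError:
            sys.exit(1)
        # Only reached if table generation succeeded.
        subprocess.run(["make", "html"], cwd="../docs", check=True)

    if __name__ == "__main__":
        build_docs()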

utils/generate_tables.py

Lines changed: 59 additions & 42 deletions
@@ -3,11 +3,7 @@
 import re
 import requests
 from collections import OrderedDict
-from typing import List, Tuple
-
-# Make edits on Perlmutter so HPSS is available.
-# TODO: update this file!!!
-# TODO: update HPSS paths in v1 csv
+from typing import Dict, List, Tuple
 
 # Functions to compute fields for simulations ###########################################
 def get_data_size_and_hpss(hpss_path: str) -> Tuple[str, str]:
@@ -41,22 +37,25 @@ def get_data_size_and_hpss(hpss_path: str) -> Tuple[str, str]:
         hpss = ""
     return (data_size, hpss)
 
-def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, cmip_only: str, node: str) -> str:
+def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, link_type: str, node: str) -> str:
+    esgf: str
     if node == "cels.anl":
         esgf = f"`CMIP <https://esgf-node.{node}.gov/search/?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22{source_id}%22%2C%22experiment_id%22%3A%22{experiment}%22%2C%22variant_label%22%3A%22r{ensemble_num}i1p1f1%22%7D>`_"
     elif experiment and ensemble_num:
         # See https://github.com/E3SM-Project/CMIP6-Metadata/pull/9#issuecomment-1246086256 for the table of ensemble numbers
-        # remove v from model_version
-        esgf = f"`Native <https://esgf-node.{node}.gov/search/e3sm/?model_version={model_version[1:]}_0&experiment={experiment}&ensemble_member=ens{ensemble_num}>`_"
+        # Note that `[1:]` removes `v` from `model_version`
+        esgf_native: str = f"`Native <https://esgf-node.{node}.gov/search/e3sm/?model_version={model_version[1:]}_0&experiment={experiment}&ensemble_member=ens{ensemble_num}>`_"
         if experiment == 'hist-all-xGHG-xaer':
             experiment_id = 'hist-nat'
         else:
             experiment_id = experiment
-        esgf_cmip = f"`CMIP <https://esgf-node.{node}.gov/search/cmip6/?source_id={source_id}&experiment_id={experiment_id}&variant_label=r{ensemble_num}i1p1f1>`_"
-        if cmip_only:
+        esgf_cmip: str = f"`CMIP <https://esgf-node.{node}.gov/search/cmip6/?source_id={source_id}&experiment_id={experiment_id}&variant_label=r{ensemble_num}i1p1f1>`_"
+        if link_type == "cmip":
             esgf = esgf_cmip
+        elif link_type == "native":
+            esgf = esgf_native
         else:
-            esgf = esgf_cmip + ', ' + esgf
+            esgf = esgf_cmip + ', ' + esgf_native
     else:
         esgf = ""
     return esgf
@@ -92,7 +91,13 @@ def __init__(self, simulation_dict):
         self.experiment = simulation_dict["experiment"]
 
         self.ensemble_num = simulation_dict["ensemble_num"]
-        self.cmip_only = simulation_dict["cmip_only"]
+        self.link_type = simulation_dict["link_type"]
+
+        if "hpss_path" in simulation_dict:
+            # If `hpss_path` is specified, then it's a non-standard path
+            hpss_path = simulation_dict["hpss_path"]
+        else:
+            hpss_path = f"/home/projects/e3sm/www/{self.group}/E3SM{self.model_version}/{self.resolution}/{self.simulation_name}"
         if "node" in simulation_dict.keys():
             self.node = simulation_dict["node"]
         else:
@@ -114,7 +119,7 @@ def __init__(self, simulation_dict):
             source_id = f"E3SM-{self.model_version[1]}-0"
         else:
             raise RuntimeError(f"Invalid model-version={self.model_version}")
-        self.esgf = get_esgf(source_id, self.model_version, self.experiment, self.ensemble_num, self.cmip_only, self.node)
+        self.esgf = get_esgf(source_id, self.model_version, self.experiment, self.ensemble_num, self.link_type, self.node)
 
         self.run_script_original = get_run_script_original(self.model_version, self.simulation_name)
         self.run_script_reproduction = get_run_script_reproduction(self.model_version, self.simulation_name)
@@ -167,12 +172,15 @@ def append(self, group):
         self.groups.update([(group.name, group)])
 
 # Construct simulations ###########################################
+
 def read_simulations(csv_file):
     # model_version > group > resolution > category > simulation_name,
     versions: OrderedDict[str: ModelVersion] = OrderedDict()
     with open(csv_file, newline='') as opened_file:
         reader = csv.reader(opened_file)
         header: List[str] = []
+        simulation_dicts: List[Dict[str, str]] = []
+        # First, just set up the dictionary, to make sure all the necessary data is available.
         for row in reader:
             # Get labels
             if header == []:
@@ -183,40 +191,48 @@
             for i in range(len(header)):
                 label = header[i]
                 if len(row) != len(header):
-                    raise RuntimeError(f"header has {len(header)} labels, but row has {len(row)} entries")
+                    raise RuntimeError(f"header has {len(header)} labels, but row={row} has {len(row)} entries")
                 simulation_dict[label] = row[i].strip()
-            model_version_name = simulation_dict["model_version"]
-            group_name = simulation_dict["group"]
-            resolution_name = simulation_dict["resolution"]
-            category_name = simulation_dict["category"]
-            if model_version_name not in versions:
-                v = ModelVersion(model_version_name)
-                versions.update([(model_version_name, v)])
-            else:
-                v = versions[model_version_name]
-            if group_name not in v.groups:
-                g = Group(group_name)
-                v.groups.update([(group_name, g)])
-            else:
-                g = v.groups[group_name]
-            if resolution_name not in g.resolutions:
-                r = Resolution(resolution_name)
-                g.resolutions.update([(resolution_name, r)])
-            else:
-                r = g.resolutions[resolution_name]
-            if category_name not in r.categories:
-                c = Category(category_name)
-                r.categories.update([(category_name, c)])
-            else:
-                c = r.categories[category_name]
-            s = Simulation(simulation_dict)
-            c.simulations.update([(s.simulation_name, s)])
+            if "cmip_only" in simulation_dict:
+                simulation_dict["link_type"] = "cmip"
+            simulation_dicts.append(simulation_dict)
+        # Now that we have valid dictionaries for each simulation, let's construct objects
+        for simulation_dict in simulation_dicts:
+            model_version_name = simulation_dict["model_version"]
+            group_name = simulation_dict["group"]
+            resolution_name = simulation_dict["resolution"]
+            category_name = simulation_dict["category"]
+            if model_version_name not in versions:
+                v = ModelVersion(model_version_name)
+                versions.update([(model_version_name, v)])
+            else:
+                v = versions[model_version_name]
+            if group_name not in v.groups:
+                g = Group(group_name)
+                v.groups.update([(group_name, g)])
+            else:
+                g = v.groups[group_name]
+            if resolution_name not in g.resolutions:
+                r = Resolution(resolution_name)
+                g.resolutions.update([(resolution_name, r)])
+            else:
+                r = g.resolutions[resolution_name]
+            if category_name not in r.categories:
+                c = Category(category_name)
+                r.categories.update([(category_name, c)])
+            else:
+                c = r.categories[category_name]
+            s = Simulation(simulation_dict)
+            c.simulations.update([(s.simulation_name, s)])
     return versions
 
 # Construct table display of simulations ###########################################
 def pad_cells(cells: List[str], col_divider: str, cell_paddings: List[int]) -> str:
     string = col_divider
     for i in range(len(cells)):
+        if len(cells[i]) > cell_paddings[i]:
+            s = f"WARNING: cell padding={cell_paddings[i]} is insufficient for {cells[i]} of length {len(cells[i])}"
+            raise RuntimeError(s)
         string += " " + cells[i].ljust(cell_paddings[i] + 1) + col_divider
     string += "\n"
     return string
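
Editor's note: the new length check in pad_cells turns a silently misaligned reST grid table into a hard error. A standalone sketch of the same guard, written for illustration (the names mirror generate_tables.py, but this is not the repository code):

    from typing import List

    def pad_cells(cells: List[str], col_divider: str, cell_paddings: List[int]) -> str:
        # Build one grid-table row, left-justifying each cell to its column width.
        row = col_divider
        for cell, width in zip(cells, cell_paddings):
            if len(cell) > width:
                # Fail loudly instead of emitting a malformed table.
                raise RuntimeError(
                    f"WARNING: cell padding={width} is insufficient for {cell} of length {len(cell)}"
                )
            row += " " + cell.ljust(width + 1) + col_divider
        return row + "\n"

    # Example: widths of 85/15/400/130 match the new column settings for the v1 table.
    print(pad_cells(["v1.LR piControl", "52", "ESGF links", "HPSS path"], "|", [85, 15, 400, 130]))
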
@@ -260,7 +276,7 @@ def construct_pages(csv_file: str, model_version: str, group_name: str, include_
         resolutions,
         ["Simulation", "Data Size (TB)", "ESGF Links", "HPSS Path"],
         f"../docs/source/{model_version}/{group_name}/simulation_data/simulation_table.rst",
-        [65, 15, 400, 80]
+        [85, 15, 400, 130]
     )
     if include_reproduction_scripts:
         generate_table(
@@ -277,4 +293,5 @@ def construct_pages(csv_file: str, model_version: str, group_name: str, include_
 if __name__ == "__main__":
     #construct_pages("simulations_v2.csv", "v2", "WaterCycle")
     #construct_pages("simulations_v2_1.csv", "v2.1", "WaterCycle")
-    construct_pages("simulations_v2_1.csv", "v2.1", "BGC")
+    #construct_pages("simulations_v2_1.csv", "v2.1", "BGC")
+    construct_pages("simulations_v1_water_cycle.csv", "v1", "WaterCycle")
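
Editor's note: the switch from the boolean-style cmip_only column to a link_type column is the main behavioral change in generate_tables.py: a simulation row can now request the CMIP link, the native link, or both. A condensed, illustrative sketch of that selection logic (simplified link strings, not the repository code):

    def select_esgf_links(link_type: str, esgf_cmip: str, esgf_native: str) -> str:
        # `link_type` comes from the simulations CSV; rows that still use the old
        # `cmip_only` column are mapped to link_type="cmip" while the CSV is read.
        if link_type == "cmip":
            return esgf_cmip
        if link_type == "native":
            return esgf_native
        # Any other value (including empty) publishes both links, CMIP first.
        return f"{esgf_cmip}, {esgf_native}"

    # Example with placeholder reST links:
    print(select_esgf_links("native", "`CMIP <https://example.invalid/cmip>`_",
                            "`Native <https://example.invalid/native>`_"))

The same commit also lets a CSV row override the default HPSS location with an explicit hpss_path, and read_simulations now validates every row before any Simulation objects are constructed, so a malformed CSV fails early with the offending row in the error message.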
