@forsyth2
Collaborator

Add v2.1 data. Resolves #48.

@forsyth2 forsyth2 self-assigned this Nov 13, 2024
@forsyth2
Collaborator Author

@chengzhuzhang The way I have this set up now:
Screenshot 2024-11-12 at 5 53 20 PM

  • The HPSS path (and consequently the data size) column is empty because /home/projects/e3sm/www/WaterCycle/E3SMv2.1/LR/ doesn't seem to exist on NERSC HPSS. I saw your email about copying data to /home/projects/e3sm/www/WaterCycle/E3SMv2_1/, so I imagine this will work once that data is copied.
  • The ESGF links don't seem to be set up correctly though. They're giving "No Data". E.g., for piControl: https://esgf-node.llnl.gov/search/cmip6/?source_id=E3SM-2.1-0&experiment_id=piControl&variant_label=r1i1p1f1 (CMIP) and https://esgf-node.llnl.gov/search/e3sm/?model_version=2.1_0&experiment=piControl&ensemble_member=ens1 (Native). What needs to be changed in those link paths?
  • I suppressed the creation of a reproduction script table since we only have the original data here.
  • Should we also add the original run scripts to a v2.1 equivalent of https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v2/original? Will those scripts be in the HPSS paths too?

@chengzhuzhang
Collaborator

@forsyth2 I restarted copying the data after Perlmutter came back online. Yes, the data is located under: /home/projects/e3sm/www/WaterCycle/E3SMv2_1/, I will let you know once the data is in place.

The data is published through ANL's ESGF node, so the prefix for the CMIP URLs needs to be https://esgf-node.cels.anl.gov/. The syntax of each URL has also changed. Please find the correct per-simulation URLs. We don't have native data published, so there are no Native URLs in this case.

The original run scripts can be found here: https://github.com/E3SM-Project/SimulationScripts/tree/master/archive/v2_1. Yes, I suppose we don't need a reproduction script for now.

@forsyth2
Collaborator Author

> the data is located under: /home/projects/e3sm/www/WaterCycle/E3SMv2_1/, I will let you know once the data is in place.

I'm not sure how far along this is, but I'm currently getting the following error, so I can't even check its status.

A:/home/f/forsyth-> ls /home/projects/e3sm/www/WaterCycle/E3SMv2   

/home/projects/e3sm/www/WaterCycle/E3SMv2:
LR/     NARRM/  
A:/home/f/forsyth-> ls /home/projects/e3sm/www/WaterCycle/E3SMv2_1 
*** hpss_Opendir: Access denied [-13: HPSS_EACCES] 
    /home/projects/e3sm/www/WaterCycle/E3SMv2_1

/home/projects/e3sm/www/WaterCycle/E3SMv2_1:

@chengzhuzhang
Collaborator

> A:/home/f/forsyth-> ls /home/projects/e3sm/www/WaterCycle/E3SMv2_1
> *** hpss_Opendir: Access denied [-13: HPSS_EACCES]

I guess this is expected. After all simulations are in place, the permission needs to be reset.

@forsyth2
Collaborator Author

forsyth2 commented Nov 15, 2024

@chengzhuzhang Remaining action items after my latest changes, which can be seen at https://portal.nersc.gov/cfs/e3sm/forsyth/data_docs_50/html/v2.1/WaterCycle/simulation_data/simulation_table.html

> After all simulations are in place, the permission needs to be reset.

They still might be processing, but it looks like I still can't access the HPSS path, which is why the Data Size and HPSS Path columns are currently empty.

> Please find the correct per-simulation urls.

Do you have any guidance on finding the URLs? https://esgf-node.cels.anl.gov/search/e3sm/?model_version=2_1 shows a 404 error, as does simply https://esgf-node.cels.anl.gov/search/e3sm.

> The original run scripts can be found here: https://github.com/E3SM-Project/SimulationScripts/tree/master/archive/v2_1.

I've copied just the scripts to run_scripts/v2.1/original. These aren't currently linked anywhere though. We could 1) add the directory link on https://portal.nersc.gov/cfs/e3sm/forsyth/data_docs_50/html/v2.1/WaterCycle/simulation_data/index.html or 2) add the original scripts column to the simulation data table (rather than the reproduction scripts table, which we aren't including here).

@chengzhuzhang
Collaborator

Yes, the data is still in transfer.

Here is the example url for 1pctCO2: https://esgf-node.cels.anl.gov/search/?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%2C%22experiment_id%22%3A%221pctCO2%22%7D
For getting the urls for each simulation:
  1. Go to https://esgf-node.cels.anl.gov/.
  2. Click the box next to CMIP6.
  3. In the facet menu, click Identifiers and select Source ID = E3SM-2-1, which filters the results down to the relevant experiments.
  4. Use Copy Search in the upper right corner to copy the URL.
     See the figure below:
Screenshot 2024-11-15 at 11 33 35 AM
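
When a copied search URL does not return what you expect (for example, when a variant label is not filtering correctly), it can help to decode the `activeFacets` parameter and see exactly which facets the link carries. The snippet below is a minimal sketch using only the Python standard library; the helper name `facets_from_search_url` is made up for illustration, and it assumes the copied URL stores the facets as percent-encoded JSON, as the 1pctCO2 example above does.

```python
import json
from urllib.parse import parse_qs, urlsplit


def facets_from_search_url(url: str) -> dict:
    """Return the facet dictionary encoded in a search URL (hypothetical helper)."""
    # parse_qs percent-decodes the query string values for us.
    query = parse_qs(urlsplit(url).query)
    # Assumption: facets live in "activeFacets" as a JSON object.
    raw = query.get("activeFacets", ["{}"])[0]
    return json.loads(raw)


# The 1pctCO2 example URL quoted in this thread.
EXAMPLE_URL = (
    "https://esgf-node.cels.anl.gov/search/?project=CMIP6"
    "&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%2C"
    "%22experiment_id%22%3A%221pctCO2%22%7D"
)

facets = facets_from_search_url(EXAMPLE_URL)
print(facets)
```

Comparing the decoded dictionary of a working link with that of a failing one should show which facet key or value differs.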

For simulation script, either way is fine.

@forsyth2
Collaborator Author

> For getting the urls for each simulation:

Thanks, the screenshot with directions was very helpful. I'm still having trouble getting the variant labels to work out correctly, but overall it's looking better. I'll keep trying on that.

> For simulation script, either way is fine.

Great, I just added a line of text pointing to the directory.

Collaborator Author

@forsyth2 forsyth2 left a comment


The E3SMv2.1 simulation data is available on **ESGF** and **NERSC HPSS**.

The preferred retrieval method is **ESGF**. Native output is available at `ESGF <https://esgf-node.cels.anl.gov/search/e3sm/?model_version=2_1>`_, and a subset of the data is reformatted to conform to CMIP conventions and submitted to the CMIP6 ESGF archive (ESGF links are provided in the table below for published data).
Collaborator Author

Checked link works ✅

Collaborator Author

Actually it's https://esgf-node.cels.anl.gov/search?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%7D that works; that edit didn't seem to make it into the last commit.


**v2_1.LR** simulations data has been archived on NERSC HPSS under: ::

/home/projects/e3sm/www/WaterCycle/E3SMv2_1
Collaborator Author

Checked HPSS path ✅

/home/projects/e3sm/www/WaterCycle/E3SMv2_1


Original run scripts can be found `here <https://github.com/E3SM-Project/e3sm_data_docs/tree/main/run_scripts/v2.1/original>`_.
Collaborator Author

This link will exist after this PR is merged.

@@ -0,0 +1,10 @@
model_version, group, resolution, category, simulation_name, machine, checksum, experiment, ensemble_num, cmip_only, node,
Collaborator Author

All the v2.1 simulations are defined here.
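
For readers unfamiliar with the layout, a header like the one above (spaces after each comma, plus a trailing comma) can be read with the standard `csv` module. This is only a sketch under stated assumptions: the function name `read_simulations` is hypothetical, and every value in the sample row is a placeholder rather than real simulation metadata.

```python
import csv
import io

# Header copied from the diff above; the data row is a made-up placeholder.
SAMPLE = (
    "model_version, group, resolution, category, simulation_name, machine, "
    "checksum, experiment, ensemble_num, cmip_only, node,\n"
    "v2_1, ExampleGroup, LR, ExampleCategory, example_simulation, "
    "example_machine, , piControl, 1, , cels.anl,\n"
)


def read_simulations(text: str) -> list:
    """Parse simulation rows into dictionaries keyed by column name."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        # The trailing comma creates an unnamed final column; discard it.
        row.pop("", None)
        rows.append({key: (value or "").strip() for key, value in row.items()})
    return rows


rows = read_simulations(SAMPLE)
print(rows[0]["experiment"], rows[0]["ensemble_num"], rows[0]["node"])
```

Empty fields such as `checksum` and `cmip_only` come back as empty strings, which makes a truthiness check like `if experiment and ensemble_num:` straightforward.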

if experiment and ensemble_num:
def get_esgf(source_id: str, model_version: str, experiment: str, ensemble_num: str, cmip_only: str, node: str) -> str:
    if node == "cels.anl":
        esgf = f"`CMIP <https://esgf-node.{node}.gov/search/?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22{source_id}%22%2C%22experiment_id%22%3A%22{experiment}%22%2C%22variant_label%22%3A%22r{ensemble_num}i1p1f1%22%7D>`_"
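
The hand-encoded f-string above is easy to get subtly wrong. As an alternative sketch (not the code merged in this PR), the same link can be produced by serializing a facet dictionary and percent-encoding it. The function name `build_cmip_link` is hypothetical, and the fixed `i1p1f1` suffix is carried over from the fragment above as an assumption.

```python
import json
from urllib.parse import quote


def build_cmip_link(node: str, source_id: str, experiment: str, ensemble_num: str) -> str:
    """Return an RST hyperlink to a CMIP6 search filtered by the given facets."""
    facets = {
        "source_id": source_id,
        "experiment_id": experiment,
        # Assumption: only the realization index varies between ensemble members.
        "variant_label": f"r{ensemble_num}i1p1f1",
    }
    # Compact separators reproduce the spacing-free JSON seen in copied URLs.
    encoded = quote(json.dumps(facets, separators=(",", ":")), safe="")
    url = f"https://esgf-node.{node}.gov/search/?project=CMIP6&activeFacets={encoded}"
    return f"`CMIP <{url}>`_"


link = build_cmip_link("cels.anl", "E3SM-2-1", "piControl", "1")
print(link)
```

Keeping the facets in a dictionary makes it easier to add or drop a facet such as `variant_label` without editing an encoded string by hand.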
@forsyth2 forsyth2 marked this pull request as ready for review November 18, 2024 21:05
* **v2_1.LR** (lower resolution)

If you use data from this simulation campaign, please cite the relevant overview
manuscripts.
Collaborator

@chengzhuzhang chengzhuzhang Nov 18, 2024

Please cite Smith et al. 2024, GMD as the overview reference paper
https://gmd.copernicus.org/preprints/gmd-2024-149/

E3SMv2.1 (Water Cycle)
======================

The simulation campaign for E3SMv2 (Water Cycle) was performed with two
Collaborator

This needs to be changed to: "The simulation campaign for E3SMv2.1 was performed with the Low Resolution (LR) configuration."


or ::

zstash extract --hpss=globus://9cd89cfd-6d04-11e5-ba46-22000b92c6ec/<HPSS path below>
Collaborator

@chengzhuzhang chengzhuzhang Nov 18, 2024

Same as above

2. To retrieve files with zstash command:
::

zstash extract --hpss=/home/projects/e3sm/www/WaterCycle/E3SMv2_1/LR/v2.LR.piControl "*.elm.h0.00[3-4]?-??.nc"
Collaborator

The path is not correct. In addition, there is a typo inherited from v2: elm should be eam.
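
To see why the component name in the pattern matters, the glob can be tried against a few file names. The sketch below assumes the pattern follows shell-style glob semantics comparable to Python's `fnmatch`; whether zstash matches files in exactly this way is an assumption, and the file names are invented for illustration.

```python
from fnmatch import fnmatchcase

# Pattern from the documentation under review, and the variant the reviewer asks for.
PATTERN_ELM = "*.elm.h0.00[3-4]?-??.nc"
PATTERN_EAM = "*.eam.h0.00[3-4]?-??.nc"

# Hypothetical archive members, named only to exercise the patterns.
names = [
    "run.eam.h0.0035-06.nc",  # atmosphere file, year 35
    "run.elm.h0.0035-06.nc",  # land file, year 35
    "run.eam.h0.0051-01.nc",  # atmosphere file, year 51 (outside 0030-0049)
]

matches_elm = [name for name in names if fnmatchcase(name, PATTERN_ELM)]
matches_eam = [name for name in names if fnmatchcase(name, PATTERN_EAM)]

print("elm pattern:", matches_elm)
print("eam pattern:", matches_eam)
```

The `00[3-4]?` part restricts the year to 0030 through 0049, so the two patterns select the same years but different components.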

Collaborator

@chengzhuzhang chengzhuzhang left a comment

Hey @forsyth2
Thank you for the PR, please see my requested changes.
The zstash instructions are mostly copied from v2 and modified to use the v2.1 data path. I think it is okay to keep this as is, but we can reorganize at a later time so that we don't need to repeat them each time a new simulation (e.g., v3) is added.

@chengzhuzhang
Collaborator

Thank you for addressing the comments. The pages look nice!

@forsyth2 forsyth2 merged commit 40f78a2 into main Nov 19, 2024
1 check passed
@forsyth2 forsyth2 deleted the issue-48-v2-1-data branch November 19, 2024 17:08

The E3SMv2.1 simulation data is available on **ESGF** and **NERSC HPSS**.

The preferred retrieval method is **ESGF**. Native output is available at `ESGF <https://esgf-node.cels.anl.gov/search?project=CMIP6&activeFacets=%7B%22source_id%22%3A%22E3SM-2-1%22%7D>`_, and a subset of the data is reformatted to conform to CMIP conventions and submitted to the CMIP6 ESGF archive (ESGF links are provided in the table below for published data).
Collaborator

@forsyth2
Sorry that I didn't catch this problem earlier; I found it when reviewing the newsletter article. We did not publish v2.1 native data to ESGF. Could you update this to:
The data is reformatted to conform to CMIP conventions and submitted to the CMIP6 ESGF archive (ESGF links are provided in the table below for published data).

@forsyth2 forsyth2 mentioned this pull request Apr 16, 2025
Development

Successfully merging this pull request may close these issues.

Add v2.1 simulation info to simulaiton table