
Commit 6475ff2

Ai training datasets docs (#68)
* add doc for AI training data
* add details on data description
* refinement
* address review
1 parent 688878a commit 6475ff2

4 files changed: 58 additions, 0 deletions

docs/source/AITraining/index.rst

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+AI Training Datasets
+====================
+
+The E3SM project and `Allen Institute for AI (Ai2) <https://allenai.org/>`_ have developed several datasets for AI and machine learning applications. These datasets have been postprocessed for ingestion by the `ACE <https://github.com/ai2cm/ace?tab=readme-ov-file#ai2-climate-emulator>`_/`FourCastNet <https://github.com/NVlabs/FourCastNet>`_ emulators.
+
+Dataset Details
+***************
+
+- **EAMv2**: 73-year EAMv2 simulation (F2010, perpetual 2010 forcing, repeating annual SST cycle from the 2005-2014 average). 6-hourly outputs. For more details, see `Duncan et al. 2024 <https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2024JH000136>`_
+
+- **EAMv3**: 51-year EAMv3 AMIP-style simulation (1970-2020, F2010 with AMIP SSTs, constant 2010 CO2). Includes multiple ENSO cycles and a global warming trend. For more details, see `Wu et al. 2025 <https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2025JH000774>`_
+
+- **E3SMv3**: Coupled pre-industrial and historical training data (coming soon)
+
+- **SCREAMv1**: Simple Cloud-Resolving E3SM Atmosphere Model version 1 training data (coming soon)
+
+.. tip::
+   Check the ``archive_contents`` text file to see the files included in each tar archive. You can selectively download the files you need.
+
+Data Access
+***********
+
+.. toctree::
+   :maxdepth: 2
+
+   simulation_data/simulation_table
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+***************
+Simulation Data
+***************
+
+.. toctree::
+   :maxdepth: 2
+
+   simulation_table
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+********************************************
+AI Training Datasets simulation table
+********************************************
+
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| Dataset | Status | Data Size | HPSS Path | HPSS URL |
++===================================================================+=================+===========================================================================+===============================================================================+=====================================================================================================================+
+| **EAMv2** | | | | |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| EAMv2 AI Training Dataset | Available | 1.2T | /home/projects/e3sm/www/AI_training_data/e3sm-v2-climsst-180x360-gaussian | `Link <https://portal.nersc.gov/archive/home/projects/e3sm/www/AI_training_data/e3sm-v2-climsst-180x360-gaussian>`_ |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| **EAMv3** | | | | |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| EAMv3 AI Training Dataset | Available | 1.3T | /home/projects/e3sm/www/AI_training_data/e3sm-v3-amip-180x360-gaussian | `Link <https://portal.nersc.gov/archive/home/projects/e3sm/www/AI_training_data/e3sm-v3-amip-180x360-gaussian>`_ |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| **E3SMv3** | | | | |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| E3SMv3 Coupled AI Training Dataset | Coming Soon | TBD | TBD | TBD |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| **SCREAMv1** | | | | |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
+| SCREAMv1 AI Training Dataset | Coming Soon | TBD | TBD | TBD |
++-------------------------------------------------------------------+-----------------+---------------------------------------------------------------------------+-------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------+
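
In the two available rows of the table above, the HPSS URL is the HPSS path appended to ``https://portal.nersc.gov/archive``. The following is a minimal Python sketch of that mapping, together with a simple filter for an ``archive_contents``-style listing. The function names are hypothetical, the URL pattern is inferred only from those two rows, and the listing is assumed to be plain text with one entry per line, which the source does not confirm.

```python
from __future__ import annotations

# Root inferred from the two "Available" rows of the table; an assumption,
# not a documented NERSC API.
PORTAL_ARCHIVE_ROOT = "https://portal.nersc.gov/archive"


def hpss_path_to_url(hpss_path: str) -> str:
    """Map an absolute HPSS path to a portal URL using the inferred pattern."""
    if not hpss_path.startswith("/"):
        raise ValueError("expected an absolute HPSS path starting with '/'")
    return PORTAL_ARCHIVE_ROOT + hpss_path


def select_entries(listing_text: str, keyword: str) -> list[str]:
    """Return the non-empty lines of a listing that contain `keyword`.

    Assumes one entry per line; the real archive_contents layout may differ.
    """
    return [
        line.strip()
        for line in listing_text.splitlines()
        if line.strip() and keyword in line
    ]


if __name__ == "__main__":
    path = "/home/projects/e3sm/www/AI_training_data/e3sm-v3-amip-180x360-gaussian"
    print(hpss_path_to_url(path))
```

A filter like ``select_entries`` could be applied to the downloaded ``archive_contents`` text to decide which tar archives to fetch before transferring any large files.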

docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ simulations.
    v3/index
    SCREAMv0/index
    SCREAMv1/index
+   AITraining/index
