Merged
31 commits
8fe0453
switch hera to ursa
gspetro-NOAA Feb 6, 2026
f782f24
Merge branch 'develop' of https://github.com/ufs-community/land-DA_wo…
gspetro-NOAA Feb 11, 2026
56bcd4a
update Introduction
gspetro-NOAA Feb 12, 2026
7a38f96
change LANDDAROOT to BASEDIR
gspetro-NOAA Feb 12, 2026
5fcc158
update stack/jedi dependencies in tech overview
gspetro-NOAA Feb 12, 2026
0208421
general build/run updates
gspetro-NOAA Feb 12, 2026
cd0b1b0
add draft JCB info
gspetro-NOAA Feb 12, 2026
3a81b6d
add glossary terms
gspetro-NOAA Feb 12, 2026
2b7fa02
Tech Overview edits
gspetro-NOAA Feb 12, 2026
ac73ab5
update build/run sections
gspetro-NOAA Feb 16, 2026
e51a030
update background info
gspetro-NOAA Feb 16, 2026
03782a4
testing ch updates
gspetro-NOAA Feb 16, 2026
0c396e1
reorganize JCB info
gspetro-NOAA Feb 17, 2026
9ae8140
add curly braces around BASEDIR
gspetro-NOAA Feb 17, 2026
abd2169
doc bugfix
gspetro-NOAA Feb 20, 2026
5358e45
Merge branch 'ufs-community:develop' into text/cm-1964
gspetro-NOAA Feb 20, 2026
ef68f13
Rm JCB from chapter list
gspetro-NOAA Feb 23, 2026
59c8917
Rm comment re: failing WE2Es
gspetro-NOAA Feb 23, 2026
72ffd27
fix typo
gspetro-NOAA Feb 23, 2026
c63ed66
Merge branch 'ufs-community:develop' into text/cm-1964
gspetro-NOAA Feb 23, 2026
790cfb2
update Customizing the Wflow ch
gspetro-NOAA Feb 25, 2026
6c3ac58
add vars to CtW ch
gspetro-NOAA Feb 25, 2026
e1bc20d
add/update config definitions
gspetro-NOAA Mar 3, 2026
faf6fbc
update Config Wflow ch
gspetro-NOAA Mar 5, 2026
2dd4159
update Config Wflow ch
gspetro-NOAA Mar 5, 2026
55fa2b5
revamp of IO chapter
gspetro-NOAA Mar 16, 2026
6f8350c
minor updates throught docs
gspetro-NOAA Mar 16, 2026
4da1282
Merge branch 'ufs-community:develop' into text/cm-1964
gspetro-NOAA Mar 16, 2026
e6e9bc4
update jedi/jcb yamls in DASystem ch
gspetro-NOAA Mar 17, 2026
15e9833
DA System & Section 4 updates
gspetro-NOAA Mar 18, 2026
1ab0beb
minor I/O & testing updates
gspetro-NOAA Mar 18, 2026
2 changes: 1 addition & 1 deletion doc/source/BackgroundInfo/Introduction.rst
@@ -133,6 +133,6 @@ bureau, shall not be used in any manner to imply endorsement of any
commercial product or activity by DOC or the United States Government.

References
*************
============

.. bibliography:: ../references.bib
24 changes: 1 addition & 23 deletions doc/source/BuildingRunningTesting/TestingLandDA.rst
@@ -4,7 +4,7 @@
Testing the Land DA Workflow
************************************

This chapter provides instructions for using the Land DA CTest suite. These steps are designed for use on :ref:`Level 1 <LevelsOfSupport>` systems (e.g., Ursa and Hercules) and may require significant changes on other systems.
This chapter provides instructions for using the Land DA CTest suite. These steps are designed for use on :ref:`Level 1 <LevelsOfSupport>` systems (e.g., Ursa and Hercules) and may require significant changes on other systems. They cannot be run via container at this time.

.. attention::

@@ -85,25 +85,3 @@ The bottom of the ``out.ctest`` file will include a message with test results. F
Total Test time (real) = 66.90 sec

If one or more tests fail, users can check the logs at ``${BASEDIR}/land-DA_workflow/sorc/build/Testing/Temporary/LastTest.log`` for more information on the failure.

Running Tests Using a Container
=================================

.. COMMENT: Update this container section for the release

.. attention::

The container CTest functionality has been tested in Jenkins. It should be able to run on a sufficiently large cloud instance. However, it is considered unsupported functionality because it has not been thoroughly tested on the cloud for use by the public.

For containers, the CTest functionality is wrapped in a Dockerfile. Therefore, users will need to build the Dockerfile to run the CTests. Since the Land DA container is quite large, this process can take a long time --- potentially hours. In the future, the development team hopes to simplify and shorten this process.

.. code-block:: console

git clone -b release/public-v2.0.0 --recursive https://github.com/ufs-community/land-DA_workflow.git
cd land-DA_workflow/sorc/test/ci
sudo systemctl start docker
sudo docker build -f Dockerfile -t dockerfile-ci-ctest:release .

.. note::

``sudo`` may not be required in front of the last two commands on all systems.
451 changes: 272 additions & 179 deletions doc/source/CustomizingTheWorkflow/ConfigWorkflow.rst

Large diffs are not rendered by default.

240 changes: 151 additions & 89 deletions doc/source/CustomizingTheWorkflow/DASystem.rst

Large diffs are not rendered by default.

543 changes: 301 additions & 242 deletions doc/source/CustomizingTheWorkflow/InputOutput.rst

Large diffs are not rendered by default.

34 changes: 16 additions & 18 deletions doc/source/Reference/FAQ.rst
@@ -13,17 +13,16 @@ Frequently Asked Questions (FAQ)
My tasks went DEAD. Why might this be?
========================================

The most common reason for the first few tasks to go DEAD is an improper path in the ``parm_xml.yaml`` configuration file.
In particular, ``exp_basedir`` must be set to the directory above ``land-DA_workflow``. For example, if ``land-DA_workflow`` resides at ``Users/Jane.Doe/landda/land-DA_workflow``, then ``exp_basedir`` must be set to ``Users/Jane.Doe/landda``. After correcting ``parm_xml.yaml``, users will need to regenerate the workflow XML by running:
The most common reason for the first few tasks to go DEAD is an improper path in the ``config.yaml`` file that gets propagated to the ``land_analysis.xml`` Rocoto XML file.
In particular, ``exp_basedir`` must be set to the directory above ``land-DA_workflow``. For example, if ``land-DA_workflow`` resides at ``Users/Jane.Doe/landda/land-DA_workflow``, then ``exp_basedir`` must be set to ``Users/Jane.Doe/landda``. After correcting ``config.yaml``, users will need to regenerate the workflow XML by running:

.. code-block:: console

uw template render --input-file templates/template.land_analysis.yaml --values-file parm_xml.yaml --output-file land_analysis.yaml
uw rocoto realize --input-file land_analysis.yaml --output-file land_analysis.xml
./setup_wflow_env.py -p=<platform>

Then, rewind the DEAD tasks as described :ref:`below <RestartTask>` using ``rocotorewind``, and use ``rocotorun``/``rocotostat`` to advance/check on the workflow (see :numref:`Section %s <automated-run>` for how to do this).

If the first few tasks run successfully, but future tasks go DEAD, users will need to check the experiment log files, located at ``$EXP_BASEDIR/ptmp/test/com/output/logs``. It may also be useful to check that the JEDI directory and other paths and values are correct in ``parm_xml.yaml``.
If the first few tasks run successfully, but future tasks go DEAD, users will need to check the experiment log files, located at ``$EXP_BASEDIR/ptmp/<envir>/com/output/logs``. It may also be useful to check that the JEDI directory and other paths and values are correct in ``config.yaml``.
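The ``exp_basedir`` rule described above can be checked with a few lines of Python. This is a minimal sketch, not part of the Land DA workflow: the function name ``expected_exp_basedir`` is hypothetical, and the only assumption is the one stated above, namely that ``exp_basedir`` must be the directory directly above ``land-DA_workflow``.

```python
from pathlib import PurePosixPath


def expected_exp_basedir(workflow_dir: str) -> str:
    """Return the directory directly above land-DA_workflow.

    Hypothetical helper for illustration only; it is not part of the workflow.
    """
    path = PurePosixPath(workflow_dir)
    if path.name != "land-DA_workflow":
        raise ValueError(
            f"expected a path ending in land-DA_workflow, got {workflow_dir!r}"
        )
    return str(path.parent)


if __name__ == "__main__":
    # Example path taken from the FAQ text above.
    print(expected_exp_basedir("Users/Jane.Doe/landda/land-DA_workflow"))
```

Comparing the returned value with the ``exp_basedir`` entry in ``config.yaml`` is a quick way to rule out the most common cause of early DEAD tasks before regenerating the workflow XML.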


.. _RestartTask:
@@ -37,22 +36,21 @@ On platforms that utilize Rocoto workflow software (including Ursa and Hercules)

$ rocotostat -w land_analysis.xml -d land_analysis.db

CYCLE TASK JOBID STATE EXIT STATUS TRIES DURATION
=======================================================================================
200001030000 prep_obs 61746034 SUCCEEDED 0 1 11.0
200001030000 pre_anal 61746035 SUCCEEDED 0 1 13.0
200001030000 analysis 61746081 SUCCEEDED 0 1 76.0
200001030000 post_anal 61746109 SUCCEEDED 0 1 4.0
200001030000 plot_stats 61746110 SUCCEEDED 0 1 70.0
200001030000 forecast 61746128 DEAD 256 1 -
200001030000 plot_stats - - - - -
CYCLE TASK JOBID STATE EXIT STATUS TRIES DURATION
==============================================================================================
202501190000 jcb 8215490 SUCCEEDED 0 1 6.0
202501190000 prep_data 8215491 SUCCEEDED 0 1 21.0
202501190000 pre_anal 8215492 SUCCEEDED 0 1 6.0
202501190000 analysis 8215496 SUCCEEDED 0 1 152.0
202501190000 post_anal 8215519 SUCCEEDED 0 1 23.0
202501190000 forecast 8215551 DEAD 256 1 -
202501190000 plot_stats - - - - -


This means that the DEAD task has not completed successfully, so the workflow has stopped. Once the issue has been identified and fixed (e.g., by referencing the log files in ``$BASEDIR/ptmp/test/com/output/logs``), users can rewind, or "undo," the failed task using the ``rocotorewind`` command:
This means that the DEAD task has not completed successfully, so the workflow has stopped. Once the issue has been identified and fixed (e.g., by referencing the log files in ``${BASEDIR}/ptmp/<envir>/com/output/logs``), users can rewind, or "undo," the failed task using the ``rocotorewind`` command:

.. code-block:: console

rocotorewind -w land_analysis.xml -d land_analysis.db -v 10 -c 200001030000 -t forecast
rocotorewind -w land_analysis.xml -d land_analysis.db -v 10 -c 202501190000 -t forecast

where ``-c`` specifies the cycle date (first column of ``rocotostat`` output) and ``-t`` represents the task name
(second column of ``rocotostat`` output). This will set the number of tries to 0, as though the task has not been run. After using ``rocotorewind``, the next time ``rocotorun`` is used to advance the workflow, the job will be resubmitted.
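When several tasks are DEAD, the cycle and task names can be pulled from saved ``rocotostat`` output rather than copied by hand. The sketch below is an illustration only: the helper names ``dead_tasks`` and ``rewind_commands`` are hypothetical, it assumes the whitespace-separated column layout shown above (CYCLE, TASK, JOBID, STATE, ...), and it only builds command strings using the ``rocotorewind`` options that appear in this FAQ. It does not run Rocoto.

```python
import shlex
from typing import List, Tuple


def dead_tasks(rocotostat_output: str) -> List[Tuple[str, str]]:
    """Return (cycle, task) pairs whose STATE column is DEAD."""
    pairs = []
    for line in rocotostat_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric cycle and have at least the
        # CYCLE, TASK, JOBID, and STATE columns; headers are skipped.
        if len(fields) >= 4 and fields[0].isdigit() and fields[3] == "DEAD":
            pairs.append((fields[0], fields[1]))
    return pairs


def rewind_commands(
    pairs: List[Tuple[str, str]],
    workflow: str = "land_analysis.xml",
    database: str = "land_analysis.db",
) -> List[str]:
    """Build one rocotorewind command line per DEAD task."""
    return [
        shlex.join(
            ["rocotorewind", "-w", workflow, "-d", database,
             "-v", "10", "-c", cycle, "-t", task]
        )
        for cycle, task in pairs
    ]


if __name__ == "__main__":
    sample = """\
202501190000 post_anal  8215519 SUCCEEDED 0   1 23.0
202501190000 forecast   8215551 DEAD      256 1 -
202501190000 plot_stats -       -         -   - -
"""
    for command in rewind_commands(dead_tasks(sample)):
        print(command)
```

Printing the commands instead of executing them leaves the decision to rewind with the user, who should fix the underlying issue first.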
@@ -70,7 +68,7 @@ Workload managers such as Slurm typically have a setting that indicates the mini

On a normal cluster, users can modify the ``slurm.conf`` file, which is often (but not always) found at ``/etc/slurm/slurm.conf``. Then, run ``scontrol reconfigure`` to tell Slurm to have all daemons reload the ``slurm.conf`` file. However, each node will have its own local ``slurm.conf``. The copy on the controller node (where the ``slurmctld`` daemon is running) actually handles the job accounting, so it may be possible to modify only that one.

When working on the cloud, especially in AWS ParallelCluster, it is also important to note whether job accounting is turned on such that the historical database is tracking it. Run ``sacct`` and view the console output. A message stating that "Slurm accounting storage is disabled" indicates that it is not turned on. If it is turned on, Rocoto should be able to find the job with ``sacct`` even if the ``MinJobAge`` time has expired. ``MinJobAge`` is how long it takes for job status to age off of squeue, which relies on a different tracking mechanism. ``sacct`` looks at a database that stores everything that ever ran; however, it is incredibly slow, which it is only used if truly necessary.
When working on the cloud, especially in AWS ParallelCluster, it is also important to note whether job accounting is turned on such that the historical database is tracking it. Run ``sacct`` and view the console output. A message stating that "Slurm accounting storage is disabled" indicates that it is not turned on. If it is turned on, Rocoto should be able to find the job with ``sacct`` even if the ``MinJobAge`` time has expired. ``MinJobAge`` is how long it takes for job status to age off of squeue, which relies on a different tracking mechanism. ``sacct`` looks at a database that stores everything that ever ran; however, it is incredibly slow, which is why it is only used if truly necessary.
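Before editing ``slurm.conf``, it can help to see whether ``MinJobAge`` is set explicitly and to what value. The following is a rough sketch under stated assumptions: the helper name ``min_job_age_seconds`` is hypothetical, the file is assumed to hold one ``Key=Value`` setting per line with ``#`` starting a comment, and the value is read as a number of seconds. It takes the file contents as text, so the path to ``slurm.conf`` is left to the user.

```python
from typing import Optional


def min_job_age_seconds(conf_text: str) -> Optional[int]:
    """Return the MinJobAge value found in slurm.conf-style text, or None.

    Hypothetical helper; assumes one Key=Value setting per line.
    """
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        # Compare keys case-insensitively; a later setting overrides an earlier one.
        if key.strip().lower() == "minjobage":
            value = int(val.strip())
    return value


if __name__ == "__main__":
    example = "# scheduler settings\nMinJobAge=3600  # illustrative value only\n"
    print(min_job_age_seconds(example))
```

A result of ``None`` means the setting is not present in the text, in which case Slurm uses its built-in default.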


My forecast task goes DEAD or UNAVAILABLE, and the log file indicates an issue with the ``SINGULARITYENV_FI_PROVIDER`` and ``SINGULARITYENV_PREPEND_PATH`` variables. Why?
21 changes: 10 additions & 11 deletions doc/source/Reference/Glossary.rst
@@ -52,7 +52,7 @@ Glossary

DA increment
Analysis increment
A DA increment, or analysis increment, is the difference between a "first guess" of the state of the system (usually a previous model forecast) and the "best guess" of what the actual initial state of the system is (the analysis produced by the DA system). When introducing a new starting state for the model integration, care must be taken to ensure that the initial state is balanced and realistic according to the model equations, otherwise the forecast may be poor or even unstable. The Land DA methods (3D-Var and LETKF-OI) achieve this by minimizing a cost function that accounts for the model state, observations, and the error characteristics present in each.
A DA increment, or analysis increment, is the difference between a "first guess" of the state of the system (usually a previous model forecast) and the "best guess" of what the actual initial state of the system is (the analysis produced by the DA system). When introducing a new starting state for the model integration, care must be taken to ensure that the initial state is balanced and realistic according to the model equations, otherwise the forecast may be poor or even unstable. The Land DA 3D-Var implementation achieves this by minimizing a cost function that accounts for the model state, observations, and the error characteristics present in each. The Land DA LETKF-OI implementation combines the state-dependent background error derived from an ensemble forecast with the observations and their corresponding uncertainties to produce an analysis ensemble (:cite:t:`HuntEtAl2007`, 2007).
Refer to the linked articles for more information on `3D-Var <https://www.ecmwf.int/sites/default/files/elibrary/2003/76079-variational-data-assimiltion-theory-and-overview_0.pdf>`_ and `LETKF-OI <https://doi.org/10.1016/j.physd.2006.11.008>`_ respectively.

DATM
@@ -72,7 +72,7 @@ Glossary
`Earth System Modeling Framework <https://earthsystemmodeling.org/docs/release/latest/ESMF_usrdoc/>`_. The ESMF defines itself as "a suite of software tools for developing high-performance, multi-component Earth science modeling applications." It is a community-developed software infrastructure for building and coupling models.

ex-scripts
Scripting layer (contained in ``land-DA_workflow/jobs/``) that should be called by a :term:`J-job <J-jobs>` for each workflow component to run a specific task or sub-task in the workflow. The different scripting layers are described in detail in the :nco:`NCO Implementation Standards document <ImplementationStandards.v11.0.0.pdf>`.
Scripting layer (contained in ``land-DA_workflow/scripts/``) that should be called by a :term:`J-job <J-jobs>` for each workflow component to run a specific task or sub-task in the workflow. The different scripting layers are described in detail in the :nco:`NCO Implementation Standards document <ImplementationStandards.v11.0.0.pdf>`.

FMS
The Flexible Modeling System (`FMS <https://www.gfdl.noaa.gov/fms/>`_) is a software framework for supporting the efficient
@@ -98,11 +98,14 @@ Glossary
The Global Historical Climatology Network (`GHCN <https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily>`_) is "an integrated database of daily climate summaries from land surface stations across the globe."

GSWP3
The Global Soil Wetness Project Phase 3 dataset is a century-long comprehensive set of data documenting several variables for hydro-energy-eco systems.
The `Global Soil Wetness Project Phase 3 <https://www.isimip.org/gettingstarted/input-data-bias-adjustment/details/4/>`_ dataset is a century-long comprehensive set of data documenting several variables for hydro-energy-eco systems.

HPC
High-Performance Computing.

ICs
Initial conditions.

IMS
The `Interactive Multisensor Snow and Ice Mapping System <https://usicecenter.gov/Products/ImsHome>`_ (IMS) is "an operational software package used to demarcate the presence of snow and ice across the entire northern hemisphere."

@@ -114,7 +117,7 @@ Glossary

JCB
JEDI Configuration Builder
The JEDI Configuration Builder (JCB) is a python package used to assemble information on :term:`JEDI` algorithms (e.g., letkf-oi, 3dvar) and data assimilation types (e.g., snow, marine, atmosphere) into one convenient YAML file for use in data assimilation applications.
The JEDI Configuration Builder (JCB) is a Python package used to assemble information on :term:`JEDI` algorithms (e.g., letkf-oi, 3dvar) and data assimilation types (e.g., snow, land, marine, atmosphere) into one convenient YAML file for use in data assimilation applications.

JEDI
The Joint Effort for Data assimilation Integration (`JEDI <https://www.jcsda.org/jcsda-project-jedi>`_) is a unified and versatile data assimilation (DA) system for Earth System Prediction. It aims to enable efficient research and accelerated transition from research to operations by providing a framework that takes into account all components of the Earth system in a consistent manner. The JEDI software package can run on a variety of platforms and for a variety of purposes, and it is designed to readily accommodate new atmospheric and oceanic models and new observation systems. The `JEDI User's Guide <https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/>`_ contains extensive information on the software.
@@ -125,7 +128,7 @@ Glossary
:term:`JCSDA`'s `jedi-bundle <https://github.com/JCSDA/jedi-bundle>`_ repository provides an integrated Earth System data assimilation capability. It combines a variety of :term:`JEDI` components, including :term:`OOPS`, :term:`IODA`, and :term:`UFO`.

LND
The LND experiment configuration uses the :term:`land component` with the :term:`DATM` component.
The LND experiment configuration uses the Noah-MP :term:`land component` with the :term:`DATM` component.

land component
The Noah Multi-Physics (Noah-MP) land surface model (LSM) is an open-source, community-developed LSM that has been incorporated into the UFS Weather Model (WM). It is the UFS WM's land component.
@@ -161,7 +164,7 @@ Glossary

NUOPC
National Unified Operational Prediction Capability
The `National Unified Operational Prediction Capability <https://earthsystemmodeling.org/nuopc/>`_ is a consortium of Navy, NOAA, and Air Force modelers and their research partners. It aims to advance the weather modeling systems used by meteorologists, mission planners, and decision makers. NUOPC partners are working toward a common model architecture --- a standard way of building models --- in order to make it easier to collaboratively build modeling systems.
The `National Unified Operational Prediction Capability <https://earthsystemmodeling.org/nuopc/>`_ (NUOPC) is a consortium of Navy, NOAA, and Air Force modelers and their research partners. It aims to advance the weather modeling systems used by meteorologists, mission planners, and decision makers. NUOPC partners are working toward a common model architecture --- a standard way of building models --- in order to make it easier to collaboratively build modeling systems.

Noah-MP

@@ -187,10 +190,6 @@ Glossary
SFCSNO
Global Telecommunication System data available from :term:`GDAS`/:term:`GFS`.

Skylab
`JEDI Skylab <https://www.jcsda.org/jediskylab>`_ is the name for roll-up releases of :term:`JCSDA`'s `jedi-bundle <https://github.com/JCSDA/jedi-bundle>`_ repository.
This software provides an integrated Earth System Data Assimilation capability. JCSDA has tested Skylab capabilities internally via the SkyLab testbed for the following components: atmosphere, land/snow, ocean, sea-ice, aerosols, and atmospheric composition. However, JCSDA plans to stop releasing ``jedi-bundle`` and instead encourage users and developers to move to the ``develop`` branch, which will contain the latest updates.

SMAP
`Soil Moisture Active Passive Data (SMAP) <https://nsidc.org/data/smap/data>`_

@@ -221,4 +220,4 @@ Glossary

Weather Model
WM
A prognostic model that can be used for short- and medium-range research and operational forecasts. It can be an atmosphere-only model or an atmospheric model coupled with one or more additional components, such as a wave or ocean model. The SRW App uses the `UFS Weather Model <https://github.com/ufs-community/ufs-weather-model/wiki>`_.
A prognostic model that can be used for short- and medium-range research and operational forecasts. It can be run as an atmosphere-only model or as an atmospheric model coupled with one or more additional components, such as a wave or ocean model. The Land DA System uses the `UFS Weather Model <https://github.com/ufs-community/ufs-weather-model/wiki>`_.