Commit beda7d3

Paper edits
Add JOSS paper draft
1 parent d436e28 commit beda7d3

File tree

9 files changed

+394
-0
lines changed

.github/workflows/draft-pdf.yml

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
name: Draft PDF
on: [push]

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper/paper.md
      - name: Upload
        uses: actions/upload-artifact@v4
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper/paper.pdf

paper/Stage1_7.png

68.4 KB

paper/Stage1_8.png

90 KB

paper/TopoEditor.png

304 KB

paper/cgraph.png

138 KB

paper/paper.bib

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
@inproceedings{de2008z3,
  title={Z3: An efficient SMT solver},
  author={De Moura, Leonardo and Bj{\o}rner, Nikolaj},
  booktitle={International conference on Tools and Algorithms for the Construction and Analysis of Systems},
  pages={337--340},
  year={2008},
  organization={Springer}
}

@article{danabasoglu2020community,
  title={The community earth system model version 2 (CESM2)},
  author={Danabasoglu, Gokhan and Lamarque, J-F and Bacmeister, J and Bailey, DA and DuVivier, AK and Edwards, Jim and Emmons, LK and Fasullo, John and Garcia, R and Gettelman, Andrew and others},
  journal={Journal of Advances in Modeling Earth Systems},
  volume={12},
  number={2},
  pages={e2019MS001916},
  year={2020},
  publisher={Wiley Online Library}
}

@misc{ipywidgets,
  author = {Jupyter},
  title = {ipywidgets: Interactive HTML Widgets for Jupyter Notebooks},
  year = {2015},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jupyter-widgets/ipywidgets}},
}

@article{wu2021coupled,
  title={Coupled aqua and ridge planets in the community earth system model},
  author={Wu, Xiaoning and Reed, Kevin A and Wolfe, Christopher LP and Marques, Gustavo M and Bachman, Scott D and Bryan, Frank O},
  journal={Journal of Advances in Modeling Earth Systems},
  volume={13},
  number={4},
  pages={e2020MS002418},
  year={2021},
  publisher={Wiley Online Library}
}

@article{polvani2017less,
  title={When less is more: Opening the door to simpler climate models},
  author={Polvani, LM and Clement, AC and Medeiros, B and Benedict, JJ and Simpson, IR},
  journal={Eos, Transactions American Geophysical Union},
  volume={99},
  number={3},
  pages={15--16},
  year={2017}
}

@article{maher2019model,
  title={Model hierarchies for understanding atmospheric circulation},
  author={Maher, Penelope and Gerber, Edwin P and Medeiros, Brian and Merlis, Timothy M and Sherwood, Steven and Sheshadri, Aditi and Sobel, Adam H and Vallis, Geoffrey K and Voigt, Aiko and Zurita-Gotor, Pablo},
  journal={Reviews of Geophysics},
  volume={57},
  number={2},
  pages={250--280},
  year={2019},
  publisher={Wiley Online Library}
}

@article{de2011satisfiability,
  title={Satisfiability modulo theories: introduction and applications},
  author={De Moura, Leonardo and Bj{\o}rner, Nikolaj},
  journal={Communications of the ACM},
  volume={54},
  number={9},
  pages={69--77},
  year={2011},
  publisher={ACM New York, NY, USA}
}

@book{biere2009handbook,
  title={Handbook of satisfiability},
  author={Biere, Armin and Heule, Marijn and van Maaren, Hans},
  volume={185},
  year={2009},
  publisher={IOS press}
}

paper/paper.md

Lines changed: 291 additions & 0 deletions
@@ -0,0 +1,291 @@
---
title: 'visualCaseGen: An SMT-based Experiment Configurator for Community Earth System Model'
tags:
  - Python
  - cesm
  - climate modeling
  - constraint solver
  - SMT
authors:
  - name: Alper Altuntas
    orcid: 0000-0003-1708-9518
    affiliation: "1"
  - name: Manish Venumuddula
    orcid: 0009-0009-5047-2018
    affiliation: "1"
  - name: Isla R. Simpson
    orcid: 0000-0002-2915-1377
    affiliation: "1"
  - name: Scott D. Bachman
    orcid: 0000-0002-6479-4300
    affiliation: "1"
  - name: Samuel Levis
    orcid: 0000-0003-4684-6995
    affiliation: "1"
  - name: Brian Dobbins
    orcid: 0000-0002-3156-0460
    affiliation: "1"
  - name: William J. Sacks
    orcid: 0000-0003-2902-5263
    affiliation: "1"
  - name: Gokhan Danabasoglu
    orcid: 0000-0003-4676-2732
    affiliation: "1"

affiliations:
  - name: U.S. National Science Foundation National Center for Atmospheric Research (NSF NCAR), Boulder, CO
    index: 1
date: 29 June 2025
bibliography: paper.bib

---

# Introduction

visualCaseGen is a graphical user interface (GUI) designed to streamline the
creation and configuration of Community Earth System Model (CESM) experiments.
CESM is a highly flexible, state-of-the-art climate model that allows researchers
to simulate the Earth system across a wide range of spatial and temporal scales
and at varying levels of complexity, depending on scientific objectives [@danabasoglu2020community].
However, configuring non-standard or idealized CESM experiments is often a technically
demanding and time-consuming process, requiring detailed knowledge of component
compatibility, grid definitions, parameterization schemes, and model hierarchies.
visualCaseGen simplifies this process by guiding users through each stage of
experiment setup and automating key configuration steps via an intuitive,
interactive interface.

To ensure consistency and compatibility across various model settings, visualCaseGen
incorporates a constraint solver based on satisfiability modulo theories (SMT)
[@de2011satisfiability]. This solver systematically analyzes dependencies between
configuration parameters, detects conflicts, and provides detailed explanations of
incompatibilities, allowing users to make informed adjustments. The SMT-based
approach enables dynamic, real-time validation of model settings, significantly
reducing setup errors and ensuring that only scientifically viable configurations
are selected.

On the frontend, visualCaseGen is implemented as a Jupyter-based GUI, offering
an intuitive, step-by-step interface for browsing standard CESM configurations,
defining custom experiment setups, and modifying grid and component settings.
Designed with a wizard-like interface, visualCaseGen walks users through each
stage of the CESM configuration process, ensuring that all necessary settings
are selected in a logical sequence while dynamically updating available options
based on user choices. Additionally, the tool features utilities for creating
and editing model input files, such as ocean grid, bathymetry, and land
surface properties, further simplifying model customization and
minimizing the need for manual file creation and modification.

By automating and simplifying CESM configuration, visualCaseGen makes the model
more accessible and customizable, particularly for researchers exploring
hierarchical modeling [@maher2019model], idealized experiments
[@polvani2017less], or custom coupled simulations. As such, the tool allows
users to focus on their scientific objectives rather than technical setup
challenges, ultimately enabling a more efficient and streamlined experiment
workflow.

# Statement of need

CESM is a highly flexible and comprehensive modeling framework that
encompasses multiple components representing the atmosphere, ocean, land,
ice, river, biogeochemistry, and other systems. While this
flexibility supports a wide range of scientific experiments, it also introduces
significant complexity in model configuration. Setting up custom CESM
experiments requires navigating intricate component compatibility constraints,
grid configurations, and parameterization choices, often demanding extensive
expertise in the model’s internal structure. For non-standard experiments, users
must manually modify CESM’s codebase and runtime parameter files, maintain
numerical and scientific consistency, and troubleshoot compatibility issues.
This process is time-intensive, error-prone, and can take weeks before yielding
a functional setup.

A recent study by @wu2021coupled exemplifies these challenges. Their work
involved configuring an idealized CESM experiment to study atmosphere-ocean
interactions using two simplified aquaplanet models: one without continents and
another with a pole-to-pole strip continent. The goal was to investigate how
these configurations influence Hadley circulation, equatorial upwelling, and
precipitation patterns (\autoref{fig:wuEtAl}). However, setting up these
experiments required extensive manual intervention, including modifying the CESM
codebase, creating custom input files, adjusting runtime parameters, consulting
domain experts for component-specific configurations, and conducting numerous
trial-and-error iterations. visualCaseGen was developed to address such
usability barriers and streamline CESM experiment setup. As an interactive GUI,
it eliminates the need for manual modifications and provides an
intuitive, structured workflow for constructing model configurations.

![Sea surface temperature and precipitable water distribution from Aqua and
Ridge planet simulations using CESM. Adapted from @wu2021coupled.
\label{fig:wuEtAl}](wuEtAl.png){height="280pt"}

# Constraint Solver

One of the main challenges in configuring CESM experiments is ensuring that
different model settings remain compatible. CESM’s configuration involves
determining components, physics, grids, and parameterization choices, many of
which have strict compatibility constraints. visualCaseGen addresses this
challenge by integrating an SMT-based constraint solver, built using the Z3
solver [@de2008z3]. Z3 was chosen for its robust Python API and its efficiency
in managing complex logical relationships involving multiple parameter types
such as integers, reals, booleans, and strings. As such, Z3 is well suited
to handling the intricate dependencies and constraints inherent in CESM
configurations.

In visualCaseGen, constraints are specified as key-value pairs, where the key
represents a Z3 logical expression defining a condition, and the value is the
error message displayed when the constraint is violated. These constraints
enforce compatibility rules and prevent invalid model configurations. Below
are three examples of visualCaseGen constraints with increasing complexity,
demonstrating how the SMT solver can enforce simple value bounds, conditional
dependencies, and more complex multi-component logical rules:

```python
# Excerpt of key-value pairs: Z3 expression -> error message.
# Adjacent string literals are concatenated, so each continued
# message keeps a trailing space before the line break.

LND_DOM_PFT >= 0.0:
    "PFT/CFT must be set to a nonnegative number",

Implies(OCN_GRID_EXTENT=="Regional", OCN_CYCLIC_X=="False"):
    "Regional ocean domain cannot be reentrant "
    "(due to an ESMF limitation.)",

Implies(And(COMP_OCN=="mom", COMP_LND=="slnd", COMP_ICE=="sice"),
        OCN_LENY<180.0):
    "If LND and ICE are stub, custom MOM6 grid must exclude poles "
    "to avoid singularities in open water",

```
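
To make the key-value layout concrete, here is a minimal sketch that checks one candidate configuration against the three rules above. It is not how visualCaseGen evaluates constraints: plain Python callables stand in for the Z3 expressions, and the helper names (`implies`, `violated_messages`) and the sample `config` values are assumptions made for illustration only.

```python
def implies(premise, conclusion):
    """Logical implication on plain Python booleans."""
    return (not premise) or conclusion


# Each entry pairs a predicate over a configuration dict with its error message.
# Plain Python callables stand in here for the Z3 expressions used as keys.
CONSTRAINTS = [
    (lambda c: c["LND_DOM_PFT"] >= 0.0,
     "PFT/CFT must be set to a nonnegative number"),
    (lambda c: implies(c["OCN_GRID_EXTENT"] == "Regional",
                       c["OCN_CYCLIC_X"] == "False"),
     "Regional ocean domain cannot be reentrant (due to an ESMF limitation.)"),
    (lambda c: implies(c["COMP_OCN"] == "mom"
                       and c["COMP_LND"] == "slnd"
                       and c["COMP_ICE"] == "sice",
                       c["OCN_LENY"] < 180.0),
     "If LND and ICE are stub, custom MOM6 grid must exclude poles "
     "to avoid singularities in open water"),
]


def violated_messages(config, constraints=CONSTRAINTS):
    """Return the error message of every constraint that `config` violates."""
    return [message for holds, message in constraints if not holds(config)]


# Hypothetical values, chosen only so that exactly one rule is violated.
config = {
    "LND_DOM_PFT": 1.0,
    "OCN_GRID_EXTENT": "Regional",
    "OCN_CYCLIC_X": "True",
    "COMP_OCN": "mom",
    "COMP_LND": "slnd",
    "COMP_ICE": "sice",
    "OCN_LENY": 160.0,
}
messages = violated_messages(config)
print(messages)
```

Checking a fully specified configuration this way is the easy case. The solver earns its place when some variables are still unset and the tool must decide which remaining options can still lead to a valid configuration.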

## Why Use a Constraint Solver?

Configuring CESM is inherently a constraint satisfaction problem (CSP), which
can quickly become computationally complex as the number of configuration
variables increases [@biere2009handbook]. Manually enforcing constraints would be impractical, making
an SMT solver an ideal choice. The benefits of using a solver include:

- **Detecting Hidden Conflicts:** Individual constraints may be satisfied
  independently, yet their combination can lead to conflicts that are nontrivial
  to detect manually.

- **Preventing Dead-Ends:** Without a solver, users may unknowingly select
  settings that lead to an unsatisfiable configuration, forcing them to restart
  their setup. Thanks to the solver, visualCaseGen dynamically guides users
  toward valid options.

- **Enabling Constraint Analysis:** The solver can answer critical questions, such as:
  - Are all constraints satisfiable?
  - Are there unreachable options that need adjustment?
  - Are any constraints redundant, so that the constraint set can be simplified?

- **Scalability and Efficiency:** As the number of variables and constraints grows,
  the number of possible combinations grows exponentially, and manually checking
  compatibility becomes infeasible. The solver efficiently handles large-scale
  constraint resolution, ensuring rapid feedback even for a large number of variables.
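
The hidden-conflict and unreachable-option questions in the list above can be illustrated with a toy sketch. It uses exhaustive enumeration in place of an SMT solver, which only works for tiny domains; that exponential cost is the reason a solver is preferred. The variables `X`, `Y`, `Z`, their domains, and the three rules are invented for illustration and are not CESM settings.

```python
from itertools import product


def reachable_options(domains, constraints):
    """Map each variable to the values that occur in at least one valid assignment."""
    names = list(domains)
    reachable = {name: set() for name in names}
    # Brute force: try every combination of values (exponential in len(names)).
    for values in product(*(domains[name] for name in names)):
        candidate = dict(zip(names, values))
        if all(check(candidate) for check in constraints):
            for name, value in candidate.items():
                reachable[name].add(value)
    return reachable


# Hypothetical variables and rules; each rule alone still permits X == "a".
domains = {"X": ["a", "b"], "Y": ["p", "q"], "Z": ["u", "v"]}
constraints = [
    lambda c: c["X"] != "a" or c["Y"] == "p",         # X == a  implies  Y == p
    lambda c: c["Y"] != "p" or c["Z"] == "u",         # Y == p  implies  Z == u
    lambda c: not (c["Z"] == "u" and c["X"] == "a"),  # Z == u excludes X == a
]

reachable = reachable_options(domains, constraints)
unreachable = {name: set(domains[name]) - reachable[name] for name in domains}
print(unreachable)
```

No single rule forbids `X == "a"`, yet the three together do, so that option is unreachable. In a GUI it would be crossed out before the user could select it and hit a dead end.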

# The Stage Mechanism

A key backend concept in visualCaseGen is the Stage Mechanism, which structures
the CESM configuration process into consecutive steps (stages). Each stage
includes a set of related configuration variables that can be adjusted together.
Based on the user's selections, different stages are activated dynamically,
guiding the user through a structured workflow.

## Stage Pipeline

All possible stage paths collectively form the stage pipeline as shown in
\autoref{fig:pipeline}, which dictates the sequence in which configuration
variables are presented to the user, and the *precedence of variables* where
earlier stages have higher priority over later ones. A complexity arises when
the same variable appears in multiple stages. This is allowed as long as those
stages are not reachable along the same path within the stage pipeline. To prevent cyclic
dependencies, the stage pipeline must therefore form a directed acyclic graph
(DAG), enabling a consistent variable precedence hierarchy and eliminating the
possibility of loops or contradictory variable settings.

![The visualCaseGen stage pipeline, starting from the top node (1. Component Set) and ending at the bottom node (3. Launch). The user follows a path along this pipeline based on their modeling needs and selections. \label{fig:pipeline}](stage_pipeline.png)
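
The DAG requirement can be checked mechanically. The sketch below uses `graphlib` from the Python standard library (Python 3.9 or later) and is not taken from visualCaseGen. The function name `stage_order` and the two intermediate stage names are hypothetical; only "Component Set" and "Launch" appear in the figure caption.

```python
from graphlib import CycleError, TopologicalSorter


def stage_order(successors):
    """Return one valid stage ordering, or raise ValueError if a cycle exists."""
    sorter = TopologicalSorter()
    for stage, next_stages in successors.items():
        sorter.add(stage)
        for next_stage in next_stages:
            # next_stage lists stage as a predecessor, so stage is ordered first.
            sorter.add(next_stage, stage)
    try:
        return list(sorter.static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"stage pipeline contains a cycle: {err.args[1]}") from err


# Hypothetical pipeline: two alternative paths between the first and last stage.
pipeline = {
    "Component Set": ["Standard Grid", "Custom Grid"],
    "Standard Grid": ["Launch"],
    "Custom Grid": ["Launch"],
    "Launch": [],
}
order = stage_order(pipeline)
print(order)
```

A topological order exists exactly when the graph is acyclic, so the same call both validates the pipeline and yields a precedence ordering in which every earlier stage comes before the stages it leads to.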

## Constraint Graph and its Traversal

Using the stage pipeline and specified constraints, visualCaseGen constructs a
constraint graph, as shown in \autoref{fig:cgraph}. In this graph:

- Nodes represent configuration variables.
- Directed edges represent dependencies or constraints between variables.
- Edges are directed from higher-precedence variables to lower-precedence variables.

During the configuration process, when a user makes a selection, the constraint
graph is traversed to identify all variables that are affected by the selection.
This traversal is done in a breadth-first manner, starting from the selected
variable and following the edges in the direction of the constraints. The
traversal stops at variables whose options' validities are not affected by the
selection. As such, the traversal is limited to the variables that are directly
or indirectly affected by the user's selection, which in turn depends on the
user input, stage hierarchy, and the specified constraints. By dynamically
re-evaluating constraints and adjusting available options, visualCaseGen
provides real-time feedback, preventing invalid configurations and ensuring
scientific consistency in CESM setups.

![The visualCaseGen constraint graph. \label{fig:cgraph}](cgraph.png)
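
The traversal described above can be sketched as a breadth-first walk that does not continue past variables whose option validities are unchanged. This is an illustrative simplification, not visualCaseGen's implementation: each variable is visited at most once, the edges are hypothetical (including the made-up `ICE_GRID` node), and the `options_changed` callback fakes the solver's re-evaluation with a fixed set.

```python
from collections import deque


def affected_variables(graph, selected, options_changed):
    """Breadth-first walk from `selected` along constraint edges.

    `options_changed(var)` re-evaluates the options of `var` and returns True
    when their validity changed; the walk stops at unchanged variables.
    """
    seen = {selected}
    queue = deque([selected])
    affected = []
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if options_changed(neighbor):
                affected.append(neighbor)
                queue.append(neighbor)  # keep following edges from this variable
    return affected


# Hypothetical edges, directed from higher- to lower-precedence variables.
graph = {
    "COMP_OCN": ["OCN_GRID_EXTENT", "COMP_ICE"],
    "OCN_GRID_EXTENT": ["OCN_CYCLIC_X"],
    "COMP_ICE": ["ICE_GRID"],
}
changed = {"OCN_GRID_EXTENT", "OCN_CYCLIC_X"}  # pretend result of re-evaluation
checked = []


def options_changed(var):
    checked.append(var)  # record which variables were re-evaluated
    return var in changed


result = affected_variables(graph, "COMP_OCN", options_changed)
print(result, checked)
```

Because `COMP_ICE` is unchanged in this toy run, the walk never reaches `ICE_GRID`, which is how the traversal stays limited to the variables a selection actually affects.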

# Frontend

The visualCaseGen frontend provides an intuitive and interactive interface for
configuring CESM experiments. Built with Jupyter ipywidgets [@ipywidgets], it
can operate on local
machines, HPC clusters, and cloud environments. This portability and flexibility allow
researchers to configure CESM experiments efficiently, whether prototyping
lightweight simulations on personal computers or running sophisticated applications on remote supercomputing
systems.

\autoref{fig:Stage1_7} displays an example stage from the visualCaseGen GUI,
where users can select the individual models to be coupled in their CESM
experiment. As the user makes selections, the GUI dynamically updates available
options by crossing out incompatible choices, ensuring that only valid
configurations are presented. This interactive feedback mechanism guides users
through the configuration process, helping them make informed decisions and
avoid incompatible selections.

![The "Components" stage. \label{fig:Stage1_7}](Stage1_7.png){width="90%"}

At any stage, users can click on crossed-out options to view a
brief explanation of why a particular choice is incompatible with their current
selections, as illustrated in \autoref{fig:Stage1_8}. This helps guide
users through complex dependencies and lets them make informed
adjustments.

![Interactive feedback on incompatible choices. \label{fig:Stage1_8}](Stage1_8.png){width="90%"}

As another example of streamlining model customization, \autoref{fig:TopoEditor}
shows the TopoEditor widget that comes with visualCaseGen. This tool allows users
to interactively modify ocean bathymetry, enhancing customizability and
ease of use.

![TopoEditor widget \label{fig:TopoEditor}](TopoEditor.png){width="90%"}

# Remarks

visualCaseGen can significantly accelerate CESM experiment setup for a wide
range of studies by automating many aspects of experiment configuration. Instead
of manual intervention, modelers can use visualCaseGen’s interactive
GUI to define model setups, mix and match component configurations, and
generate custom grids and parameters. The SMT-based constraint solver
ensures that only valid model settings are selected, reducing the need for
trial-and-error debugging. While complex custom cases may still require
fine-tuning, visualCaseGen allows modelers to generate an initial working
configuration in a matter of hours rather than weeks, greatly improving
efficiency and ease of use.

By automating tedious configuration tasks, visualCaseGen enables researchers to
focus on scientific exploration rather than technical setup, making CESM more
accessible for both idealized and complex climate modeling studies.

# Acknowledgements

This work was supported by the NSF Cyberinfrastructure for Sustained Scientific
Innovation (CSSI) program under award number 2004575. Special thanks to the
CESM Software Engineering Working Group for their support and feedback during
the development of visualCaseGen.

The NSF National Center for Atmospheric Research (NCAR) is a major facility
sponsored by the NSF under Cooperative Agreement No. 1852977.

# References

paper/stage_pipeline.png

117 KB

paper/wuEtAl.png

963 KB
