| 1 | +--- |
| 2 | +title: 'visualCaseGen: An SMT-based Experiment Configurator for Community Earth System Model' |
| 3 | +tags: |
| 4 | + - Python |
| 5 | + - cesm |
| 6 | + - climate modeling |
| 7 | + - constraint solver |
| 8 | + - SMT |
| 9 | +authors: |
| 10 | + - name: Alper Altuntas |
| 11 | + orcid: 0000-0003-1708-9518 |
| 12 | + affiliation: "1" |
| 13 | + - name: Manish Venumuddula |
| 14 | + orcid: 0009-0009-5047-2018 |
| 15 | + affiliation: "1" |
| 16 | + - name: Isla R. Simpson |
| 17 | + orcid: 0000-0002-2915-1377 |
| 18 | + affiliation: "1" |
| 19 | + - name: Scott D. Bachman |
| 20 | + orcid: 0000-0002-6479-4300 |
| 21 | + affiliation: "1" |
| 22 | + - name: Samuel Levis |
| 23 | + orcid: 0000-0003-4684-6995 |
| 24 | + affiliation: "1" |
| 25 | + - name: Brian Dobbins |
| 26 | + orcid: 0000-0002-3156-0460 |
| 27 | + affiliation: "1" |
| 28 | + - name: William J. Sacks |
| 29 | + orcid: 0000-0003-2902-5263 |
| 30 | + affiliation: "1" |
| 31 | + - name: Gokhan Danabasoglu |
| 32 | + orcid: 0000-0003-4676-2732 |
| 33 | + affiliation: "1" |
| 34 | + |
| 35 | +affiliations: |
| 36 | + - name: U.S. National Science Foundation National Center for Atmospheric Research (NSF NCAR), Boulder, CO |
| 37 | + index: 1 |
| 38 | +date: 29 June 2025 |
| 39 | +bibliography: paper.bib |
| 40 | + |
| 41 | +--- |
| 42 | + |
| 43 | +# Introduction |
| 44 | + |
| 45 | +visualCaseGen is a graphical user interface (GUI) designed to streamline the |
| 46 | +creation and configuration of Community Earth System Model (CESM) experiments. |
| 47 | +CESM is a highly flexible, state-of-the-art climate model that allows researchers |
| 48 | +to simulate the Earth system across a wide range of spatial and temporal scales |
| 49 | +and at varying levels of complexity, depending on scientific objectives [@danabasoglu2020community]. |
| 50 | +However, configuring non-standard or idealized CESM experiments is often a technically |
| 51 | +demanding and time-consuming process, requiring detailed knowledge of component |
| 52 | +compatibility, grid definitions, parameterization schemes, and model hierarchies. |
| 53 | +visualCaseGen simplifies this process by guiding users through each stage of |
| 54 | +experiment setup and automating key configuration steps via an intuitive, |
| 55 | +interactive interface. |
| 56 | + |
| 57 | +To ensure consistency and compatibility across various model settings, visualCaseGen |
| 58 | +incorporates a constraint solver based on satisfiability modulo theories (SMT) |
| 59 | +[@de2011satisfiability]. This solver systematically analyzes dependencies between |
| 60 | +configuration parameters, detects conflicts, and provides detailed explanations of |
| 61 | +incompatibilities, allowing users to make informed adjustments. The SMT-based |
| 62 | +approach enables dynamic, real-time validation of model settings, significantly |
| 63 | +reducing setup errors and ensuring that only scientifically viable configurations |
| 64 | +are selected. |
| 65 | + |
| 66 | +On the frontend, visualCaseGen is implemented as a Jupyter-based GUI, offering |
| 67 | +an intuitive, step-by-step interface for browsing standard CESM configurations, |
| 68 | +defining custom experiment setups, and modifying grid and component settings. |
| 69 | +Designed with a wizard-like interface, visualCaseGen walks users through each |
| 70 | +stage of the CESM configuration process, ensuring that all necessary settings |
| 71 | +are selected in a logical sequence while dynamically updating available options |
| 72 | +based on user choices. Additionally, the tool features utilities for creating |
| 73 | +and editing model input files, such as ocean grid, bathymetry, and land |
| 74 | +surface properties, further simplifying model customization and |
| 75 | +minimizing the need for manual file creation and modification. |
| 76 | + |
| 77 | +By automating and simplifying CESM configuration, visualCaseGen makes the model |
| 78 | +more accessible and customizable, particularly for researchers exploring |
| 79 | +hierarchical modeling [@maher2019model], idealized experiments |
| 80 | +[@polvani2017less], or custom coupled simulations. As such, the tool allows |
| 81 | +users to focus on their scientific objectives rather than technical setup |
| 82 | +challenges, ultimately enabling a more efficient and streamlined experiment |
| 83 | +workflow. |
| 84 | + |
| 85 | + |
| 86 | +# Statement of need |
| 87 | + |
| 88 | +CESM is a highly flexible and comprehensive modeling framework that |
| 89 | +encompasses multiple components representing the atmosphere, ocean, land, |
| 90 | +ice, river, biogeochemistry, and other systems. While this |
| 91 | +flexibility supports a wide range of scientific experiments, it also introduces |
| 92 | +significant complexity in model configuration. Setting up custom CESM |
| 93 | +experiments requires navigating intricate component compatibility constraints, |
| 94 | +grid configurations, and parameterization choices, often demanding extensive |
| 95 | +expertise in the model’s internal structure. For non-standard experiments, users |
| 96 | +must manually modify CESM’s codebase and runtime parameter files, maintain |
| 97 | +numerical and scientific consistency, and troubleshoot compatibility issues. |
| 98 | +This process is time-intensive and error-prone, and it can take weeks before yielding |
| 99 | +a functional setup. |
| 100 | + |
| 101 | +A recent study by @wu2021coupled exemplifies these challenges. Their work |
| 102 | +involved configuring an idealized CESM experiment to study atmosphere-ocean |
| 103 | +interactions using two simplified aquaplanet models: one without continents and |
| 104 | +another with a pole-to-pole strip continent. The goal was to investigate how |
| 105 | +these configurations influence Hadley circulation, equatorial upwelling, and |
| 106 | +precipitation patterns (\autoref{fig:wuEtAl}). However, setting up these |
| 107 | +experiments required extensive manual intervention, including modifying the CESM |
| 108 | +codebase, creating custom input files, adjusting runtime parameters, consulting |
| 109 | +domain experts for component-specific configurations, and conducting numerous |
| 110 | +trial-and-error iterations. visualCaseGen was developed to address such |
| 111 | +usability barriers and streamline CESM experiment setup. As an interactive GUI, |
| 112 | +it eliminates the need for manual modifications and provides an |
| 113 | +intuitive, structured workflow for constructing model configurations. |
| 114 | + |
| 115 | +{height="280pt"} |
| 118 | + |
| 119 | +# Constraint Solver |
| 120 | + |
| 121 | +One of the main challenges in configuring CESM experiments is ensuring that |
| 122 | +different model settings remain compatible. CESM’s configuration involves |
| 123 | +determining components, physics, grids, and parameterization choices, many of |
| 124 | +which have strict compatibility constraints. visualCaseGen addresses this |
| 125 | +challenge by integrating an SMT-based constraint solver, built using the Z3 |
| 126 | +solver [@de2008z3]. Z3 was chosen for its robust Python API and its efficiency |
| 127 | +in managing complex logical relationships involving multiple parameter types |
| 128 | +such as integers, reals, booleans, and strings. As such, Z3 is well suited |
| 129 | +to handling the intricate dependencies and constraints inherent in CESM |
| 130 | +configurations. |
| 131 | + |
| 132 | +In visualCaseGen, constraints are specified as key-value pairs, where the key |
| 133 | +represents a Z3 logical expression defining a condition, and the value is the |
| 134 | +error message displayed when the constraint is violated. These constraints |
| 135 | +enforce compatibility rules and prevent invalid model configurations. Below |
| 136 | +are three examples of visualCaseGen constraints with increasing complexity, |
| 137 | +demonstrating how the SMT solver can enforce simple value bounds, conditional |
| 138 | +dependencies, and more complex multi-component logical rules: |
| 139 | + |
| 140 | +```python |
| 141 | + |
| 142 | +LND_DOM_PFT >= 0.0: |
| 143 | + "PFT/CFT must be set to a nonnegative number", |
| 144 | + |
| 145 | +Implies(OCN_GRID_EXTENT=="Regional", OCN_CYCLIC_X=="False"): |
| 146 | + "Regional ocean domain cannot be reentrant " |
| 147 | + "(due to an ESMF limitation.)", |
| 148 | + |
| 149 | +Implies(And(COMP_OCN=="mom", COMP_LND=="slnd", COMP_ICE=="sice"), |
| 150 | + OCN_LENY<180.0): |
| 151 | + "If LND and ICE are stub, custom MOM6 grid must exclude poles " |
| 152 | + "to avoid singularities in open water", |
| 153 | + |
| 154 | +``` |
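
To make the key-value convention concrete, the following is a minimal sketch that mimics it in plain Python rather than Z3: each key is an ordinary predicate over a candidate configuration, and each value is the message to report when the predicate fails. This stand-in only checks a fully specified configuration; it does not perform the satisfiability reasoning that the Z3-based solver provides. The `implies` and `violations` helpers and the sample values are hypothetical.

```python
# Plain-Python stand-in for the constraint dictionary above (not Z3).
# Each predicate takes a configuration dict and returns True when satisfied.

def implies(premise, conclusion):
    """Logical implication: false only when premise holds and conclusion fails."""
    return (not premise) or conclusion

CONSTRAINTS = {
    (lambda c: c["LND_DOM_PFT"] >= 0.0):
        "PFT/CFT must be set to a nonnegative number",
    (lambda c: implies(c["OCN_GRID_EXTENT"] == "Regional",
                       c["OCN_CYCLIC_X"] == "False")):
        "Regional ocean domain cannot be reentrant "
        "(due to an ESMF limitation.)",
    (lambda c: implies(c["COMP_OCN"] == "mom" and c["COMP_LND"] == "slnd"
                       and c["COMP_ICE"] == "sice",
                       c["OCN_LENY"] < 180.0)):
        "If LND and ICE are stub, custom MOM6 grid must exclude poles "
        "to avoid singularities in open water",
}

def violations(config):
    """Return the error messages of all constraints that `config` violates."""
    return [msg for check, msg in CONSTRAINTS.items() if not check(config)]

# Hypothetical candidate configuration, for illustration only.
candidate = {
    "LND_DOM_PFT": 1.0,
    "OCN_GRID_EXTENT": "Regional",
    "OCN_CYCLIC_X": "True",   # conflicts with the regional extent
    "COMP_OCN": "mom",
    "COMP_LND": "slnd",
    "COMP_ICE": "sice",
    "OCN_LENY": 160.0,
}

for message in violations(candidate):
    print(message)
```

Here only the second rule is violated, so a single message is reported. Unlike this sketch, an SMT solver can also reason about partially specified configurations.
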
| 155 | + |
| 156 | +## Why Use a Constraint Solver? |
| 157 | + |
| 158 | +Configuring CESM is inherently a constraint satisfaction problem (CSP), which |
| 159 | +can quickly become computationally complex as the number of configuration |
| 160 | +variables increases [@biere2009handbook]. Manually enforcing constraints would be impractical, making |
| 161 | +an SMT solver an ideal choice. The benefits of using a solver include: |
| 162 | + |
| 163 | +- **Detecting Hidden Conflicts:** Individual constraints may be satisfied |
| 164 | + independently, yet their combination can lead to conflicts that are nontrivial |
| 165 | + to detect manually. |
| 166 | + |
| 167 | +- **Preventing Dead-Ends:** Without a solver, users may unknowingly select |
| 168 | + settings that lead to an unsatisfiable configuration, forcing them to restart |
| 169 | + their setup. Thanks to the solver, visualCaseGen dynamically guides users |
| 170 | + toward valid options. |
| 171 | + |
| 172 | +- **Enabling Constraint Analysis:** The solver can answer critical questions, such as: |
| 173 | + - Are all constraints satisfiable? |
| 174 | + - Are there unreachable options that need adjustment? |
| 175 | + - Are any constraints redundant, and can they be removed or simplified? |
| 176 | + |
| 177 | +- **Scalability and Efficiency:** As the number of variables and constraints grows, |
| 178 | + the number of possible combinations grows exponentially, and manually checking |
| 179 | + compatibility becomes infeasible. The solver efficiently handles large-scale |
| 180 | + constraint resolution, ensuring rapid feedback even for a large number of variables. |
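
The dead-end prevention described above can be illustrated with a brute-force sketch in plain Python: an option is still valid if at least one complete assignment that includes it, together with the current selections, satisfies every rule. An SMT solver answers the same question without enumerating every combination. The variable domains, the second rule, and the `valid_options` helper are simplified assumptions for illustration and are not taken from visualCaseGen.

```python
from itertools import product

# Hypothetical finite domains for three configuration variables.
DOMAINS = {
    "COMP_OCN": ["mom", "docn"],
    "OCN_GRID_EXTENT": ["Global", "Regional"],
    "OCN_CYCLIC_X": ["True", "False"],
}

# Rules expressed as predicates over a complete assignment.
RULES = [
    # A regional ocean domain cannot be reentrant.
    lambda c: c["OCN_GRID_EXTENT"] != "Regional" or c["OCN_CYCLIC_X"] == "False",
    # Assumed for illustration only: a regional extent requires the "mom" ocean.
    lambda c: c["OCN_GRID_EXTENT"] != "Regional" or c["COMP_OCN"] == "mom",
]

def valid_options(var, selections):
    """Options of `var` that can still be extended to a full valid assignment."""
    free = [v for v in DOMAINS if v != var and v not in selections]
    valid = []
    for option in DOMAINS[var]:
        for values in product(*(DOMAINS[v] for v in free)):
            candidate = {**selections, var: option, **dict(zip(free, values))}
            if all(rule(candidate) for rule in RULES):
                valid.append(option)
                break  # one satisfying assignment is enough
    return valid

# After choosing a reentrant domain, "Regional" can no longer be completed.
print(valid_options("OCN_GRID_EXTENT", {"OCN_CYCLIC_X": "True"}))
```

Options missing from the returned list are the ones a GUI would cross out, since selecting them could never lead to a satisfiable configuration.
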
| 181 | + |
| 182 | + |
| 183 | +# The Stage Mechanism |
| 184 | + |
| 185 | +A key backend concept in visualCaseGen is the Stage Mechanism, which structures |
| 186 | +the CESM configuration process into consecutive steps (stages). Each stage |
| 187 | +includes a set of related configuration variables that can be adjusted together. |
| 188 | +Based on the user's selections, different stages are activated dynamically, |
| 189 | +guiding the user through a structured workflow. |
| 190 | + |
| 191 | +## Stage Pipeline |
| 192 | + |
| 193 | +All possible stage paths collectively form the stage pipeline as shown in |
| 194 | +\autoref{fig:pipeline}, which dictates the sequence in which configuration |
| 195 | +variables are presented to the user, and the *precedence of variables* where |
| 196 | +earlier stages have higher priority over later ones. A complexity arises when |
| 197 | +the same variable appears in multiple stages. This is allowed as long as those |
| 198 | +stages do not lie on the same path within the stage pipeline. To prevent cyclic |
| 199 | +dependencies, the stage pipeline must therefore form a directed acyclic graph |
| 200 | +(DAG), enabling a consistent variable precedence hierarchy and eliminating the |
| 201 | +possibility of loops or contradictory variable settings. |
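
The acyclicity requirement can be checked with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the stage names, the `STAGE_SUCCESSORS` mapping, and the `stage_order` helper are hypothetical and do not reflect visualCaseGen's actual pipeline or implementation.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical pipeline: each stage maps to the stages that may follow it.
STAGE_SUCCESSORS = {
    "compset_mode": ["standard_compset", "custom_compset"],
    "standard_compset": ["grid"],
    "custom_compset": ["grid"],
    "grid": ["launch"],
    "launch": [],
}

def stage_order(successors):
    """Return one precedence-respecting stage order, or raise CycleError."""
    # TopologicalSorter expects node -> predecessors, so invert the mapping.
    predecessors = {stage: set() for stage in successors}
    for stage, following in successors.items():
        for nxt in following:
            predecessors.setdefault(nxt, set()).add(stage)
    return list(TopologicalSorter(predecessors).static_order())

order = stage_order(STAGE_SUCCESSORS)
print(order)  # earlier (higher-precedence) stages come first

# Introducing a back edge turns the pipeline into a cycle, which is rejected.
try:
    stage_order({**STAGE_SUCCESSORS, "launch": ["compset_mode"]})
except CycleError:
    print("cycle detected: not a valid stage pipeline")
```

A successful sort yields an order in which every stage precedes the stages that can follow it, which is exactly the consistent precedence hierarchy a DAG guarantees.
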
| 202 | + |
| 203 | + |
| 204 | + |
| 205 | +## Constraint Graph and its Traversal |
| 206 | + |
| 207 | +Using the stage pipeline and specified constraints, visualCaseGen constructs a |
| 208 | +constraint graph, as shown in \autoref{fig:cgraph}. In this graph: |
| 209 | + |
| 210 | + - Nodes represent configuration variables. |
| 211 | + - Directed edges represent dependencies or constraints between variables. |
| 212 | + - Edges are directed from higher-precedence variables to lower-precedence variables. |
| 213 | + |
| 214 | +During the configuration process, when a user makes a selection, the constraint |
| 215 | +graph is traversed to identify all variables that are affected by the selection. |
| 216 | +This traversal is done in a breadth-first manner, starting from the selected |
| 217 | +variable and following the edges in the direction of the constraints. The |
| 218 | +traversal stops at variables whose options' validities are not affected by the |
| 219 | +selection. As such, the traversal is limited to the variables that are directly |
| 220 | +or indirectly affected by the user's selection, which in turn depends on the |
| 221 | +user input, stage hierarchy, and the specified constraints. By dynamically |
| 222 | +re-evaluating constraints and adjusting available options, visualCaseGen |
| 223 | +provides real-time feedback, preventing invalid configurations and ensuring |
| 224 | +scientific consistency in CESM setups. |
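
The traversal described above can be sketched as a breadth-first search that only expands variables whose valid options actually changed. The graph, the `ICE_GRID` variable, the `affected_variables` helper, and the `options_changed` callback are hypothetical; in visualCaseGen the equivalent check would involve re-evaluating constraints with the solver.

```python
from collections import deque

# Hypothetical constraint graph: edges point from higher- to lower-precedence variables.
CONSTRAINT_GRAPH = {
    "COMP_OCN": ["OCN_GRID_EXTENT", "COMP_ICE"],
    "OCN_GRID_EXTENT": ["OCN_CYCLIC_X"],
    "COMP_ICE": ["ICE_GRID"],
    "OCN_CYCLIC_X": [],
    "ICE_GRID": [],
}

def affected_variables(graph, selected, options_changed):
    """Breadth-first traversal from `selected`, pruned at unaffected variables."""
    affected = []
    visited = {selected}
    queue = deque([selected])
    while queue:
        var = queue.popleft()
        for neighbor in graph.get(var, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            # Stop here if the neighbor's option validities did not change.
            if options_changed(neighbor):
                affected.append(neighbor)
                queue.append(neighbor)
    return affected

# Stand-in for the solver: pretend only these variables see their options change.
changed = {"OCN_GRID_EXTENT", "OCN_CYCLIC_X"}
print(affected_variables(CONSTRAINT_GRAPH, "COMP_OCN", changed.__contains__))
```

Because `COMP_ICE` is unaffected in this example, its downstream variable is never visited, which keeps the work proportional to the part of the graph that the selection actually touches.
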
| 225 | + |
| 226 | + |
| 227 | + |
| 228 | +# Frontend |
| 229 | + |
| 230 | +The visualCaseGen frontend provides an intuitive and interactive interface for |
| 231 | +configuring CESM experiments. Built with Jupyter ipywidgets [@ipywidgets], it |
| 232 | +can operate on local |
| 233 | +machines, HPC clusters, and cloud environments. This portability and flexibility allow |
| 234 | +researchers to configure CESM experiments efficiently, whether prototyping |
| 235 | +lightweight simulations on personal computers or running sophisticated applications on remote supercomputing |
| 236 | +systems. |
| 237 | + |
| 238 | +\autoref{fig:Stage1_7} displays an example stage from the visualCaseGen GUI, |
| 239 | +where users can select the individual models to be coupled in their CESM |
| 240 | +experiment. As the user makes selections, the GUI dynamically updates available |
| 241 | +options by crossing out incompatible choices, ensuring that only valid |
| 242 | +configurations are presented. This interactive feedback mechanism guides users |
| 243 | +through the configuration process, helping them make informed decisions and |
| 244 | +avoid incompatible selections. |
| 245 | + |
| 246 | +{width="90%"} |
| 247 | + |
| 248 | +At any stage, users can click on crossed-out options to view a |
| 249 | +brief explanation of why a particular choice is incompatible with their current |
| 250 | +selections, as illustrated in \autoref{fig:Stage1_8}. This guides |
| 251 | +users through complex dependencies and helps them make informed |
| 252 | +adjustments. |
| 253 | + |
| 254 | +{width="90%"} |
| 255 | + |
| 256 | +As another example of streamlining model customization, \autoref{fig:TopoEditor} |
| 257 | +shows the TopoEditor widget that comes with visualCaseGen. This tool allows users |
| 258 | +to interactively modify ocean bathymetry, enhancing customizability and |
| 259 | +ease of use. |
| 260 | + |
| 261 | +{width="90%"} |
| 262 | + |
| 263 | +# Remarks |
| 264 | + |
| 265 | +visualCaseGen can significantly accelerate CESM experiment setup for a wide |
| 266 | +range of studies by automating many aspects of experiment configuration. Instead |
| 267 | +of manual intervention, modelers can use visualCaseGen’s interactive |
| 268 | +GUI to define model setups, mix and match component configurations, |
| 269 | +and generate custom grids and parameters. The SMT-based constraint solver |
| 270 | +ensures that only valid model settings are selected, reducing the need for |
| 271 | +trial-and-error debugging. While complex custom cases may still require |
| 272 | +fine-tuning, visualCaseGen allows modelers to generate an initial working |
| 273 | +configuration in a matter of hours rather than weeks, greatly improving |
| 274 | +efficiency and ease of use. |
| 275 | + |
| 276 | +By automating tedious configuration tasks, visualCaseGen enables researchers to |
| 277 | +focus on scientific exploration rather than technical setup, making CESM more |
| 278 | +accessible for both idealized and complex climate modeling studies. |
| 279 | + |
| 280 | +# Acknowledgements |
| 281 | + |
| 282 | +This work was supported by the NSF Cyberinfrastructure for Sustained Scientific |
| 283 | +Innovation (CSSI) program under award number 2004575. Special thanks to the |
| 284 | +CESM Software Engineering Working Group for their support and feedback during |
| 285 | +the development of visualCaseGen. |
| 286 | + |
| 287 | +The NSF National Center for Atmospheric Research (NCAR) is a major facility |
| 288 | +sponsored by the NSF under Cooperative Agreement No. 1852977. |
| 289 | + |
| 290 | +# References |
| 291 | + |