Commit 295d65a

Add beginning of deployment and testing docs
1 parent 074d2bb commit 295d65a

4 files changed

+311
-2
lines changed

docs/releasing/release-workflow.md

Lines changed: 1 addition & 2 deletions
@@ -17,8 +17,7 @@ The release process typically follows this progression:
1717
1. **[How Conda and Spack Work Together in E3SM-Unified](conda-vs-spack.md)**
1818
2. **[Planning Package Updates](planning-updates.md)**
1919
3. **[Creating Release Candidates](creating-rcs/overview.md)**
20-
4. **[Deploying on HPCs for Testing](deploying-testing.md)**
21-
5. **[Testing Across the Ecosystem](testing-ecosystem.md)**
20+
4. **[Deployment and Testing](testing/overview.md)**
2221
6. **[Finalizing the Release](finalizing-release.md)**
2322

2423
Each of these steps is detailed in its own page. See below for a high-level
Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
1+
# Updating `mache`
2+
3+
`mache` is the configuration library used by E3SM-Unified (and related
4+
projects like Polaris and Compass) to determine machine-specific settings,
5+
including module environments and Spack configurations.
6+
7+
During each E3SM-Unified release, it is often necessary to:
8+
9+
* Add support for new machines
10+
* Update Spack environment templates for existing systems
11+
* Create release candidates and final versions of `mache`
12+
13+
This page outlines the steps for maintaining and updating `mache` during the
14+
release process.
15+
16+
---
17+
18+
## Repo Location
19+
20+
🔗 [https://github.com/E3SM-Project/mache](https://github.com/E3SM-Project/mache)
21+
22+
---
23+
24+
## When to Update `mache`
25+
26+
You should update `mache` when:
27+
28+
* A supported machine has changed modules or compilers
29+
* New machines are being targeted for deployment
30+
* Spack YAML templates fall out of sync with system configurations
31+
* You need to test new combinations of compiler + MPI + module environments
32+
33+
Each change should be tested by deploying a release candidate of E3SM-Unified.
34+
35+
---
36+
37+
## Key Tasks
38+
39+
### 1. Edit Spack Templates
40+
41+
Spack environment templates live in:
42+
43+
```
mache/spack/templates/<machine>_<compiler>_<mpi>.yaml
```
46+
47+
Edit these files to reflect updated system modules or new toolchains.
48+
If adding a new machine, copy an existing `yaml` file to use as a template.
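
As a rough illustration of the naming convention above, the following sketch builds the expected template path from its three parts. The helper name `template_path` and the sample machine, compiler, and MPI names are hypothetical and are not part of `mache`.

```python
from pathlib import PurePosixPath

# Directory named in the text above; everything else here is illustrative.
TEMPLATE_DIR = PurePosixPath("mache/spack/templates")


def template_path(machine, compiler, mpi):
    """Return the template path following <machine>_<compiler>_<mpi>.yaml."""
    return TEMPLATE_DIR / f"{machine}_{compiler}_{mpi}.yaml"


# Hypothetical names, used only to show the resulting file name.
print(template_path("mymachine", "gnu", "openmpi"))
# mache/spack/templates/mymachine_gnu_openmpi.yaml
```

A helper like this can be handy for checking that a new template file is named the way the deployment tools expect before you open a PR.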
49+
50+
Use the utility script to assist:
51+
🔗 [utils/update_cime_machine_config.py README](https://github.com/E3SM-Project/mache/blob/main/utils/README.md)
52+
53+
This script can be used to download the latest version of the
54+
`config_machines.xml` file from E3SM's master branch, then compare it to the
55+
previous version stored in `mache`, showing changes related to supported
56+
machines.
57+
58+
You should make the changes associated with the differences that this utility
59+
displays in the appropriate `mache/spack/templates` files. You should then copy `new_config_machines.xml` into `mache/cime_machine_config/config_machines.xml`
60+
as the new reference set of machine configurations that `mache` is in sync
61+
with.
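
To make the idea of the comparison concrete, here is a minimal sketch that reports which machines were added or removed between two versions of the file. It is not the utility's actual implementation: it assumes each machine is declared as a `<machine MACH="...">` element, it only compares machine names (the real script also shows other changes), and the sample XML and machine names are made up.

```python
import xml.etree.ElementTree as ET


def machine_names(xml_text):
    """Return the set of MACH attribute values found in the XML text."""
    root = ET.fromstring(xml_text)
    return {m.get("MACH") for m in root.iter("machine") if m.get("MACH")}


def compare_machines(old_xml, new_xml):
    """Return (added, removed) machine names, each as a sorted list."""
    old_names = machine_names(old_xml)
    new_names = machine_names(new_xml)
    return sorted(new_names - old_names), sorted(old_names - new_names)


# Made-up sample data; real files are much larger.
OLD_XML = """<config_machines>
  <machine MACH="machine_a"/>
  <machine MACH="machine_b"/>
</config_machines>"""

NEW_XML = """<config_machines>
  <machine MACH="machine_b"/>
  <machine MACH="machine_c"/>
</config_machines>"""

added, removed = compare_machines(OLD_XML, NEW_XML)
print("added:", added)      # added: ['machine_c']
print("removed:", removed)  # removed: ['machine_a']
```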
62+
63+
---
64+
65+
### 2. Create a Release Candidate
66+
67+
Use the typical GitHub flow:
68+
69+
```bash
git checkout -b update-to-1.32.0
# Make changes
# Push branch and open PR
```
74+
75+
Once the PR is reviewed and merged:
76+
77+
* Tag a release candidate (e.g., `1.32.0rc1`)
78+
* Publish it to conda-forge under `mache_dev` (by merging a PR that targets
79+
the `dev` branch)
80+
81+
This RC will be referenced in the E3SM-Unified build process.
82+
83+
**Note:** As we will discuss later, it is also possible to test E3SM-Unified
84+
with a development branch of `mache` available on GitHub. However, it is
85+
always cleaner to use a release candidate.
86+
87+
---
88+
89+
### 3. Finalize the Release
90+
91+
Once testing across all platforms is complete:
92+
93+
* Create a final version tag (e.g., `1.32.0`)
94+
* Always use [semantic versioning](https://semver.org/)
95+
* Submit a PR to `mache-feedstock` to update the recipe (this time targeting
96+
the `main` branch)
97+
* Merge once CI passes
98+
99+
Afterward, update any references to the RC version in the E3SM-Unified repo to
100+
point to the final release.
101+
102+
---
103+
104+
## Best Practices
105+
106+
* Be liberal in marking system tools (`tar`, `CMake`, etc.) as
107+
`buildable: false` in Spack environments. Anything Spack doesn't have to
108+
build saves time and avoids potential build errors due to inconsistent
109+
toolchain assumptions.
110+
* Regularly sync templates with actual E3SM production configurations
111+
* Validate changes via test deployments of E3SM-Unified (or Polaris or Compass)
112+
before tagging final versions
113+
114+
---
115+
116+
➡ Next: [Deploying on HPCs](deploying-on-hpcs.md)

docs/releasing/testing/overview.md

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# Deployment and Testing Overview
2+
3+
Once a release candidate (RC) of E3SM-Unified has been successfully built, it
4+
must be thoroughly tested across supported HPC systems before a full release
5+
can occur. This phase ensures compatibility with system modules,
6+
performance-critical tools, and real-world analysis workflows.
7+
8+
This section documents the full testing and deployment process, including how
9+
to:
10+
11+
* Update the E3SM Spack fork to support new versions
12+
* Maintain and release new versions of `mache` for system-specific Spack
13+
configurations
14+
* Deploy RCs and full releases of E3SM-Unified on supported HPC platforms
15+
* Identify and resolve deployment issues
16+
17+
---
18+
19+
## Phased Deployment Strategy
20+
21+
Testing typically begins with a **partial deployment** of an E3SM-Unified RC
22+
to a few key HPC systems. Once core functionality and package compatibility
23+
are verified, a **full deployment** to all supported machines is performed.
24+
25+
Each iteration involves collaboration between the Infrastructure Team and tool
26+
maintainers to:
27+
28+
* Validate that tools like `zppy`, `e3sm_diags`, and `mpas-analysis` run
29+
correctly
30+
* Confirm compatibility with system MPI, compilers, and Python versions
31+
* Identify mismatches or conflicts in environment resolution
32+
33+
---
34+
35+
## Key Components of the Deployment Process
36+
37+
The following steps and infrastructure are used when testing and deploying a
38+
new release:
39+
40+
### 🛠️ [Updating the E3SM Spack Fork](spack-updates.md)
41+
42+
* Add new versions of performance-critical tools (e.g., NCO, ESMF, MOAB)
43+
* Create `spack_for_mache_<version>` branches for use in `mache`
44+
45+
### 🧩 [Updating `mache`](mache-updates.md)
46+
47+
* Keep system-specific Spack environment templates in sync with E3SM module
48+
stacks
49+
* Create RC and final releases of `mache`
50+
* Use `utils/update_cime_machine_config.py` to streamline updates
51+
52+
### 🚀 [Deploying on HPCs](deploying-on-hpcs.md)
53+
54+
* Use the `deploy_e3sm_unified.py` script and template infrastructure in
55+
`e3sm_supported_machines`
56+
* Build environments and activation scripts tailored to each system
57+
58+
### 🧪 [Troubleshooting Deployment Issues](troubleshooting-deploy.md)
59+
60+
* Resolve Spack build failures and MPI/compiler mismatches
61+
* Address problems with activation, modules, or symbolic links
62+
* Avoid common pitfalls in `default.cfg` or `shared.py` configuration
63+
64+
---
65+
66+
## Audience
67+
68+
This section is primarily intended for E3SM-Unified maintainers and release
69+
engineers. Familiarity with Spack, Conda, and HPC system environments is
70+
assumed.
71+
72+
➡ Start with: [Updating the E3SM Spack Fork](spack-updates.md)
Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
# Updating the E3SM Spack Fork
2+
3+
E3SM-Unified relies on a custom fork of Spack to build performance-critical
4+
software components that are not managed by Conda. This fork includes
5+
specialized packages (e.g., `moab`, `tempestremap`, `esmf`) and system-aware
6+
configurations to support a wide range of HPC environments.
7+
8+
This page outlines the steps for updating and managing the E3SM Spack fork
9+
during an E3SM-Unified release cycle.
10+
11+
---
12+
13+
## Repo Location
14+
15+
The E3SM Spack fork lives at:
16+
🔗 [https://github.com/E3SM-Project/spack](https://github.com/E3SM-Project/spack)
17+
18+
---
19+
20+
## Key Tasks
21+
22+
### 1. Add or Update Package Versions
23+
24+
You may need to:
25+
26+
* Add new versions of packages like `nco`, `moab`, `esmf`, `tempestremap`, etc.
27+
* Update build configurations, variants, or patches
28+
* Rebase onto new releases of the main [Spack repo](https://github.com/spack/spack)
29+
30+
Follow Spack’s standard packaging conventions. Builds will typically be tested
31+
as part of E3SM-Unified deployment (or deployment of Polaris or Compass), so
32+
no other testing is usually necessary or practical.
33+
34+
After changes are validated, push them to the appropriate branch or branches
35+
(see next section).
36+
37+
---
38+
39+
### 2. Create `spack_for_mache_<version>` Branches
40+
41+
The main development branch on E3SM's Spack fork is `develop`. Each release of
42+
`mache` also references a specific Spack branch named:
43+
44+
```
spack_for_mache_<version>
```
47+
48+
Example:
49+
50+
```
spack_for_mache_1.32.0
```
53+
54+
To create one from a local clone of the E3SM spack repo:
55+
56+
```bash
git checkout develop
git checkout -b spack_for_mache_1.32.0
git push origin spack_for_mache_1.32.0
```
61+
This ensures that the version of `mache` used for deployment has a stable and
62+
reproducible Spack reference. During development of a `mache` version, this
63+
also lets you make potentially breaking changes to `spack_for_mache_<version>`
64+
for testing without breaking the `develop` branch. (Make sure to always push
65+
your changes to `origin` so they are available during E3SM-Unified deployment.)
66+
67+
**Note**: Your `spack_for_mache_<version>` branch name should not include
68+
`rc<n>` even if you are testing a release candidate of `mache` as part of your
69+
E3SM-Unified deployment. The deployment scripts automatically strip off the
70+
`rc<n>` part when determining the name of the appropriate spack branch.
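
The mapping described in this note can be sketched as follows. This is only an illustration of the naming rule, not the code used by the deployment scripts; the function name `spack_branch_for` is hypothetical.

```python
import re


def spack_branch_for(mache_version):
    """Return the Spack branch name for a mache version, ignoring any rc<n> suffix."""
    # Drop a trailing release-candidate marker such as "rc1" or "rc12".
    base_version = re.sub(r"rc\d+$", "", mache_version)
    return f"spack_for_mache_{base_version}"


print(spack_branch_for("1.32.0rc1"))  # spack_for_mache_1.32.0
print(spack_branch_for("1.32.0"))     # spack_for_mache_1.32.0
```

Both the release candidate and the final version resolve to the same branch, which is why a single `spack_for_mache_<version>` branch serves the whole release cycle of a given `mache` version.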
71+
72+
Once you have a relatively stable `spack_for_mache_<version>` branch, you can
73+
push the changes you have made to `develop` so they are available for future
74+
`mache` versions and other users of E3SM's spack fork.
75+
76+
```bash
git checkout develop
git reset --hard spack_for_mache_1.32.0
git push origin develop
```
81+
Please be careful not to use `git push --force` here. You should only be
82+
adding new commits, not changing the history of `develop`.
83+
84+
### 3. Rebasing `develop` onto Spack Releases
85+
86+
One important maintenance task for the E3SM Spack fork is to keep it up-to-date
87+
with the [main Spack repo](https://github.com/spack/spack). This requires
88+
rebasing the `develop` branch onto the new Spack release, interactively
89+
selecting only commits authored within the E3SM Spack fork (i.e., excluding
90+
upstream Spack commits), and troubleshooting any merge conflicts that arise.
91+
92+
Because this will involve a force-push, it is important to coordinate with
93+
other users of the fork. Make an issue similar to
94+
[this example](https://github.com/E3SM-Project/spack/issues/36) and ping
95+
relevant developers to arrange a good time for the update.
96+
97+
```bash
git checkout develop
git remote add spack/spack git@github.com:spack/spack.git
git fetch --all -p
git rebase -i spack/spack/v0.23.1
# edit the list of commits so the first is "Add v2.1.0 to v2.1.6 to TempestRemap"
git push --force origin develop
```
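
The manual edit of the commit list amounts to dropping every `pick` line that comes before the first commit authored in the fork. The sketch below shows that selection on a made-up todo list. The helper name, the hashes, and the "Upstream change" and "Update MOAB package" subjects are hypothetical; only the TempestRemap subject comes from the example above. It is meant to clarify the step, not to replace reviewing the list by hand.

```python
def keep_fork_commits(todo_lines, first_fork_subject):
    """Keep 'pick' lines starting from the first fork-authored commit."""
    kept = []
    found = False
    for line in todo_lines:
        if not line.startswith("pick "):
            continue  # ignore blank lines and comments in this sketch
        if line.endswith(first_fork_subject):
            found = True
        if found:
            kept.append(line)
    return kept


# Made-up todo list in the order an interactive rebase would present it.
todo = [
    "pick 1111111 Upstream change A",
    "pick 2222222 Upstream change B",
    "pick 3333333 Add v2.1.0 to v2.1.6 to TempestRemap",
    "pick 4444444 Update MOAB package",
]

for line in keep_fork_commits(todo, "Add v2.1.0 to v2.1.6 to TempestRemap"):
    print(line)
# pick 3333333 Add v2.1.0 to v2.1.6 to TempestRemap
# pick 4444444 Update MOAB package
```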
105+
106+
You may wish to perform the rebase using a new branch (e.g.,
107+
`rebase-onto-v0.23.1`) that you can point to in the issue you post to
108+
coordinate with other developers. This way, you can ask for guidance if you
109+
are unsure about the way you resolved any merge conflicts that arose.
110+
111+
---
112+
113+
## Best Practices
114+
115+
* Keep `develop` clean and stable — avoid experimental changes
116+
* Use branches to track specific `mache` releases
117+
* Coordinate with other E3SM package maintainers when rebasing the `develop`
118+
branch or updating shared packages
119+
120+
---
121+
122+
➡ Next: [Updating `mache`](mache-updates.md)
