Commit 5759ce0

Merge pull request #113 from falconstryker/devel
Minor paper formatting changes
2 parents 13d1239 + 61ac725 commit 5759ce0

2 files changed

Lines changed: 53 additions & 55 deletions

.github/workflows/marsfiles_test.yml

Lines changed: 2 additions & 7 deletions
@@ -26,13 +26,8 @@ jobs:
2626
test:
2727
strategy:
2828
matrix:
29-
include:
30-
- os: ubuntu-latest
31-
python-version: '3.11'
32-
- os: macos-latest
33-
python-version: '3.11'
34-
- os: windows-latest
35-
python-version: '3.11'
29+
os: [ubuntu-latest, macos-latest, windows-latest]
30+
python-version: ['3.9', '3.10', '3.11']
3631
fail-fast: false
3732
runs-on: ${{ matrix.os }}
3833
steps:
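The hunk above replaces three explicit `include` entries (all on Python 3.11) with two matrix axes. GitHub Actions runs one job per combination of matrix values, so the test job grows from 3 runs to 9. The following is a minimal sketch of that expansion in Python; the `expand_matrix` helper is a hypothetical name used only for illustration and is not part of GitHub Actions or of CAP.

```python
from itertools import product

# Matrix axes copied from the updated workflow file.
matrix = {
    "os": ["ubuntu-latest", "macos-latest", "windows-latest"],
    "python-version": ["3.9", "3.10", "3.11"],
}


def expand_matrix(axes):
    """Return one dict per combination of axis values (Cartesian product).

    Hypothetical helper for illustration; it only mimics how a two-axis
    matrix fans out into individual jobs.
    """
    keys = list(axes)
    return [dict(zip(keys, combo)) for combo in product(*(axes[k] for k in keys))]


jobs = expand_matrix(matrix)
for job in jobs:
    print(f"{job['os']} / Python {job['python-version']}")
```

Note that the version strings are quoted in the workflow. An unquoted `3.10` in YAML is read as the number 3.1, which would select the wrong interpreter.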

docs/paper.md

Lines changed: 51 additions & 48 deletions
@@ -1,57 +1,60 @@
1-
---
2-
title: 'Community Analysis Pipeline: A Python package for processing Mars climate model data'
3-
tags:
4-
- Python
5-
- astronomy
6-
- Mars global climate model
7-
- data processing
8-
- data visualization
9-
authors:
10-
- name: Alexandre M. Kling
1+
---
2+
title: 'Community Analysis Pipeline: A Python package for processing Mars climate model data'
3+
tags:
4+
- Python
5+
- astronomy
6+
- Mars global climate model
7+
- data processing
8+
- data visualization
9+
authors:
10+
- name: Alexandre M. Kling
1111
orcid: 0000-0002-2980-7743
12-
equal-contrib: true
13-
affiliation: 1
14-
corresponding: true
15-
- name: Courtney M. L. Batterson
16-
orcid: 0000-0001-5894-095X
17-
equal-contrib: true
18-
affiliation: 1
19-
- name: Richard A. Urata
20-
orcid: 0000-0001-8497-5718
21-
equal-contrib: true
22-
affiliation: 1
23-
- name: Victoria L. Hartwick
24-
orcid: 0000-0002-2082-8986
25-
equal-contrib: true
26-
affiliation: 3
27-
- name: Melinda A. Kahre
12+
equal-contrib: true
13+
affiliation: 1
14+
corresponding: true
15+
- name: Courtney M. L. Batterson
16+
orcid: 0000-0001-5894-095X
17+
equal-contrib: true
18+
affiliation: 1
19+
- name: Richard A. Urata
20+
orcid: 0000-0001-8497-5718
21+
equal-contrib: true
22+
affiliation: 1
23+
- name: Victoria L. Hartwick
24+
orcid: 0000-0002-2082-8986
25+
equal-contrib: true
26+
affiliation: 3
27+
- name: Melinda A. Kahre
2828
orcid: 0000-0002-0935-5532
29-
equal-contrib: true
30-
affiliation: 2
31-
affiliations:
32-
- name: Bay Area Environmental Research Institute, United States
33-
index: 1
34-
- name: NASA Ames Research Center, United States
35-
index: 2
36-
- name: Southwest Research Institute, United States
37-
index: 3
38-
date: 9 May 2025
29+
equal-contrib: true
30+
affiliation: 2
31+
affiliations:
32+
- name: Bay Area Environmental Research Institute, United States
33+
index: 1
34+
ror: 024tt5x58
35+
- name: NASA Ames Research Center, United States
36+
index: 2
37+
ror: 02acart68
38+
- name: Southwest Research Institute, United States
39+
index: 3
40+
ror: 03tghng59
41+
date: 9 May 2025
3942
bibliography: paper.bib
4043

4144
---
4245

4346
# Summary
4447

45-
The Community Analysis Pipeline (CAP) is a Python package designed to streamline and simplify the complex process of analyzing large datasets created by global climate models (GCMs). CAP consists of a suite of tools that manipulate NetCDF files in order to produce secondary datasets and figures useful for science and engineering applications. CAP also facilitates inter-model and model-observation comparisons, and it is the first software of its kind to standardize these comparisons. The goal is to enable users with varying levels of programming experience to work with complex data products from a variety of GCMs and thereby lower the barrier to entry for planetary science research.
48+
The Community Analysis Pipeline (CAP) is a Python package designed to streamline and simplify the complex process of analyzing large datasets created by global climate models (GCMs). CAP consists of a suite of tools that manipulate NetCDF files in order to produce secondary datasets and figures useful for science and engineering applications. CAP also facilitates inter-model and model-observation comparisons, and it is the first software of its kind to standardize these comparisons. The goal is to enable users with varying levels of programming experience to work with complex data products from a variety of GCMs and thereby lower the barrier to entry for planetary science research.
4649

4750
# Statement of need
4851

49-
GCMs perform numerical simulations that describe the evolution of climate systems on planetary bodies. GCMs simulate physical processes within the atmosphere (and, if applicable, within the surface of the planet, ocean, and any interactions therein), calculate radiative transfer within those mediums, and use a computational fluid dynamics (CFD) solver (the “dynamical core”) to predict the transport of heat and momentum within the atmosphere. Typical GCM products include surface and atmospheric variables such as wind, temperature, and aerosol concentrations. While GCMs have been applied to planetary bodies in our Solar System (e.g. Earth, Venus, Pluto) and in other stellar systems (e.g. [@Hartwick:2023]), CAP is currently compatible with Mars GCMs (MGCMs). Several MGCMs are actively in use and under development in the Mars community, including the NASA Ames MGCM (Legacy and FV3-based versions), NASA Goddard ROCKE-3D, the Laboratoire de Météorologie Dynamique (LMD) Mars Planetary Climate Model (PCM), the Open University OpenMars, NCAR MarsWRF, NCAR MarsCAM, GFDL Mars GCM, Harvard DRAMATIC Mars GCM, Max Planck Institute Mars GCM, and GEM-Mars. Of these, CAP is compatible with four models so far: the NASA Ames MGCM, PCM, OpenMars, and MarsWRF.
52+
GCMs perform numerical simulations that describe the evolution of climate systems on planetary bodies. GCMs simulate physical processes within the atmosphere (and, if applicable, within the surface of the planet, ocean, and any interactions therein), calculate radiative transfer within those mediums, and use a computational fluid dynamics (CFD) solver (the “dynamical core”) to predict the transport of heat and momentum within the atmosphere. Typical GCM products include surface and atmospheric variables such as wind, temperature, and aerosol concentrations. While GCMs have been applied to planetary bodies in our Solar System (e.g. Earth, Venus, Pluto) and in other stellar systems (e.g. [@Hartwick:2023]), CAP is currently compatible with Mars GCMs (MGCMs). Several MGCMs are actively in use and under development in the Mars community, including the NASA Ames MGCM (Legacy and FV3-based versions), NASA Goddard ROCKE-3D, the Laboratoire de Météorologie Dynamique (LMD) Mars Planetary Climate Model (PCM), the Open University OpenMars, NCAR MarsWRF, NCAR MarsCAM, GFDL Mars GCM, Harvard DRAMATIC Mars GCM, Max Planck Institute Mars GCM, and GEM-Mars. Of these, CAP is compatible with four models so far: the NASA Ames MGCM, PCM, OpenMars, and MarsWRF.
5053

5154
MGCM output is complex in both size and structure. Analyzing the output requires GCM-specific domain knowledge. We identify the following major challenges for working with MGCM output:
5255

53-
\* Files tend to be fairly complex in structure, with output fields represented by multiple variables (e.g. air vs surface temperature), varying units (e.g. Kelvin), complex dimensional structures (e.g. 2–5 dimensions), and a variety of sampling frequencies (e.g. temporally averaged or instantaneous) on different horizontal and vertical grids.
54-
\* File sizes typically range from \~10 Gb–10 Tb for simulations describing the Martian climate over a full orbit around the Sun (depending on the number of atmospheric fields being analyzed, time sampling, and the horizontal and vertical resolutions of the run). Large files require curated processing pipelines in order to manage memory storage. This can be particularly challenging for users that do not have access to academic or enterprise clusters or supercomputers for their analyses.
56+
\* Files tend to be fairly complex in structure, with output fields represented by multiple variables (e.g. air vs surface temperature), varying units (e.g. Kelvin), complex dimensional structures (e.g. 2–5 dimensions), and a variety of sampling frequencies (e.g. temporally averaged or instantaneous) on different horizontal and vertical grids.
57+
\* File sizes typically range from \~10 Gb–10 Tb for simulations describing the Martian climate over a full orbit around the Sun (depending on the number of atmospheric fields being analyzed, time sampling, and the horizontal and vertical resolutions of the run). Large files require curated processing pipelines in order to manage memory storage. This can be particularly challenging for users that do not have access to academic or enterprise clusters or supercomputers for their analyses.
5558
\* Domain-specific knowledge is required to derive secondary variables, manipulate complex data structures, and visualize results. Working with MGCM data is especially difficult for users unfamiliar with the fields commonly output by MGCMs or the mathematical methods used in climate science.
5659

5760
CAP offers a streamlined workflow for processing and analyzing MGCM data products by providing a set of libraries and executables that facilitate file manipulation and data visualization from the command-line. This benefits existing modelers by automating both routine and sophisticated post-processing tasks. It also expands access to MGCM products by removing some of the technical roadblocks associated with processing these complex data products.
@@ -62,43 +65,43 @@ CAP has been used in multiple research projects that have been published and/or
6265

6366
# Functionality
6467

65-
CAP consists of six command-line executables that can be used sequentially or individually to derive secondary data products, thus offering a high level of flexibility. A configuration text file is provided so that users can define the input file structure (e.g., variable names, longitudinal structure, and interpolation levels) and preferred plotting style (e.g., time axis units) for their analysis. The six executables in CAP are described below:
68+
CAP consists of six command-line executables that can be used sequentially or individually to derive secondary data products, thus offering a high level of flexibility. A configuration text file is provided so that users can define the input file structure (e.g., variable names, longitudinal structure, and interpolation levels) and preferred plotting style (e.g., time axis units) for their analysis. The six executables in CAP are described below:
6669

6770
## MarsPull
6871

69-
MarsPull is a data pipeline utility for downloading MGCM data products from the NAS Data Portal ([https://data.nas.nasa.gov/](https://data.nas.nasa.gov/)) . Recognizing that each member within the science and engineering community has their own requirements for hosting proprietary Mars climate datasets (e.g. institutional servers, Zenodo, GitHub, etc.), MarsPull is intended to be a mechanism for interfacing those datasets. MarsPull enables users to query data meeting a specific criteria, such as a date range (e.g., solar longitude), which allows users to parse repositories first and download only the necessary data, thus avoiding downloading entire repositories which can be large (\>\>15Gb). A typical application of MarsPull is:
72+
MarsPull is a data pipeline utility for downloading MGCM data products from the NAS Data Portal ([https://data.nas.nasa.gov/](https://data.nas.nasa.gov/)). Recognizing that each member within the science and engineering community has their own requirements for hosting proprietary Mars climate datasets (e.g. institutional servers, Zenodo, GitHub, etc.), MarsPull is intended to be a mechanism for interfacing those datasets. MarsPull enables users to query data meeting a specific criteria, such as a date range (e.g., solar longitude), which allows users to parse repositories first and download only the necessary data, thus avoiding downloading entire repositories which can be large (\>\>15Gb). A typical application of MarsPull is:
7073

71-
`> MarsPull directory_name -f MGCM_file1.nc MGCM_file2.nc`
74+
`MarsPull directory_name -f MGCM_file1.nc MGCM_file2.nc`
7275

7376
## MarsFormat
7477

7578
MarsFormat is a utility for converting non-NASA Ames MGCM products into NASA Ames-like MGCM products for compatibility with CAP. MarsFormat reorders dimensions, adds standardized coordinates that are expected by other executables for various computations (e.g., pressure interpolation), converts variable units to conform to the International System of Units (e.g., Pa for pressure), and reorganizes coordinate values as needed (e.g., reversing the vertical pressure array for plotting). Additional, model-specific operations are performed as necessary. For example, MarsWRF data requires un-staggering latitude-longitude grids and calculating absolute fields from perturbation fields. A typical application of MarsFormat is:
7679

77-
`> MarsFormat MGCM_file.nc -gcm model_name`
80+
`MarsFormat MGCM_file.nc -gcm model_name`
7881

7982
## MarsFiles
8083

8184
MarsFiles provides several tools for file manipulation such as file size reduction, temporal and spatial filtering, and splitting or concatenating data along specified dimensions. Operations performed by MarsFiles are applied to entire NetCDF files producing new data structures with amended file names. A typical application of MarsFiles is:
8285

83-
`> MarsFiles MGCM_file.nc -flags`
86+
`MarsFiles MGCM_file.nc -flags`
8487

8588
## MarsVars
8689

8790
MarsVars performs variable operations such as adding, removing, and editing variables and computing column integrations. It is standard practice within the modeling community to avoid outputting variables that can be derived outside of the MGCM in order to minimize file size. For example, atmospheric density (rho) is easily derived from temperature and pressure and therefore typically not included in output files. MarsVars derives rho from temperature and pressure and adds it to the file with a single command line argument. A typical application of MarsVars is:
8891

89-
`> MarsVars MGCM_file.nc –add rho`
92+
`MarsVars MGCM_file.nc –add rho`
9093

9194
## MarsInterp
9295

9396
MarsInterp interpolates the vertical coordinate to a standard grid: pressure, altitude, or altitude above ground level. Vertical grids vary considerably from model to model. Most MGCMs use a pressure or hybrid pressure vertical coordinate (e.g. terrain-following, pure pressure levels, or sigma levels) in which the geometric heights and mid-layer pressures of the atmospheric layers vary in latitude and longitude. It is therefore necessary to interpolate to a standard vertical grid in order to do any rigorous spatial averaging or inter-model or observation-to-model comparisons. A typical application of MarsInterp is:
9497

95-
`> MarsInterp MGCM_file.nc -t pstd`
98+
`MarsInterp MGCM_file.nc -t pstd`
9699

97100
## MarsPlot
98101

99102
MarsPlot is the plotting utility for CAP. It accepts a modifiable text template containing a list of plots to generate (Custom.in) as input and outputs graphics to PDF or PNG. It supports multiple types of 1-D or 2-D plots, color schemes, map projections, and can customize axes range, plot titles, or contour intervals. It also supports some simple math functions to derive secondary fields not supported by MarsVars. A typical application of MarsPlot is:
100103

101-
`> MarsPlot Custom.in`
104+
`MarsPlot Custom.in`
102105

103106
# Acknowledgements
104107
