Run parameter sweeps and Monte Carlo dispersions over GMAT missions in parallel from Python.
A parallel orchestrator on top of gmat-run's
single-run primitive. Point gmat-sweep at a working .script and either a parameter
grid, an explicit run table, or a perturbation distribution, and it fans the run set
across subprocess workers, aggregates each run's ReportFile (and any EphemerisFile
or ContactLocator outputs) into multi-indexed pandas DataFrames, and writes a JSON
Lines manifest alongside the results so any sweep is reproducible bit-for-bit. Killed
sweeps reload from the manifest and re-run only the missing or failed runs.
The four entry points cover the common shapes:
- `sweep(grid=...)` — full-factorial grid over one or more dotted-path fields.
- `sweep(samples=DataFrame)` — explicit-row sweep where you pre-build the run set (Halton, Sobol, custom design).
- `monte_carlo(perturb=...)` — stochastic dispersion with named distributions and a deterministic seed contract.
- `latin_hypercube(perturb=...)` — stratified sampling for variance reduction at small `n`.
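Grid expansion is plain full-factorial: one run per combination of the listed values. A minimal sketch of what a `grid={...}` spec expands to (the `expand_grid` helper is illustrative, not the library's internals):

```python
from itertools import product

import pandas as pd

def expand_grid(grid: dict) -> pd.DataFrame:
    """Full-factorial expansion: one row per combination, one column per dotted path."""
    keys = list(grid)
    rows = [dict(zip(keys, combo)) for combo in product(*grid.values())]
    return pd.DataFrame(rows)

# 3 SMA values x 2 ECC values -> 6 runs.
runs = expand_grid({"Sat.SMA": [7000, 7100, 7200], "Sat.ECC": [0.0, 0.01]})
```

A table in this shape is also what `sweep(samples=...)` accepts directly when you want to bypass the grid and supply your own design.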
- Not a single-run runner — that's `gmat-run`; every `gmat-sweep` worker calls into it.
- Not a way to build GMAT missions from scratch in Python — see `gmatpyplus`.
- Not a `.script` text generator — see `pygmat`.
- Not an optimiser. Gradient-, Bayesian-, and population-based optimisation (CasADi, pagmo2, scikit-optimize) is a different problem; `gmat-sweep` may serve as the parallel evaluator inside one, but it ships no optimiser of its own.
- Not a workflow engine. `gmat-sweep` runs homogeneous parametric sweeps of one mission; Snakemake / Nextflow / Hamilton manage DAGs of heterogeneous tasks. A workflow engine can schedule a `gmat-sweep` step; the converse is not interesting.
- Python 3.10, 3.11, or 3.12.
- `gmat-run` ≥ 0.3 — installed as a transitive dependency from PyPI. `gmat-sweep` never imports `gmatpy` directly; the import happens inside each worker subprocess on first call.
- A local GMAT install. `gmat-sweep` does not ship GMAT binaries; it relies on `gmat-run`'s install discovery, which honours `$GMAT_ROOT` or finds a build under a conventional path. Download GMAT from the SourceForge release page — see `gmat-run`'s install guide for the unpack-and-discover steps.
| GMAT release | Status | CI |
|---|---|---|
| R2026a | Primary development target | Exercised on every PR (Ubuntu + Windows + macOS, Python 3.10/3.11/3.12) |
| R2025a | Supported | Exercised on every PR (Ubuntu + Windows + macOS, Python 3.10/3.11/3.12) |
R2023a and R2024a were never released by the upstream GMAT project; R2025a and R2026a are the only releases supported.
```
pip install gmat-sweep
```

The `[examples]` extra pulls in matplotlib for the example notebooks:

```
pip install gmat-sweep[examples]
```

```python
from gmat_sweep import LocalJoblibPool, sweep

df = sweep(
    "mission.script",
    grid={"Sat.SMA": [7000, 7100, 7200]},
    backend=LocalJoblibPool(max_workers=8),
)
print(df)
```

That call runs `mission.script` three times — once per `Sat.SMA` value — each in a fresh
subprocess, and returns a (run_id, time)-MultiIndexed pandas.DataFrame containing
the rows from every run's ReportFile plus a __status column flagging
ok / failed / skipped. A single failed run lands as a failed row with the captured
GMAT stderr in the manifest — never as a silent zero-row DataFrame and never as an
unhandled exception that aborts the whole sweep.
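The shape of that return value can be illustrated with plain pandas (hand-built toy data, not output from the library):

```python
import pandas as pd

# Two toy runs, two time steps each, mimicking the (run_id, time) MultiIndex
# and the __status column described above.
frames = []
for run_id, sma in [(0, 7000.0), (1, 7100.0)]:
    f = pd.DataFrame({
        "time": [0.0, 60.0],
        "Sat.Altitude": [sma - 6378.14, sma - 6378.14],
        "__status": "ok",
    })
    f["run_id"] = run_id
    frames.append(f)

df = pd.concat(frames).set_index(["run_id", "time"])

ok = df[df["__status"] == "ok"]            # drop failed / skipped rows
per_run = ok.groupby(level="run_id").max()  # one summary row per run
```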
For a stochastic dispersion, swap sweep
for monte_carlo and pass a
perturb mapping of named distributions:
```python
from gmat_sweep import LocalJoblibPool, monte_carlo

df = monte_carlo(
    "mission.script",
    n=1000,
    perturb={"Sat.SMA": ("normal", 7100.0, 50.0)},
    backend=LocalJoblibPool(max_workers=8),
    seed=42,
)
```

Returns the same DataFrame shape as `sweep()`. Per-run sub-seeds derive from `seed` via
numpy.random.SeedSequence.spawn, so the draw is bit-reproducible and a resumed sweep
samples the same values for any given run_id. See the
Monte Carlo guide for the full
determinism contract and latin_hypercube
for the stratified-sampling variant.
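The `SeedSequence.spawn` contract described above can be sketched directly with NumPy; `run_seed` is an illustrative helper, not the library's API:

```python
import numpy as np

def run_seed(parent_seed: int, run_id: int) -> np.random.Generator:
    # Spawn one child SeedSequence per run_id. Child i is a pure function of
    # (parent_seed, i), so a resumed sweep re-derives the identical stream
    # for any given run_id without replaying the others.
    child = np.random.SeedSequence(parent_seed).spawn(run_id + 1)[run_id]
    return np.random.default_rng(child)

a = run_seed(42, 7).normal(7100.0, 50.0)
b = run_seed(42, 7).normal(7100.0, 50.0)  # "resumed" run 7: same draw
```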
By default the per-run Parquet files and the manifest land in a temporary directory
whose lifetime is tied to the returned DataFrame. Pass out=Path(...) to keep them —
that's also what enables resuming a killed sweep
via Sweep.from_manifest(...).resume() or gmat-sweep resume <manifest>.
For multi-host sweeps, swap the local pool for DaskPool or RayPool — same
sweep() / monte_carlo() / latin_hypercube() call shape, different backend=:
```python
from gmat_sweep import sweep
from gmat_sweep.backends import DaskPool

with DaskPool(n_workers=8) as pool:
    df = sweep(
        "mission.script",
        grid={"Sat.SMA": [7000, 7100, 7200]},
        backend=pool,
    )
```

`DaskPool` and `RayPool` ship behind `pip install gmat-sweep[dask]` /
gmat-sweep[ray]. See the backends page
for the full set of pool patterns and the
cluster recipes for
Slurm / Kubernetes / Ray autoscaling wiring.
A gmat-sweep console script is also installed for shell-script and CI use:
```
gmat-sweep run --grid Sat.SMA=7000:7200:3 --workers 8 --out ./sweep mission.script
gmat-sweep run --grid Sat.SMA=7000:7200:3 --backend dask --workers 8 --out ./sweep mission.script
gmat-sweep monte-carlo --n 1000 --perturb 'Sat.SMA=normal:7100:50' --seed 42 --out ./mc mission.script
gmat-sweep resume --script mission.script --workers 8 ./mc/manifest.jsonl
gmat-sweep show ./sweep/manifest.jsonl
gmat-sweep archive --out ./sweep.zip ./sweep/manifest.jsonl
```

See the CLI reference in the docs for every subcommand and the full mini-grammar.
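The `START:STOP:COUNT` reading of the `--grid` value is inferred from the example above (`7000:7200:3` matching the three-point grid in the quick start); a sketch of a parser under that assumption:

```python
def parse_grid_arg(spec: str) -> tuple[str, list[float]]:
    """Parse a --grid argument of the (assumed) form FIELD=START:STOP:COUNT.

    'Sat.SMA=7000:7200:3' -> ('Sat.SMA', [7000.0, 7100.0, 7200.0]),
    i.e. an inclusive linspace over COUNT points.
    """
    field, _, rng = spec.partition("=")
    start_s, stop_s, count_s = rng.split(":")
    start, stop, count = float(start_s), float(stop_s), int(count_s)
    if count == 1:
        return field, [start]
    step = (stop - start) / (count - 1)
    return field, [start + i * step for i in range(count)]
```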
Every sweep emits two artefacts:
- The returned DataFrame — `(run_id, time)`-MultiIndexed, one column per `ReportFile` channel plus the `__status` column. Built lazily from per-run Parquet files via pyarrow's dataset API, so a 10,000-run sweep does not have to fit in memory at once.
- A JSON Lines manifest (`manifest.jsonl`) — append-only, fsync'd after every entry. Records the canonical script SHA-256, software-version fingerprint, full parameter spec, and per-run status, timing, output paths, and captured stderr. A `Ctrl-C` mid-sweep leaves the manifest in a parseable state. See the manifest schema for the full contract.
Full docs at https://astro-tools.github.io/gmat-sweep/, including a getting-started guide, the parameter spec reference, the manifest schema, the supported-version matrix, the FAQ, and the API reference.
Runnable example notebooks:
- Single-axis SMA scan — fifty runs across `np.linspace(7000, 8000, 50)` of `Sat.SMA`, parallel-dispatched and overlaid on a single altitude-vs-time plot.
- Two-axis epoch × time-of-flight grid — cartesian product over `Sat.Epoch` and a script-level `Variable TOF`, contoured by per-run miss distance.
- Surviving a kill — launch a sweep, send `SIGINT` mid-run, walk through inspecting the partial manifest with `gmat-sweep show`, then complete the sweep with `Sweep.from_manifest(...).resume()`.
- Monte Carlo dispersion — 1000-run Monte Carlo around a nominal injection burn over a four-axis perturbation cube, with arrival-miss histogram and a 3-σ covariance ellipse.
- Latin hypercube vs Monte Carlo — 64-run Latin hypercube alongside a 64-run plain Monte Carlo on the same perturbation, pair-plotting the unit-cube samples to make the stratification visible.
- Dask cluster recipe — 100-run `Sat.SMA` grid dispatched through a `distributed.LocalCluster` with `DaskPool`, same flow as a real `dask.distributed` cluster.
- Ray autoscaling recipe — 100-run Monte Carlo dispatched through `RayPool` against a local `ray.init()`, same task model as a real autoscaling Ray cluster.
- Sobol sensitivity — Saltelli design via `sobol_sample`, run through `sweep(samples=...)`, reduced to first/total-order Sobol indices via `sobol_analyze` with 95 % bootstrap CIs.
- Archive bundle — pack a finished sweep into a self-describing `.zip` via `Sweep.archive()`, inspect the layout, and re-aggregate the per-run DataFrame from the unzipped tree.
- Extending a Monte Carlo — anchor a 100-run `monte_carlo`, append 200 more via `monte_carlo_extend(n=200)`, and assert that the original 100 `run_id`s are preserved bit-for-bit.
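The stratification that distinguishes a Latin hypercube from plain Monte Carlo is easy to see in one dimension with NumPy alone (illustrative helper, not the library's sampler):

```python
import numpy as np

def latin_hypercube_1d(n: int, rng: np.random.Generator) -> np.ndarray:
    """One sample per equal-width bin of [0, 1), returned in random order."""
    return rng.permutation((np.arange(n) + rng.random(n)) / n)

rng = np.random.default_rng(42)
lhs = latin_hypercube_1d(64, rng)
plain = rng.random(64)

# Every one of the 64 bins holds exactly one LHS sample;
# plain Monte Carlo usually leaves some bins empty.
lhs_bins = np.unique(np.floor(lhs * 64)).size
plain_bins = np.unique(np.floor(plain * 64)).size
```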
To work on gmat-sweep itself:
```
git clone https://github.com/astro-tools/gmat-sweep.git
cd gmat-sweep
uv sync --all-groups
```

See CONTRIBUTING.md for the full branch / PR / test workflow.
MIT. See LICENSE.