A workflow tool for weather-model verification, leveraging uwtools to drive MET.
- Download, install, and activate Miniforge, the conda-forge project's implementation of Miniconda. This step can be skipped if you already have a conda installation you want to use. Linux
aarch64andx86_64systems are currently supported. The example shown below is for a Linuxaarch64system, so if your system is Intel/AMD, download thex86_64installer.
wget https://github.com/conda-forge/miniforge/releases/download/25.3.0-3/Miniforge3-Linux-aarch64.sh
bash Miniforge3-Linux-aarch64.sh -bfp conda
rm Miniforge3-Linux-aarch64.sh
. conda/etc/profile.d/conda.sh
conda activateNOTE: If you need a development environment in which to develop and test wxvx code, skip the following step and refer to the Development section.
- Create and activate a conda virtual environment providing the latest
wxvx. (Add the flags-c conda-forge --override-channelsto theconda createcommand if you are using a non-conda-forge conda installation.)
conda create -y -n wxvx -c ufs-community -c paul.madden wxvx
conda activate wxvx
wxvx --versionThe activated virtual environment includes the met2go package, which provides MET, select METplus executables, and data files. See the met2go docs for more information.
An overview of the content of the YAML configuration file specified via -c / --config is described in the table below. See the subsections below for more detailed information.
┌────────────────────┬───────────────────────────────────────────┐
│ Key │ Description │
├────────────────────┼───────────────────────────────────────────┤
│ baseline: │ Description of the baseline dataset │
│ name: │ Name of the baseline model or 'truth' │
│ url: │ Template for baseline GRIB location │
│ cycles: │ Cycles to verify │
│ start: │ First cycle │
│ step: │ Interval between cycles │
│ stop: │ Last cycle │
│ forecast: │ Description of the forecast dataset │
│ coords: │ Names of coordinate variables │
│ latitude: │ Latitude variable │
│ level: │ Level variable │
│ longitude: │ Longitude variable │
│ time: │ Names of time variables │
│ inittime: │ Forecast initialization time │
│ leadtime: │ Forecast leadtime │
│ validtime: │ Forecast validtime │
│ mask: │ Sequence of [lat, lon] pairs (optional) │
│ name: │ Dataset descriptive name │
│ path: │ Filesystem path to Zarr/netCDF dataset │
│ projection: │ Projection information (optional) │
│ leadtimes: │ Leadtimes to verify │
│ start: │ First leadtime │
│ step: │ Interval between leadtimes │
│ stop: │ Last leadtime │
│ meta: │ Optional free-form data section │
│ paths: │ Paths │
│ grids: │ Where to store... │
│ baseline: │ Baseline grids │
│ forecast: │ Forecast grids │
│ truth: │ Truth grids │
│ obs: │ Where to store observations │
│ run: │ Where to store run data │
│ regrid: │ MET regrid options │
│ method: │ Regridding method │
│ to: │ Destination grid │
│ truth: │ Description of the truth dataset │
│ name: │ Dataset descriptive name │
│ type: │ Either 'grid' or 'point' │
│ url: │ Template for grid or point truth files │
│ variables: │ Mapping describing variables to verify │
│ VAR: │ Forecast-dataset variable name │
│ level_type: │ Generic level type │
│ levels: │ Sequence of level values │
│ name: │ Canonical variable name │
└────────────────────┴───────────────────────────────────────────┘
Describes a baseline dataset to be verified, alongside forecasts from the experimental model, against the truth dataset.
The name of a GRIB-formatted, wxvx-recognized model (currently GFS or HRRR) to treat as a baseline, or the reserved name truth to specify that the same GRIB-formatted dataset described by the truth: block should be used.
Specifies the location of baseline data. Values may be local-filesystem paths (optionally prefixed with file://) or HTTP URLs prefixed with http:// or https:// and may contain Jinja2 expressions. This value must be omitted when baseline.name is truth, and must be supplied otherwise.
The start and stop values should be in optionally quoted ISO8601 year/month/date/hour/minute/second form, e.g. 2025-06-03T12:00:00. The step value should be either an int specifying the number of hours, or a quoted string of the form hours[:minutes[:seconds]] specifying hours and, optionally, minutes and seconds.
When using start / step / stop syntax, the final cycle is included in verification. That is, the range is inclusive of its upper bound.
Alternatively, the cycles to verify may be specified as an arbitrary list of ISO8601-formatted values, e.g.
cycles:
- 2025-06-01T06:00:00
- 2025-06-02T12:00:00
- 2025-06-03T18:00:00or
cycles: [2025-06-01T06:00:00, 2025-06-02T12:00:00, 2025-06-03T18:00:00]Specify values under forecast.coords.time as follows:
inittime: The name of the variable or attribute providing the forecast initialization time, aka cycle or, per CF Conventions, forecast reference time. Required.leadtime: The name of the variable or attribute providing the forecast leadtime. Exactly one ofleadtimeandvalidtimemust be specified.validtime: The name of the variable or attribute providing the forecast validtime. Exactly one ofvalidtimeandleadtimemust be specified.
If a variable specified under forecast.coords.time names a coordinate dimension variable, that variable will be used. If no such variable exists, wxvx will look for a dataset attribute with the given name and try to use it, coercing it to the expected type (e.g. datetime or timedelta) as needed. For example, it will parse an ISO8601-formatted string to a Python datetime object.
The forecast.mask value may be omitted, or set to the YAML value null, in which case no masking will be applied.
An arbitrary value identifying the forecast model being verified. This name will appear in MET stat output. This name must differ from truth.name.
Specifies the location of forecast data. Values must be local-filesystem paths and may contain Jinja2 expressions.
Each value should be either an int specifying the number of hours, or a quoted string of the form hours[:minutes[:seconds]] specifying hours and, optionally, minutes and seconds. (See this post for a discussion on why these values must be quoted.)
When using start / step / stop syntax, the final leadtime is included in verification. That is, the range is inclusive of its upper bound.
Alternatively, the leadtimes to verify may be specified as an arbitrary list of values, e.g.
leadtimes:
- 03:00:00
- 06:00:00
- 09:00:00or
leadtimes: [3, 6, 9]The meta: block may contain, for example, values tagged with YAML anchors referenced elsewhere via aliases (see the Aliases section here), or values referenced elsewhere in Jinja2 expressions to be rendered by uwtools (see examples in here).
Defines paths where various runtime assets are to be written.
Where to store grids extracted from GRIB baseline datasets. When baseline.name is truth, this value will be ignored (paths.grids.truth will be used instead), and a warning will be issued if it is set.
Where to store grids extracted from netCDF or Zarr forecast datasets. GRIB forecast datasets currently must be local, and grids are processed directly from their containing files without being extracted into separate files.
Where to store grids extracted from GRIB truth datasets.
Where to store raw prepbufr files when they must be downloaded, and where to store netCDF files created from prepbufr sources with the MET tool pb2nc.
Where to store run directories for various wxvx workflow tasks: run directories for MET programs pb2nc, grid_stat, and point_stat, as well as those for plots.
The forecast.projection block is optional, and defaults to a latlon projection when not specified. When provided, the forecast.projection value should be a mapping with at least a proj key identifying the ID of the projection, and potentially additional projection attributes depending on the proj value:
- When
projislatlon, specify no additional attributes. - When
projislcc, specify attributesa,b,lat_0,lat_1,lat_2, andlon_0.
Options are listed here (default: NEAREST).
Regrid grids and observations to the specified grid. Options are truth, forecast, or a GNNN grid ID (default: forecast). Option truth must not be used when truth.type is point.
Name of the truth to verify against. Currently supported values are: GFS, HRRR, PREPBUFR. This value guides wxvx in identification of truth data (grids or obs) corresponding to the forecast variable being verified. This name will also appear in MET stat output. This name must differ from forecast.name.
One of grid or point.
-
For
grid,urlshould point to GRIB data, andnameshould be one of:GFS,HRRR. -
For
point,urlshould point to prepbufr data,nameshould bePREPBUFR, andregrid.tomust not betruth.
Specifies the location of truth data. Values may be local-filesystem paths (optionally prefixed with file://) or HTTP URLs prefixed with http:// or https:// and may contain Jinja2 expressions.
The variables: block is a mapping from forecast-dataset variable names (keys) to generic descriptions of the named variables (values). Generic-description attributes (names and level types) follow ECMWF conventions: See the Parameter Database for names, and this list or the output of grib_ls run on a GRIB file containing the variable in question, for level types.
For netCDF- or Zarr-formatted forecast datasets, each key is used to select the correct dataset variable, so must correspond to a variable name in the dataset. For GRIB-formatted forecast datasets, variable selection is done only using the generic description given by each key's value, so the keys must be unique but are otherwise arbitrary.
Currently supported level types are: atmosphere, heightAboveGround, isobaricInhPa, surface.
A levels: value should only be specified if a level type supports it. Currently, these are: heightAboveGround, isobaricInhPa.
Some values may include Jinja2 expressions, processed at run-time with jinja2.Template.render():
- Variables
yyyymmdd(cycle date, astr),hh(cycle hour, astr), andfh(forecast hour, aka leadtime, anint) -- as well ascycle(adatetime) andleadtime(atimedelta), useful for computing arbitrary time strings -- will be supplied bywxvxfor use in expressions. - The syntax
'{{ "VAR" | env }}'can be used to insert the value of theVARenvironment variable, using thewxvx-suppliedenvJinja2 filter. - Values defined in the user-specified
meta:section (or any other section in the config) can be accessed with e.g.'{{ meta.foo.bar }}'to insert the value of keybarnested under keysmetaandfoo.
$ wxvx --help
usage: wxvx [-c FILE] [-t TASK] [-d] [-h] [-k] [-l] [-n N] [-s] [-v]
wxvx
Required arguments:
-c, --config FILE
Configuration file
-t, --task TASK
Task to execute
Optional arguments:
-d, --debug
Log all messages
-h, --help
Show help and exit
-k, --check
Check config and exit
-l, --list
List available tasks and exit
-n, --threads N
Number of threads
-s, --show
Show the realized config and exit
-v, --version
Show version and exit
Consider a config.yaml
baseline:
name: truth
cycles:
start: 2025-03-01T00:00:00
step: 1
stop: 2025-03-01T23:00:00
forecast:
coords:
latitude: latitude
level: level
longitude: longitude
time:
inittime: time
leadtime: lead_time
mask:
- [52.61564933, 225.90452027]
- [52.61564933, 299.08280723]
- [21.138123, 299.08280723]
- [21.138123, 225.90452027]
name: ML
path: /path/to/forecast.zarr
projection:
a: 6371229
b: 6371229
lat_0: 38.5
lat_1: 38.5
lat_2: 38.5
lon_0: 262.5
proj: lcc
leadtimes: [3, 6, 9]
meta:
grids: "{{ meta.workdir }}/grids"
levels: &levels [800, 1000]
workdir: /path/to/workdir
paths:
grids:
forecast: "{{ meta.grids }}/forecast"
truth: "{{ meta.grids }}/truth"
run: "{{ meta.workdir }}/run"
truth:
name: HRRR
type: grid
url: https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.{{ yyyymmdd }}/conus/hrrr.t{{ hh }}z.wrfprsf{{ "%02d" % fh }}.grib2
variables:
HGT:
level_type: isobaricInhPa
levels: *levels
name: gh
REFC:
level_type: atmosphere
name: refc
T2M:
level_type: heightAboveGround
levels: [2]
name: 2tThis config directs wxvx to find forecast data, in Zarr format and on a Lambert Conformal grid with the given specification, under /path/to/forecast.zarr.
In the forecast dataset, coordinate variables (or attributes) representing latitude, longitude, and vertical level will be named, unsurprisingly, latitude, longitude, and level. Forecast validtime is represented by a combination of initialization time and leadtime, and coordinate variables (or attributes) called time and lead_time, respectively, provide these values. (Alternatively, a forecast dataset might provide validtime instead of leadtime, with the wxvx YAML key validtime used instead of leadtime.)
Verification will be limited to points within the bounding box given by mask.
The forecast will be called ML in MET .stat files and in plots.
It will be verified against HRRR analysis as truth, which can be found in GRIB files located in an AWS bucket whose URL is given as the truth.url value, where yyyymmdd, hh, and fh will be filled in by wxvx. (The yyyymmdd and hh values are strings like 20250523 and 06, while fh is an int value to be formatted as needed.)
24 1-hourly cycles starting at 2025-03-01 00Z, each with forecast leadtimes 3, 6, and 9, will be verified.
Variable grids extracted from truth datasets will be written to /path/to/workdir/truth, forecast dataset to /path/to/workdir/forecast, and run output to /path/to/workdir/run. The Jinja2 expressions inside {{ }} markers will be processed by uwtools and may use any features it supports.
Three variables -- geopotential height, composite reflectivity, and 2-meter temperature, will be verified. The keys under variables map the names of the variables as they appear in the forecast dataset to a canonical description of the variable using ECMWF variable names and level-type descriptions (see the notes in the Configuration section for links). (Note that some variables do not support a "level" concept.) The full verification task-graph will comprise: cycles x leadtimes x variables x levels.
Finally, because baseline.name is set to truth, HRRR forecasts with validtimes matching those of the ML model's forecasts will be verified against HRRR analysis, producing MET statistics. If the plots task is requested, the ML and HRRR stats will be plotted together.
Invoking wxvx -c config.yaml -t grids_truth would stage the truth grids to disk, only; -t grids_forecast would stage the forecast grids; -t grids would stage both. Specifying -t stats would produce statistics via MET tools, but also stage grids if they are not already available, since the grids are required by the MET processes. Specifying -t plots would plot statistics, but also produce statistics (and stage grids) if they are not already available.
When wxvx extracts grids from the forecast dataset and writes them to netCDF files to be processed by MET, it decorates them with certain CF Metadata as required by MET. See this database for CF standard names and units.
- In the
baseenvironment of a Miniforge installation, install thecondevpackage. (Add the flags-c conda-forge --override-channelsto theconda createcommand if using a non-conda-forge conda installation.)
conda install -y -c maddenp condev- In the root directory of a
wxvxgit clone:
make devshellThis will create and activate a conda virtual environment named DEV-wxvx, where all build, run, and test requirement packages are available, and the code under src/wxvx/ is live-linked into the environment, such that code changes are immediately live and testable. Several make targets are available: make format formats Python code and JSON documents, and make test runs all code-quality checks (equivalent to make lint && make typecheck && make unittest, targets which can also be run independently). A common development idiom is to periodically run make format && make test.
When you are finished, type exit to return to your previous shell. The DEV-wxvx environment will still exist, and a future make devshell command will more-or-less instantly activate it again.
To update the version or build number:
- Edit
src/wxvx/resources/info.jsonappropriately. When updating the version number, reset the build number to0. - In a
baseconda environment with thecondevpackage installed (see above), runmake meta. This will update therecipe/meta.*files. - Commit these changes.
conda install -y pygrib
python -c "import pygrib; print(pygrib.open('a.grib2').message(1).projparams)"