This repository contains a version of the Northwest River Forecast Center (NWRFC) autocalibration tool for parameterizing the National Weather Service River Forecast System (NWSRFS) models using an evolving dynamically dimensioned search (EDDS). NWSRFS, originally developed in the late 1970s, remains a core component of the NWS Community Hydrologic Prediction System (CHPS). This framework supports simultaneous calibration of a suite of NWSRFS models across multiple zones, including: SAC-SMA, SNOW17, Unit Hydrograph, LAGK, CHANLOSS, and CONS_USE. See the NWSRFS documentation for more detail on each individual model.
Language: R
Package Dependency: nwrfc-hydro R package
- Install R.
- Install these R packages:

  ```r
  install.packages(c('xfun', 'import', 'devtools'))
  ```

- Install the rfchydromodels R package, which requires a Fortran compiler. This package has been tested with gfortran. See here for an easy option on macOS. From R:

  ```r
  devtools::install_github('NOAA-NWRFC/nwsrfs-hydro-models', subdir = 'rfchydromodels')
  ```

  or from the command line:

  ```shell
  git clone https://github.com/NOAA-NWRFC/nwsrfs-hydro-models.git
  cd nwsrfs-hydro-models
  R CMD INSTALL rfchydromodels
  ```

- The autocalibration scripts will try to install a number of R packages when run. If this fails, you may need to install the packages manually.
NOTES:
- Due to its use of fork-based parallelism, the tool is not compatible with Windows systems.
- The code has been tested only with a 6-hour timestep. Use with other timesteps may require additional configuration and validation.
There are five basin directories included in this repo that serve as examples exercising all the features of the autocalibration tool.
Example Basins
| NWSLI ID | Name | USGS # | Zones | Description |
|---|---|---|---|---|
| FSSO3 | Nehalem at Foss, OR | 14301000 | 1 | Rain-dominated (CAMELS) |
| SAKW1 | Sauk nr Sauk, WA | 12189500 | 2 | Rain/Snow-dominated, LAGK example (CAMELS) |
| SFLN2 | Salmon Falls nr San Jacinto, NV | 13105000 | 2 | Arid basin, CONS_USE and CHANLOSS example |
| WCHW1 | Sauk ab White Chuck, WA | 12186000 | 2 | Rain/Snow-dominated, routing reach to SAKW1 (CAMELS) |
| WGCM8 | MF Flathead nr W Glacier, MT | 12358500 | 2 | Snow-dominated (CAMELS) |
*supporting files are stored in the runs/ directory
We recommend completing at least four cross-validation (CV) runs in addition to the full period-of-record (POR) run to evaluate the calibration for any potential issues.
```shell
# period of record run
./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8

# cross-validation runs
./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8 --cvfold 1
./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8 --cvfold 2
./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8 --cvfold 3
./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8 --cvfold 4

# postprocessing
./postprocess.R --dir runs/2zone --basins WGCM8

# cross-validation plots
./cv-plots.R --dir runs/2zone --basins WGCM8
```
Any of the example basins could be swapped into this same workflow.
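The recommended sequence can be scripted for any basin. A minimal sketch (the basin, directory, and objective function are placeholders; `echo` simply prints each command so the sketch runs anywhere, so remove it to execute for real):

```shell
#!/bin/sh
# Print the POR run, four CV folds, and postprocessing steps for one basin.
# Remove 'echo' to actually execute the commands.
BASIN=WGCM8
DIR=runs/2zone
OBJFUN=lognse_kge

echo ./run-controller.R --dir "$DIR" --objfun "$OBJFUN" --basin "$BASIN"
for fold in 1 2 3 4; do
  echo ./run-controller.R --dir "$DIR" --objfun "$OBJFUN" --basin "$BASIN" --cvfold "$fold"
done
echo ./postprocess.R --dir "$DIR" --basins "$BASIN"
echo ./cv-plots.R --dir "$DIR" --basins "$BASIN"
```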
Refer to the example basins in the runs/ directory for the expected directory structure and file formats.
```
[LID]/
├── flow_daily_[LID].csv             # Daily average flow observations (optional)
├── flow_instantaneous_[LID].csv     # Instantaneous flow observations (optional)
├── forcing_por_[LID]-[zone #].csv   # Forcing data for each zone (MAP, MAT, PTPS)
├── pars_default.csv                 # Default parameter file (-99 indicates the parameter will be optimized)
├── pars_limits.csv                  # Upper/lower limits for parameters that are optimized
└── [optional files...]
```
Optional Files:
- forcing_validation_cv_[fold #]_[LID]-[zone #].csv: Forcing data for cross-validation folds. The data for each fold must be created manually by subsetting your data; any number of folds is possible as long as they split the data into even groups.
- upflow_[RR LID].csv: Upstream flow data for a routing reach (LAGK model). A reach may have more than one routed upstream flow input.
Notes:
- LID: 5-character basin ID (e.g., FSSO3). This is an arbitrary basin identification code; you may swap in any unique 5-character alphanumeric identifier.
- zone #: Numeric zone ID (at least one required)
- fold #: Numeric ID for cross-validation fold, starting at 1
- RR LID: Upstream reach LID (e.g., WCHW1 for LAGK optimization)
- At least one daily or instantaneous flow file is required for autocalibration
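For instance, a two-zone basin such as WGCM8 with four CV folds would contain files along these lines (an illustrative listing only; consult the actual example directories under runs/ for the authoritative layout):

```text
WGCM8/
├── flow_daily_WGCM8.csv
├── forcing_por_WGCM8-1.csv
├── forcing_por_WGCM8-2.csv
├── forcing_validation_cv_1_WGCM8-1.csv
├── forcing_validation_cv_1_WGCM8-2.csv
├── ...                                  # folds 2-4 follow the same pattern
├── pars_default.csv
└── pars_limits.csv
```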
Run the run-controller.R script to create an optimized parameter file (pars_optimal.csv).
```
usage: run-controller.R [--] [--help] [--por] [--overwrite] [--lite]
       [--dir DIR] [--basin BASIN] [--objfun OBJFUN] [--optimizer
       OPTIMIZER] [--cvfold CVFOLD] [--num_cores NUM_CORES]

Auto-calibration run controller

flags:
  -h, --help         show this help message and exit
  -p, --por          Do a period of record run [default]
  -ov, --overwrite   Don't create new results dir, overwrite the first
                     existing one
  -l, --lite         Testing run with 1/2 the total optimizer iterations

optional arguments:
  -d, --dir          Input directory path
  -b, --basin        Basin name
  -o, --objfun       Objective function name [default: nselognse_NULL]
  --optimizer        Optimizer to use {edds [default], pso, dds}
  -c, --cvfold       CV fold to run (integer 1-4) [default: none]
  -n, --num_cores    Number of cores to allocate for run; FULL uses all
                     available cores minus 2 [default: FULL]
```
Example:

```shell
./run-controller.R --dir runs/1zone --objfun kge_NULL --basin FSSO3
```

Notes:
- The script can only calibrate one basin at a time, although multiple basins can be calibrated in parallel by allocating each run only a portion of the available cores (--num_cores #) and launching multiple scripts simultaneously (i.e., poor man's parallelization).
- Multiple runs can use the same input directory; results are placed in sequentially numbered output directories: results_por_01, results_por_02, etc.
- For cross-validation runs, use --cvfold [#]. The fold number must have a corresponding forcing file in the basin directory; see the Required Directory Structure section for more information on the cross-validation forcing file.
- For a light run (half the number of optimizer iterations), use --lite. This takes less wall-clock time but may produce a lower-quality parameter set.
- To overwrite the last results directory (i.e., not increment the output directory), use --overwrite (ignored if no results exist).
- To control the number of CPU cores used: --num_cores [#] or --num_cores FULL (all available cores minus 2).
- The number of iterations is fixed at 5000 (2500 for a lite run) and is intentionally not user-editable, based on extensive testing of how many iterations produce stable parameter sets. It remains important to manage equally viable parameter sets (i.e., equifinal solutions) by setting appropriate ranges for the optimized parameters; this is particularly important for multi-zone basins.
- Increasing the number of cores will not shorten an individual iteration, but it may help the calibration converge faster or reach a better overall solution.
- The optimizer supports calibration against daily average flows, instantaneous flows, or both.
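The "poor man's parallelization" described above can be sketched as follows (the basins and core counts are illustrative; `echo` keeps the sketch runnable without the tool installed, so remove it to launch the real controllers):

```shell
#!/bin/sh
# Launch two calibrations concurrently, each restricted to 8 cores.
# Remove 'echo' to run the real controllers in the background.
echo ./run-controller.R --dir runs/2zone --objfun lognse_kge --basin WGCM8 --num_cores 8 &
echo ./run-controller.R --dir runs/2zone --objfun lognse_kge --basin SAKW1 --num_cores 8 &
wait  # block until both background jobs have finished
```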
Objective Function Naming Convention:
Format: <daily_metric>_<instant_metric>
Default: nselognse_NULL
Available Objective Functions:
| <daily_metric>_<instant_metric> | Additional Options |
|---|---|
| nselognse_NULL | lognse_r2 |
| nsenpbias_NULL | kgelf_kge |
| kge_NULL | lognse_npbias95th |
| NULL_nselognse | kge_npbias2000q |
| NULL_kge | npbias050607m_kge |
| lognse_nse | nse.25wlognse.75w_npbias99th |
| lognse_kge | lognse.4W_nse1112010203m.6W |
To create a custom objective function, edit the obj_fun.R file and add your own function.
Notes:
- Do not include the _obj portion of the function name in the argument.
- Functions must accept results_daily and results_inst as inputs.
- Selection of an objective function should consider the availability of daily and instantaneous flow observations.
- Errors in custom functions should produce descriptive messages when running run-controller.R, starting with "Objective Function had the following error, exiting:"
- The obj_fun.R file includes additional guidance.
Once run-controller.R has been run and the pars_optimal.csv file has been created in a run directory, postprocess.R can be run to create simulation timeseries CSV files and other supporting tables. Data files are output into <dir>/<basin>/<run> and plots are output into <dir>/<basin>/<run>/plots.
```
usage: postprocess.R [--] [--help] [--dir DIR] [--basins BASINS]
       [--run RUN]

Auto-calibration postprocessor

flags:
  -h, --help     show this help message and exit

optional arguments:
  -d, --dir      Input directory path containing basin directories
  -b, --basins   Basins to run
  -r, --run      Output directories to postprocess
```
Example:

```shell
./postprocess.R --dir runs/2zone --basins SFLN2 --run results_cv_3_01
```

Notes:
- By default, the script processes all completed runs for all basins in the --dir path unless --basins or --run is specified.
cv-plots.R generates plots comparing CV metrics with results from a stationary bootstrap of the POR run. Plots are output into <dir>/cv_plots.
```
usage: cv-plots.R [--] [--help] [--cleanup] [--dir DIR]
       [--basins BASINS]

Creates CV Plots

flags:
  -h, --help      show this help message and exit
  -c, --cleanup   Option to delete results directories which are not
                  used for CV analysis

optional arguments:
  -d, --dir       Input directory path containing basin directories
  -b, --basins    Basins to run
```
Example:

```shell
./cv-plots.R --dir runs/2zone --basins WGCM8 SAKW1
```

Notes:
- When multiple POR or CV runs exist, the runs with the highest KGE score are used to develop the CV plots.
- If the --basins argument is omitted, all available basins are run.
- Bootstrapping draws x-year samples from the POR run, where x is the average CV fold length; 8,000 bootstrap iterations are performed.
Use --help to view argument options:

```shell
./run-controller.R --help
./postprocess.R --help
./cv-plots.R --help
```

CHANLOSS notes:
- Multiple CHANLOSS models can exist per basin.
- Overlapping time ranges are averaged.
- Start/end months can span across years (e.g., Nov–Feb = 11–2).
- cl_type must be 1 or 2:
  - 1: VARP adjustment (multiplier on simulation)
  - 2: VARC adjustment (subtracted from simulation)
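As an illustration only (the parameter names below are hypothetical; see the SFLN2 example basin in runs/ for the actual pars_default.csv format), a CHANLOSS spanning November through February with a VARC adjustment would use a start month of 11, an end month of 2, and cl_type = 2:

```text
# hypothetical pars_default.csv fragment (names are illustrative)
chanloss_start_month, 11   # November
chanloss_end_month,   2    # February; the range spans the year boundary
chanloss_cl_type,     2    # 2 = VARC: value subtracted from the simulation
```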
Each zone requires:
- MAP: Mean Areal Precipitation
- MAT: Mean Areal Temperature
- PTPS: % Precipitation as Snow
Note: Forcing data are end-of-timestep; simulated flow is start-of-timestep. Observational time series may need to be shifted forward or back one timestep to comply with this convention.
For LAGK calibration, upstream flows are derived using AdjustQ.
See nwrfc-hydro R package for equivalent Python code.
In the NWRFC autocalibration scheme, mid-month climatological adjustment factors are optimized independently for each forcing variable: precipitation, temperature, precipitation typing, and potential evaporation. To disable climatological corrections for MAT, MAP, PTPS, or PET:
- Remove any related lines in pars_limits.csv (e.g., mat_shift_[LID])
- Set the following in pars_default.csv for each variable:

```
[forcing]_scale = 1
[forcing]_p_redist = 0
[forcing]_std = 10
[forcing]_shift = 0
```
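For example, assuming mat is the prefix for MAT parameters (consistent with the mat_shift_[LID] example above), disabling the MAT correction would mean setting the following in pars_default.csv:

```text
mat_scale = 1
mat_p_redist = 0
mat_std = 10
mat_shift = 0
```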
Please cite the following work when using this tool:
Walters, G., Bracken, C., et al., "A comprehensive calibration framework for the Northwest River Forecast Center." Unpublished manuscript, Submitted 2025, Preprint
If adapting this code, please credit this repository as the original source.
The traditional dynamically dimensioned search (DDS) algorithm builds on original code by David Kneis ([email protected]). See: dds.r GitHub
This is a scientific product and does not represent official communication from NOAA or the U.S. Department of Commerce. All code is provided "as is."
See full disclaimer: NOAA GitHub Policy
National Oceanic and Atmospheric Administration | National Weather Service | Northwest River Forecast Center
