This repository contains tools used to preprocess and run long short-term memory (LSTM) models to reconstruct streamflow at ungaged locations upstream of Catchment Attributes and MEteorology for Large-sample Studies (CAMELS) basins. The LSTM takes a known gage value (the CAMELS USGS gaged streamflow) as an input, along with meteorological and geographical inputs sourced from the National Water Model (NWM) 3.0 retrospective dataset, and its outputs are used to reconstruct streamflow at upstream ungaged locations.
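To make that setup concrete, below is a minimal, hypothetical PyTorch sketch of the general idea: the dynamic inputs include the downstream gage streamflow alongside the NWM forcings, and static basin attributes are repeated across timesteps. It is an illustration only, not the repository's `modifiedcudalstm.py`.

```python
import torch
import torch.nn as nn

class GageInformedLSTM(nn.Module):
    """Sketch of an LSTM whose dynamic inputs include the downstream CAMELS gage
    streamflow plus NWM meteorological forcings, with static basin attributes
    concatenated to every timestep. Illustrative only."""

    def __init__(self, n_dynamic: int, n_static: int, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_dynamic + n_static, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)  # streamflow at the ungaged reach

    def forward(self, dynamic: torch.Tensor, static: torch.Tensor) -> torch.Tensor:
        # dynamic: (batch, time, n_dynamic) -- forcings + downstream gage flow
        # static:  (batch, n_static)        -- basin attributes
        static_rep = static.unsqueeze(1).expand(-1, dynamic.shape[1], -1)
        out, _ = self.lstm(torch.cat([dynamic, static_rep], dim=-1))
        return self.head(out)
```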
The contents of this repository are organized in workflow order as follows:
- Data Collection
  - Link USGS/NWM
    - Converts USGS gages into their corresponding NWM reach IDs (see the sketch after this list). Use `link_usgs_to_nwm.ipynb`.
  - Get Upstream Basins
    - Collects a list of every NWM reach and basin upstream of a CAMELS basin. Use `get_upstream_basins.ipynb`.
  - Forcing Generation
    - Aggregates gridded time-series meteorological data for each NWM basin in our study area. Use `forcing_cli_camels.py`.
  - Attribute Generation
    - Collects static basin attributes for each NWM basin in our study area. Use `camels_attributes.ipynb`.
  - Streamflow
    - Adds streamflow values from the NWM retrospective to streamflow-specific basin datasets. Use `get_flow.py`.
- Custom NeuralHydrology (NH) Classes
  - Custom-defined NH models and dataset classes, as well as training methods.
    - `basetrainer.py` and `earlystopper.py` belong in `neuralhydrology/neuralhydrology/training`.
    - `config.py` belongs in `neuralhydrology/neuralhydrology/utils`.
    - `modifiedcudalstm.py` belongs in `neuralhydrology/neuralhydrology/modelzoo`.
    - `nwm3retro.py` belongs in `neuralhydrology/neuralhydrology/datasetzoo`.
- Model Configurations
  - Example model configurations used in NH.
For specific usage instructions for each notebook or script, please view the header docstring or markdown cell.
NetCDF data outputs (forcings and streamflow) are stored at s3://camels-nwm-reanalysis. Note: this bucket is incomplete.
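To inspect those outputs, something like the following sketch should work, assuming the bucket allows anonymous reads (requires `s3fs`); the object key below is a placeholder, not a guaranteed file name.

```python
import fsspec
import xarray as xr

# Assumes anonymous (public) read access; adjust credentials if needed.
fs = fsspec.filesystem("s3", anon=True)

# List available NetCDF outputs (the bucket is incomplete, so check what exists).
print(fs.ls("camels-nwm-reanalysis"))

# "example_basin.nc" is a placeholder key, not a guaranteed file name.
fs.get("camels-nwm-reanalysis/example_basin.nc", "example_basin.nc")
ds = xr.open_dataset("example_basin.nc")
print(ds)
```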
We recommend using two separate virtual environments, one for data collection and one for the LSTM.
For data collection, the dependencies are listed in `01_data_collection/01_dependencies.txt`. To install them, run:
```bash
python -m venv name_of_your_venv
source name_of_your_venv/bin/activate
pip install -r 01_data_collection/01_dependencies.txt
```
For the LSTM portion (sections 2 and 3), please follow the official NeuralHydrology instructions. We do recommend using venv instead of conda, though.
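Once NeuralHydrology is installed and the custom classes listed above are copied into place, a training run can be started from Python via NeuralHydrology's standard entry point; the config file name below is a placeholder for one of the configurations in this repository.

```python
from pathlib import Path

from neuralhydrology.nh_run import start_run

# "example_config.yml" is a placeholder; point this at one of the example
# model configurations shipped in this repository.
start_run(config_file=Path("example_config.yml"))
```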
Our code is heavily inspired by work done by Josh Cunningham (@JoshCu) and James Halgren (@jameshalgren). This code was authored by Quinn Lee (@quinnylee) and Sonam Lama (@slama0077).