This repository contains the complete codebase for producing datasets and reproducing results from the manuscript "Global dominance of seasonality in shaping lake surface extent dynamics". Due to the computationally intensive nature of the analysis and complex runtime environment requirements, we provide a Docker image for deployment on local high-performance computing systems.
Our code is available in two formats:
1. Code Ocean: Provides single-click reproducibility for generating the quantitative figures and key numbers from the manuscript. Quantitative figures are named correspondingly, and key numbers can be found in `/results/low_water_extreme_plotting.ipynb`. Note that Extended Data Figs. 1 and 2 are excluded due to quota limitations.
Estimated runtime on Code Ocean: ~15 minutes
2. GitHub: Contains the full version of the code and provides a Docker image that guarantees an identical development environment for reproducibility (the full version is intended for detailed examination).
Estimated time for full reproducibility: several months
Important Notes:
- Code Ocean Users: Simply click the `Reproducible Run` button to execute the analysis. All results will be available in the `/results` directory.
- GitHub Users: Please follow the detailed setup instructions provided below to run the analysis locally.
Some of the code requires loading large datasets into RAM. A minimum of 64 GB RAM is required (tested on Windows and Linux).
Note: Insufficient RAM may cause the program to crash.
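As a rough pre-flight check before launching the memory-heavy scripts, total physical RAM can be queried from Python. This is an illustrative sketch, not part of the repository; `os.sysconf` works on Linux and most Unix-like systems but not on Windows.

```python
import os

def total_ram_gib():
    """Return total physical RAM in GiB (Linux/Unix only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    n_pages = os.sysconf("SC_PHYS_PAGES")    # number of physical pages
    return page_size * n_pages / 1024**3

if __name__ == "__main__":
    ram = total_ram_gib()
    print(f"Total RAM: {ram:.1f} GiB")
    if ram < 64:
        print("Warning: less than 64 GiB of RAM; some scripts may crash.")
```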
This manuscript's code must be run within a Docker container to ensure a consistent environment. Because folders from your host machine will be mounted into the container, you will need to download and organize the code and data in specific directories. (If errors occur, you may also need to modify paths in the scripts, since the code has been adapted to Code Ocean's directory layout.)
- Choose a location on your local machine with at least 50 GB of available storage space. We'll refer to this location as `your_path`. Note that paths use backslashes (`\`) on Windows and forward slashes (`/`) on Linux and macOS.
- Create two sub-folders: `your_path\code` and `your_path\data`.
- Follow the instructions below to download the necessary code and data files.
Docker is required to reproduce the contents of this manuscript, as it ensures a consistent runtime environment and streamlines the setup process. The Docker image can be pulled from the global-lake-area-runner DockerHub repository.
For Docker Desktop/Engine installation, please refer to the official documentation:
To automatically download the Docker image, run the following command in your terminal (use Terminal for MacOS/Linux or PowerShell/Terminal for Windows):
docker pull luoqili/global-lake-area-runner:v1.0
- Download VS Code from their official website.
- Install the Remote Development extension in VS Code, which is required to run Docker containers as development environments.
- Launch VS Code.
- Click File > Open Folder.
- Navigate to and select `your_path\code\global_lake_area`, which contains all the project code.
- Verify that the `.devcontainer` folder appears in the left panel. If not, review the previous steps.
- Edit the `.devcontainer/devcontainer.json` file to configure the correct mounting paths. Modify the `"mounts"` parameter by replacing `your_path` with your actual path in these locations:
  - `{"source": "your_path\\code", "target": "/WORK/Codes", "type": "bind"}` (mounts the decompressed GitHub repository folder to `/WORK/Codes` in the container)
  - `{"source": "your_path\\data", "target": "/WORK/Data", "type": "bind"}` (mounts the data folder containing the required data files, possibly downloaded from Code Ocean, to `/WORK/Data` in the container)
- After installing the Remote Development extension, look for a small blue `><` icon in the lower-left corner of the VS Code window.
- Click this icon and select `Reopen in Container`.
- Wait briefly while the Docker image builds and opens. Once complete, the development environment will be ready to use.
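For reference, the relevant portion of a filled-in `devcontainer.json` might look like the following sketch, using a hypothetical Windows `your_path` of `D:\lake_project` (adjust the path and slash convention for your own system; backslashes must be doubled inside JSON strings):

```json
{
  "mounts": [
    {"source": "D:\\lake_project\\code", "target": "/WORK/Codes", "type": "bind"},
    {"source": "D:\\lake_project\\data", "target": "/WORK/Data", "type": "bind"}
  ]
}
```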
For quantitative figures and key numbers, after completing the steps above, follow these instructions (using Code Ocean instead is recommended):
- Locate the corresponding `.ipynb` file listed in the "Locations of quantitative results in the codes" section.
- Click `Run All`.
- When prompted to choose a Python kernel, select `Python Environments...`, then choose `Python 3.8.10 /usr/bin/python3`.
- The code should then execute successfully.
Note: Download the required data from our Code Ocean repository and place it in the appropriate folder according to your modified paths.
For the complete workflow, refer to the "Overall workflow" section and ensure all paths are correctly configured.
- Unable to open Docker container
  This issue is most likely due to incorrect path settings. Please ensure all paths are correctly formatted (on Windows, a correct path looks like `D:\folder1\folder2`).
- Code execution errors (e.g., `package not exist`, `cannot find file path`, etc.)
  These errors typically occur due to incorrect path mounting in `devcontainer.json`. Please verify that your file structure follows this pattern: `your_path\code\global_lake_area\batch_processing\...` and `your_path\data\global_lake_area\area_csvs`. In the `devcontainer.json` file, ensure that the `your_path\code` and `your_path\data` folders are properly specified. Also check all paths in the code itself.
- Kernel crashes and memory-related errors (containing keywords like "free", "mem", etc.)
  These errors occur when your system's RAM is insufficient for the script being run. Please use a high-performance computer with at least 64 GB of RAM installed. For Windows 11 users, these issues may be caused by WSL2-based Docker's default memory limit. In this case, please refer to the official documentation here and here for instructions on configuring the `.wslconfig` file to allocate at least 64 GB of RAM to Docker.
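A minimal `.wslconfig` fragment raising the WSL2 memory cap looks like the following (placed in your Windows user profile folder, e.g. `C:\Users\<you>\.wslconfig`; the swap value here is illustrative):

```ini
[wsl2]
memory=64GB
swap=16GB
```

Restart WSL (e.g., `wsl --shutdown`) for the new limits to take effect.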
The remaining scripts implement the algorithms described in this manuscript, which require several months of runtime, high-performance computing clusters, and terabytes of input data; they are therefore excluded from the reproduction process due to time and resource limitations. Brief descriptions of each file are provided below, with detailed usage instructions included within the files themselves.
- `global_lake_area/` (folder)
  - `unetgee.py`: Provides functions for GEE authentication, U-Net sample generation, training, validation, MODIS and GSW raster export, and U-Net prediction. Many functions in this file are deprecated due to manuscript iterations but are kept for reference.
  - `unet_train.py`: Serves as a command-line interface for U-Net training by calling the `unet_train` function in `unetgee.py`.
  - `UNET_TRAIN_CONFIG.py`: Contains configuration settings for a single U-Net training session.
  - `update_config_unet_train_run.py`: Facilitates batch U-Net training by updating `UNET_TRAIN_CONFIG.py` and executing `unet_train.py`.
  - `training_records.csv`: Stores metadata for U-Net models, including sample sizes and model metrics.
  - `update_training_record.py`: Updates the `training_records.csv` file.
  - `unet_samples_generate_per_basin.ipynb`: Jupyter notebook for generating U-Net training samples for each basin.
  - `unet_sample_size_count.ipynb`: Computes sample sizes for U-Net training, evaluation, and validation; updates the `training_records.csv` file.
  - `unet_evaluation.py`: Functions similarly to `unet_train.py`, calculating performance metrics for each U-Net model and updating the `training_records.csv` file.
  - `UNET_EVALUATION_CONFIG.py`: Contains configuration settings for a single U-Net evaluation session.
  - `unet_evaluation_update_config_and_run.py`: Similar to `update_config_unet_train_run.py`; handles batch performance metric calculation.
  - `selfee.py`: Implements service-account methods for automated authentication to resolve network issues.
  - `projection_wkt_generation.ipynb`: Creates customized Lambert Azimuthal Equal-Area (LAEA) projections for each basin in the BasinATLAS lev02 product.
  - `hydrolakes_filter_by_bas.ipynb`: Generates lake boundaries for U-Net sample generation (not used for final area calculation).
  - `gsw_export.ipynb`: Handles the export of GSW occurrence and recurrence data.
  - `gsw_occurrence_and_recurrence_mosaic.py`: Creates mosaics from tiled GSW occurrence and recurrence maps.
  - `export_modis_and_gsw_image.ipynb`: Exports MODIS and GSW images in LAEA projections with correct resolutions.
  - `draw_unet_train_history.py`: Generates training and validation curves for each U-Net model.
  - `add_final_decision_to_records.py`: Records the manually selected optimal epoch in the `training_records.csv` file.
- `global_lake_area/.devcontainer` (folder)
  Contains the `devcontainer.json` file, which defines the container-based runtime environment required to reproduce the results in this manuscript.
- `global_lake_area/my_unet_definition` (folder)
  - `__init__.py`: Defines this folder as a Python module.
  - `model.py`: Contains implementations of U-Net models (specifically, the `attentionunet` variant was used).
  - `evaluation_metrics.py`: Contains performance metrics and loss functions, including Intersection over Union (IoU).
- `global_lake_area/my_unet_gdal` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `reproject_to_target.py`: Deprecated.
  - `combined.py`: Deprecated.
  - `zonal_statistics.py`: Deprecated.
  - `reproject_to_target_tile.py`: Functions for clipping, reprojecting, and creating mosaics from large GeoTIFF files.
  - `generate_tfrecord_from_tile.py`: Functions for reprojecting, resampling, and converting GeoTIFF files to TFRecord format.
  - `align_to_target_tile.py`: Functions for geographically aligning and merging two raster datasets.
  - `unet_predictions.py`: Functions for processing converted TFRecords using trained U-Net models.
  - `reconstruct_tile_from_prediction.py`: Functions for converting serialized TFRecord files back to GeoTIFF tiles.
  - `area_calculation.py`: Functions for calculating areas from raster data using vector boundaries.
  - `quick_plotting.py`: Functions for generating PNG images and GIF animations from GeoTIFF files.
  - `quick_plotting_runner.py`: Command-line interface for `quick_plotting.py` that accepts LAEA coordinates as input.
- `global_lake_area/batch_processing` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `batch_tfrecord_generation.py`: Command-line interface for generating batches of MODIS-converted TFRecord files.
  - `batch_unet_prediction.py`: Command-line interface for batch predictions using U-Net.
  - `batch_prediction_reconstruction.py`: Command-line interface for reconstructing water mask maps in batches.
  - `batch_mosaic.py`: Creates mosaics from water mask tiles into a single large GeoTIFF file.
  - `batch_full.py`: Integrates multiple batch processing steps into a single command-line interface.
  - `asynchronous_batch.py`: Executes `batch_full.py` asynchronously to optimize computing resource utilization.
  - `BATCH_CONFIG.py`: Configuration settings for `batch_full.py` and `asynchronous_batch.py`.
  - `batch_area_calculation.py`: Command-line interface for calculating areas from mosaicked water mask maps in batches.
  - `AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (monthly lake surface extent results).
  - `MISSING_DATA_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (monthly cloud contamination ratio results).
  - `MASKED_MY_WATER_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (GSW-masked water mask map results).
  - `GSWR_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (GSW image results for validation).
  - `area_calculation_update_and_run.py`: Automatically updates configuration files and executes `batch_area_calculation.py`.
  - `load_config_module.py`: Provides functionality for reading configuration files in `.py` format.
- `global_lake_area/my_plotting` (folder)
  Contains scripts for visualizing the performance metrics of U-Net models trained in this study.
- `global_lake_area/my_spatial_analyze` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `area_postprocessing.py`: Functions for post-processing lake surface water extracted from U-Net-generated water mask maps.
  - `lake_wise_area_postprocessor.py`: Command-line interface for lake-wise postprocessing of lake surface extent time series.
  - `LAKE_WISE_AREA_POSTPROCESSING_CONFIG.py`: Configuration settings for `lake_wise_area_postprocessor.py`.
  - `lake_wise_area_postprocess_update_and_run.py`: Automatically updates config files and runs `lake_wise_area_postprocessor.py`.
  - `lake_wise_lse_analyze.py`: Functions for lake-wise plotting.
  - `lake_wise_plotting.ipynb`: Generates plots (deprecated).
  - `visualization.py`: Functions for grid-wise plotting.
  - `main_grid.py`: Exploratory grid-wise plotting (deprecated).
  - `lake_concatenator.py`: Combines lake-wise surface extent time series from each basin into one file covering the 1.4 million lakes globally.
  - `glake_update_hydrolakes.py`: Functions and command-line interface for updating HydroLAKES using GLAKES.
  - `hylak_buffering.py`: Removes duplicate lakes and creates buffer zones for GLAKES-updated HydroLAKES.
  - `gsw_image_mosaic.py`: Command-line interface for mosaicking tiled GSW images into one large GeoTIFF file for validation.
  - `grid_concatenator.py`: Deprecated.
  - `grid_analyze.py`: Functions for performing grid-level analysis.
  - `cloud_cover_ratio_calculater.py`: Command-line interface for calculating cloud cover ratios based on boundary size and monthly MODIS cloud contamination area.
  - `basin_lse_calculation.py`: Deprecated.
  - `attach_geometry_and_generate_grid.py`: Functions for creating grids from global (or regional) lakes and calculating corresponding statistics.
  - `area_to_volume.py`: Deprecated.
  - `area_to_level.py`: Deprecated.
  - `AREA_TO_LEVEL_CONFIG.py`: Deprecated.
  - `area_to_level_batch_converter.py`: Deprecated.
  - `./data_analyze` (folder)
    - `./basin_wise_analysis` (folder)
      - `basin_wise_analysis.py`: Functions for basin-wise analysis and plotting.
      - `basin_wise_plotting.ipynb`: Generates basin-wise figures (including reservoir contribution).
      - `basinatlas_statistics_calculator.py`: Command-line interface for calculating BasinATLAS statistics.
      - `hydrobasins_merger.py`: Merges multiple HydroBASINS shapefiles.
      - `hydrobasins_statistics_calculator.py`: Command-line interface for calculating HydroBASINS statistics.
    - `./climate_analysis` (folder)
      - `attach_aridity_index.py`: Adds the LakeATLAS aridity index to lake surface extent time series.
    - `./correlation_analysis` (folder)
      - `plotting.ipynb`: Generates plots of median relative changes in seasonality by lake size.
      - `correlation_plots.py`: Functions for plotting relationships between multiple variables.
    - `./extreme_analysis` (folder)
      - `area_extreme_analysis.py`: Functions for identifying seasonality-induced low-water extremes and other analyses.
      - `low_water_extreme_analysis.ipynb`: Adds extreme-related columns to lake surface extent time series.
      - `low_water_extreme_plotting.ipynb`: Generates plots of seasonality-induced low-water extremes and seasonality dominance.
    - `./grid_wise_analysis` (folder): Contains plots of seasonality changes.
    - `./permafrost_analysis` (folder): Adds permafrost type information to lake surface extent time series.
    - `./time_series_analysis` (folder): Generates plots of long-term trends.
  - `./data_validation` (folder): Contains validation using GSW estimates and altimetry-based water levels.
- `global_lake_area/projection_wkt` (folder)
  Contains Lambert Azimuthal Equal-Area (LAEA) projection definitions used in this manuscript.
- `global_lake_area/revision_codes` (folder)
  - `./accuracy_assessment` (folder)
    - `sample_generation.py`: Generates basin-wise samples for calculating user's and producer's accuracies and F1 scores.
    - `sample_generation_runner.ipynb`: Performs batch generation of samples.
    - `metric_calculation.py`: Contains utility functions for calculating user's and producer's accuracies and F1 scores.
    - `metric_calculation.ipynb`: Calculates user's and producer's accuracies and F1 scores.
    - `metric_plotting.ipynb`: Plots user's and producer's accuracies and F1 scores.
  - `./relative_to_total_area` (folder)
    - `relative_to_total_area_calculation.ipynb`: Calculates the ratio between total variation in lake surface extent and total lake area.
    - `relative_to_total_area_plotting.ipynb`: Plots the ratio between total variation in lake surface extent and total lake area (Extended Data Fig. 4).
  - `./population_density_analysis` (folder)
    - `population_density_analysis_calculation.ipynb`: Adds population density data from BasinATLAS level-06 basins to lake surface extent time series.
    - `population_density_analysis_plotting.ipynb`: Plots relationships between seasonality dominance, changes, and population density (Extended Data Fig. 5).
  - `./high_water_extreme_analysis` (folder)
    - `high_water_extreme_calculate.ipynb`: Detects seasonality-induced high-water extremes and analyzes their magnitude and relative importance.
    - `high_water_extreme_plotting.ipynb`: Plots the relative importance of seasonality-induced high-water extremes compared to 23-year changes and regular seasonality (Extended Data Fig. 6).
  - `./extreme_changes` (folder)
    - `extreme_changes_calculation.ipynb`: Calculates frequency changes of seasonality-induced high- and low-water extremes.
    - `extreme_changes_plotting.ipynb`: Plots frequency changes of seasonality-induced high- and low-water extremes (Extended Data Fig. 7).
  - `./comparison_with_gsw` (folder)
    - `comparison_with_gsw_calculation.ipynb`: Calculates missing areas and counts in the GSW monthly history product and our maps (uses multiple iterations to avoid RAM overflow).
    - `comparison_with_gsw_plotting.ipynb`: Plots missing areas and counts in the GSW monthly history product and our maps (Extended Data Fig. 10).
  - `./basin_seasonality_change` (folder)
    - `basin_seasonality_change_plotting.ipynb`: Plots median seasonality changes aggregated by BasinATLAS level-06 basins (Fig. 3c).
Please note that many details in this section are omitted. For complete information, please refer to the corresponding scripts and manuscript paragraphs.
- Export Training, Validation, and Test Samples
  Use `unet_samples_generate_per_basin.ipynb` to generate training, validation, and test samples for each basin (BasinATLAS level-02 basins). This step uses the Python API of Google Earth Engine (GEE) and basin-specific customized Lambert Azimuthal Equal-Area projections located in the directory `global_lake_area/projection_wkt/Lambert_Azimuthal_Equal_Area`. The samples are saved in TFRecord format to a Google Drive folder, which can be configured by modifying the `drive_folder` parameter. All preprocessing of MODIS images and the GSW monthly history product is handled by the sample generation function defined in `unetgee.py`.
- Train U-Net Models
  Use `update_config_unet_train_run.py` to update the `UNET_TRAIN_CONFIG.py` file and run `unet_train.py` sequentially for each basin. Preprocessing of training and validation samples is performed by the `unet_train` function in `unetgee.py`. The training status flag for each basin is automatically updated in the training record file `training_records.csv`.
- Select the Optimal Epoch for Each U-Net Model Manually
  Use `draw_unet_train_history.py` to plot the training and validation history of each U-Net model, then manually select the optimal epoch. Record the optimal epoch in the `training_records.csv` file using the `add_final_decision_to_records.py` script.
- Evaluate U-Net Models on Test Sets
  Use `unet_evaluation_update_config_and_run.py` to update the `UNET_EVALUATION_CONFIG.py` file and run `unet_evaluation.py` sequentially for each basin. The performance metrics (IoU and binary accuracy) are recorded in the `training_records.csv` file.
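For reference, the Intersection over Union (IoU) metric recorded in `training_records.csv` can be computed from a predicted and a reference binary water mask as in the following minimal sketch (an illustration only, not the implementation in `evaluation_metrics.py`):

```python
def binary_iou(pred, truth):
    """IoU between two binary masks given as flat sequences of 0/1 values."""
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, truth) if p == 1 or t == 1)
    return intersection / union if union else 1.0  # two empty masks agree perfectly

# Toy 2x2 masks, flattened: 1 overlapping water pixel out of 3 in the union
pred  = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
print(binary_iou(pred, truth))  # 0.333...
```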
- Export MODIS Monthly Median Composites
  Use `export_modis_and_gsw_image.ipynb` to export MODIS monthly median composites in Lambert Azimuthal Equal-Area (LAEA) projections. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 500 m. All preprocessing is handled automatically by the function.
- Export GSW Occurrence and Recurrence Maps
  Use `gsw_export.ipynb` to export Global Surface Water (GSW) occurrence and recurrence maps. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 30 m.
- Export GSW Monthly History Product
  Use `export_modis_and_gsw_image.ipynb` to export the GSW monthly history product in LAEA projections. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 30 m.
- Create Mosaics of GSW Products
  Use `gsw_occurrence_and_recurrence_mosaic.py` to create mosaics of GSW occurrence and recurrence maps. Use `gsw_image_mosaic.py` to create mosaics of the GSW monthly history product. Basic utility functions for these operations are defined in the `my_unet_gdal` directory.
Batch processing of MODIS images using trained U-Net models to generate raw lake surface extent maps
Note: This section is the most computationally intensive part of this manuscript. It requires high-performance computing clusters and multiple GPUs. This section uses all basic utility functions defined in the my_unet_gdal directory. A brief description of each step is provided below. Detailed usage and definitions of functions and workflows can be found in the corresponding scripts.
Two running modes are provided in this section: asynchronous batch processing (asynchronous_batch.py) and synchronous batch processing (batch_full.py). The former is recommended for high-performance computing clusters, while the latter is recommended for personal computers.
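Conceptually, the asynchronous mode launches several basin-level jobs concurrently and collects them as they finish. The generic pattern can be sketched with the standard library as below; this is an illustration only (the hypothetical `process_basin` stands in for one basin's full pipeline, and the real `asynchronous_batch.py` drives GPU-bound processes rather than threads):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_basin(basin_id):
    """Placeholder for one basin's full batch pipeline (hypothetical)."""
    return f"basin {basin_id} done"

def run_async(basin_ids, max_workers=4):
    """Run basin jobs concurrently, collecting results as they complete."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_basin, b): b for b in basin_ids}
        for fut in as_completed(futures):
            results.append(fut.result())
    return sorted(results)  # completion order is nondeterministic

if __name__ == "__main__":
    print(run_async(range(3)))
```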
- Convert MODIS images to TFRecord format
  This step reprojects and resamples MODIS monthly median composites, merges them with GSW occurrence and recurrence, and converts them into 128×128 kernels in TFRecord format that can be fed into U-Net models. Use `batch_tfrecord_generation.py` to generate TFRecord files for different months in one basin.
- Predict using U-Net models
  This step uses trained U-Net models to process the TFRecord files generated in the previous step. Use `batch_unet_prediction.py` to generate water mask maps.
- Reconstruct water mask maps
  This step converts serialized TFRecord files (the output of trained U-Net models with MODIS images as input) back to GeoTIFF tiles. Use `batch_prediction_reconstruction.py` to generate water mask maps.
- Mosaic water mask maps
  This step combines water mask tiles into one large GeoTIFF file. Use `batch_mosaic.py` to mosaic water mask maps.
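Conceptually, the kernel-generation step covers each raster with 128×128 windows. A simplified sketch of computing the tile origins (illustrative only, not the repository's implementation, which also handles reprojection and band merging):

```python
def tile_origins(height, width, kernel=128):
    """Upper-left (row, col) origins of kernel-sized tiles covering a raster.

    Assumes height and width are at least `kernel`. The last row/column of
    tiles is shifted back so no tile runs past the raster edge; edge tiles
    therefore overlap their neighbours instead of requiring padding.
    """
    ys = list(range(0, height - kernel, kernel)) + [height - kernel]
    xs = list(range(0, width - kernel, kernel)) + [width - kernel]
    return [(y, x) for y in ys for x in xs]

# A 300x200 raster is covered by a 3x2 grid of 128x128 tiles
print(tile_origins(300, 200))
# [(0, 0), (0, 72), (128, 0), (128, 72), (172, 0), (172, 72)]
```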
- Calculate Raw Monthly Lake Surface Extent for Each Lake
  Use `area_calculation_update_and_run.py` to update the `AREA_CALCULATION_CONFIG.py` file and run `batch_area_calculation.py` sequentially for different basins. For details of this calculation, please refer to the `batch_area_calculation.py` file and the `calculate_lake_area_grid_parallel` function defined in `./my_unet_gdal/area_calculation.py`. (A raster-based calculation approach is used.)
- Frozen Period Determination
  Use the Simstrat model provided in LakeEnsemblR to determine the frozen period for each lake.
- Cloud-Contaminated Area Calculation
  Use `area_calculation_update_and_run.py` to update the `MISSING_DATA_AREA_CALCULATION_CONFIG.py` file and run `batch_area_calculation.py` sequentially for different basins. This calculates the areas identified as clouds by the MODIS QA band.
- Cloud Contamination Ratio Calculation
  Use `./my_spatial_analyze/cloud_cover_ratio_calculator.py` to calculate the cloud contamination ratio for each lake. Basic utility functions are defined in the `./my_spatial_analyze/area_postprocessing.py` file.
- Process and Analyze Lake Data
  Use `./my_spatial_analyze/lake_wise_area_postprocess_update_and_run.py` to update the `LAKE_WISE_AREA_POSTPROCESSING_CONFIG.py` file and run `lake_wise_area_postprocessor.py` sequentially for different basins. This step filters out cloud-contaminated and frozen data points and calculates lake-wise statistics, including intra-annual standard deviations, for further analysis. Basic utility functions are defined in the `./my_spatial_analyze/area_postprocessing.py` and `./my_spatial_analyze/attach_geometry_and_generate_grid.py` files.
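The intra-annual standard deviation mentioned above is, for each calendar year, the standard deviation of that year's monthly extents. A minimal sketch using the standard library (illustrative only, not the code in `area_postprocessing.py`):

```python
from collections import defaultdict
from statistics import pstdev

def intra_annual_std(monthly):
    """monthly: dict mapping (year, month) -> lake surface extent (km^2).

    Returns a dict mapping year -> population standard deviation of that
    year's monthly extents. Months removed by the cloud/ice filtering are
    simply absent from the input; years with fewer than two valid months
    are skipped.
    """
    by_year = defaultdict(list)
    for (year, _month), area in monthly.items():
        by_year[year].append(area)
    return {y: pstdev(v) for y, v in by_year.items() if len(v) > 1}

# One toy year with four valid monthly extents
series = {(2001, m): a for m, a in enumerate([10, 12, 14, 12], start=1)}
print(intra_annual_std(series))  # {2001: 1.414...}
```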
- Attach Additional Properties from LakeATLAS to the Time Series DataFrame
  Use `./my_spatial_analyze/data_analyze/climate_analysis/attach_aridity_index.py` for the aridity index, `./my_spatial_analyze/data_analyze/permafrost_analysis/attach_permafrost_type.py` for permafrost type, and `./revision_codes/population_density_analysis/population_density_analysis_calculation.ipynb` for population density.
- Create Grid Cells
  Use `./my_spatial_analyze/grid_analyze.py` to create grid cells. Multiple modes and grid sizes are available (1.0 for coarse, 0.5 for medium, 0.25 for fine). All grid-wise figures are generated from lake-wise statistics using this approach.
- Create Fig. 1
  This illustrative figure is created manually in Adobe Illustrator.
- Create Fig. 2
  The lake-wise statistic `seasonality_dominance_percentage` is calculated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_analysis.ipynb` following the equations in the manuscript. The raw figure is generated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 3
  For subfigures a, b, and d, statistics are calculated in the "Filter out cloud-contaminated and frozen data points and calculate lake-wise statistics" step and aggregated to grid level by taking the median. For subfigure c, basin-wise statistics are calculated in `./my_spatial_analyze/data_analyze/basin_wise_analysis/basinatlas_statistics_calculator.py`. The raw figure is generated in `./my_spatial_analyze/data_analyze/grid_wise_analysis/grid_wise_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 4
  For subfigures a, b, and c, time series are calculated in `./my_spatial_analyze/data_analyze/time_series_analysis/time_series_plotting.ipynb` (results are saved for quick reproduction in the Code Ocean capsule). For subfigure d, basin-wise statistics are calculated in `./my_spatial_analyze/data_analyze/basin_wise_analysis/hydrobasins_statistics_calculator.py`. For subfigure e, summed linear trends of STL trend terms are calculated and aggregated to grid level in previous steps. The raw figure is generated in `./my_spatial_analyze/data_analyze/time_series_analysis/time_series_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 5
  For subfigures a, c, and d, seasonality-induced low-water extremes are detected in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_analysis.ipynb`. The relative importance of seasonality-induced low-water extremes compared to 23-year changes and regular seasonality is calculated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_plotting.ipynb`, where raw versions of subfigures a, c, and d are also generated. The final figure is polished in Adobe Illustrator, with subfigure b drawn manually.
- Create Extended Data Figs. 1 and 2
  For this comparison, monthly lake surface extents from the GSW monthly history product and from our maps masked by GSW's mask are calculated using `./batch_processing/batch_area_calculation.py` with `GSW_AREA_CALCULATION_CONFIG.py` and `MASKED_MY_WATER_AREA_CALCULATION_CONFIG.py`. The raw figure is generated in `./my_spatial_analyze/data_validation/plot_compare_with_gsw.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
- Create Extended Data Fig. 3
  For subfigures a and b, basin-wise binary classification accuracy and IoU of the U-Net models are calculated in the U-Net training, evaluation, and test section, and the raw figure is generated in `./my_plotting/plotting.ipynb`. For subfigures c, d, and e, sample generation and metric calculation are performed in the `./revision_codes/accuracy_assessment` directory, and the raw figure is generated in `./revision_codes/accuracy_assessment/metric_plotting.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
- Create Extended Data Fig. 4
  Statistics are calculated in `./revision_codes/relative_to_total_area/relative_to_total_area_calculation.ipynb`. The raw figure is generated in `./revision_codes/relative_to_total_area/relative_to_total_area_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 5
  Statistics are attached from BasinATLAS level-06 basins in `./revision_codes/population_density_analysis/population_density_analysis_calculation.ipynb`. The raw figure is generated in `./revision_codes/population_density_analysis/population_density_analysis_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 6
  Seasonality-induced high-water extremes and their magnitudes are calculated in `./revision_codes/high_water_extreme_analysis/high_water_extreme_calculate.ipynb`. The raw figure is generated in `./revision_codes/high_water_extreme_analysis/high_water_extreme_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 7
  Frequency changes of seasonality-induced high- and low-water extremes are calculated in `./revision_codes/extreme_changes/extreme_changes_calculation.ipynb`. The raw figure is generated in `./revision_codes/extreme_changes/extreme_changes_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 8
  Subfigure a is created using draw.io, and subfigure b is created using LaTeX code. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 9
  This figure is generated in `./my_spatial_analyze/data_validation/data_validation_nb.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 10
  Missing areas in the GSW monthly history product are calculated using `./batch_processing/batch_area_calculation.py` with `GSW_MISSING_DATA_AREA_CALCULATION_CONFIG.py`. Statistics are calculated in `./revision_codes/comparison_with_gsw/comparison_with_gsw_calculation.ipynb`. The raw figure is generated in `./revision_codes/comparison_with_gsw/comparison_with_gsw_plotting.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
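The grid-level aggregation used throughout the figures amounts to binning lake-wise statistics into regular longitude/latitude cells and taking the median per cell. A minimal sketch with lake centroids (illustrative only; the repository's `grid_analyze.py` and `attach_geometry_and_generate_grid.py` work with real geometries), where `cell_deg` corresponds to the 1.0/0.5/0.25-degree grid sizes described above:

```python
from collections import defaultdict
from statistics import median
import math

def grid_median(lakes, cell_deg=1.0):
    """lakes: iterable of (lon, lat, value) centroids.

    Returns a dict mapping (col, row) grid-cell indices to the median
    value of the lakes whose centroids fall inside that cell.
    """
    cells = defaultdict(list)
    for lon, lat, value in lakes:
        col = math.floor(lon / cell_deg)
        row = math.floor(lat / cell_deg)
        cells[(col, row)].append(value)
    return {cell: median(vals) for cell, vals in cells.items()}

# Three toy lakes: two share the 1-degree cell (10, 45), one falls in (11, 45)
lakes = [(10.2, 45.7, 1.0), (10.8, 45.1, 3.0), (11.5, 45.5, 5.0)]
print(grid_median(lakes, cell_deg=1.0))  # {(10, 45): 2.0, (11, 45): 5.0}
```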