This repository contains the complete codebase for producing datasets and reproducing results from the manuscript "Global dominance of seasonality in shaping lake surface extent dynamics". Due to the computationally intensive nature of the analysis and complex runtime environment requirements, we provide a Docker image for deployment on local high-performance computing systems.
Our code is available in two formats:
1. Code Ocean: Provides single-click reproducibility for generating the quantitative figures and key numbers from the manuscript. Quantitative figures are named correspondingly, and key numbers can be found in `/results/low_water_extreme_plotting.ipynb`. Note that Extended Data Figs. 1 and 2 are excluded due to quota limitations.
Estimated runtime on Code Ocean: ~15 minutes
2. GitHub: Contains the full version of the code and provides a Docker image that guarantees an identical development environment for reproducibility (the full version is intended for detailed examination).
Estimated time for full reproducibility: several months
Important Notes:
- Code Ocean Users: Simply click the `Reproducible Run` button to execute the analysis. All results will be available in the `/results` directory.
- GitHub Users: Please follow the detailed setup instructions provided below to run the analysis locally.
Some of the code requires loading large datasets into RAM. A minimum of 64 GB RAM is required (tested on Windows and Linux).
Note: Insufficient RAM may cause the program to crash.
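As a rough pre-flight check before launching the memory-heavy scripts, total physical RAM can be queried from Python. This is an illustrative sketch, not part of the repository; `os.sysconf` works on Linux and most Unix-like systems but not on Windows.

```python
import os

def total_ram_gib():
    """Return total physical RAM in GiB (Linux/Unix only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    n_pages = os.sysconf("SC_PHYS_PAGES")    # number of physical pages
    return page_size * n_pages / 1024**3

if __name__ == "__main__":
    ram = total_ram_gib()
    print(f"Total RAM: {ram:.1f} GiB")
    if ram < 64:
        print("Warning: less than 64 GiB of RAM; some scripts may crash.")
```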
This manuscript's code must be run within a Docker container to ensure a consistent environment. Because folders from your host machine will be mounted into the container, you will need to download and organize the code and data in specific directories. (If errors occur, you may also need to modify paths in the scripts, since the code has been adapted to Code Ocean's directory layout.)
- Choose a location on your local machine with at least 50 GB of available storage space. We'll refer to this location as `your_path`. Note that paths use backslashes (`\`) on Windows and forward slashes (`/`) on Linux and macOS.
- Create two sub-folders: `your_path\code` and `your_path\data`.
- Follow the instructions below to download the necessary code and data files.
Docker is required to reproduce the contents of this manuscript, as it ensures a consistent runtime environment and streamlines the setup process. The Docker image can be pulled from the global-lake-area-runner DockerHub repository.
For Docker Desktop/Engine installation, please refer to the official documentation:
To automatically download the Docker image, run the following command in your terminal (use Terminal for MacOS/Linux or PowerShell/Terminal for Windows):
docker pull luoqili/global-lake-area-runner:v1.0
- Download VS Code from their official website.
- Install the Remote Development extension in VS Code, which is required to run Docker containers as development environments.
- Launch VS Code.
- Click File > Open Folder.
- Navigate to and select `your_path\code\global_lake_area`, which contains all the project code.
- Verify that the `.devcontainer` folder appears in the left panel. If not, review the previous steps.
- Edit the `.devcontainer/devcontainer.json` file to configure the correct mounting paths. Modify the `"mounts"` parameter by replacing `your_path` with your actual path in these locations:
  - `{"source": "your_path\\code", "target": "/WORK/Codes", "type": "bind"}` (mounts the decompressed GitHub repository folder to `/WORK/Codes` in the container)
  - `{"source": "your_path\\data", "target": "/WORK/Data", "type": "bind"}` (mounts the data folder containing the required data files, possibly downloaded from Code Ocean, to `/WORK/Data` in the container)
- After installing the Remote Development extension, look for a small blue `><` icon in the lower-left corner of the VS Code window.
- Click this icon and select `Reopen in Container`.
- Wait briefly while the Docker image builds and opens. Once complete, the development environment will be ready to use.
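For reference, the relevant portion of a filled-in `devcontainer.json` might look like the following sketch, using a hypothetical Windows `your_path` of `D:\lake_project` (adjust the path and slash convention for your own system; backslashes must be doubled inside JSON strings):

```json
{
  "mounts": [
    {"source": "D:\\lake_project\\code", "target": "/WORK/Codes", "type": "bind"},
    {"source": "D:\\lake_project\\data", "target": "/WORK/Data", "type": "bind"}
  ]
}
```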
For quantitative figures and key numbers, after completing the steps above, follow these instructions (using Code Ocean instead is recommended):
- Locate the corresponding `.ipynb` file listed in the "Locations of quantitative results in the codes" section.
- Click `Run All`.
- When prompted to choose a Python kernel, select `Python Environments...`, then choose `Python 3.8.10 /usr/bin/python3`.
- The code should then execute successfully.
Note: Download the required data from our Code Ocean repository and place it in the appropriate folder according to your modified paths.
For the complete workflow, refer to the "Overall workflow" section and ensure all paths are correctly configured.
- Unable to open Docker container
  This issue is most likely due to incorrect path settings. Please ensure all paths are correctly formatted (on Windows, a correct path looks like `D:\folder1\folder2`).
- Code execution errors (e.g., `package not exist`, `cannot find file path`, etc.)
  These errors typically occur due to incorrect path mounting in `devcontainer.json`. Please verify that your file structure follows this pattern: `your_path\code\global_lake_area\batch_processing\...` and `your_path\data\global_lake_area\area_csvs`. In the `devcontainer.json` file, ensure that the `your_path\code` and `your_path\data` folders are properly specified. Also check all paths in the code itself.
- Kernel crashes and memory-related errors (containing keywords like "free", "mem", etc.)
  These errors occur when your system's RAM is insufficient for the script being run. Please use a high-performance computer with at least 64 GB of RAM installed. For Windows 11 users, these issues may be caused by WSL2-based Docker's default memory limit. In this case, please refer to the official documentation here and here for instructions on configuring the `.wslconfig` file to allocate at least 64 GB of RAM to Docker.
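A minimal `.wslconfig` fragment raising the WSL2 memory cap looks like the following (placed in your Windows user profile folder, e.g. `C:\Users\<you>\.wslconfig`; the swap value here is illustrative):

```ini
[wsl2]
memory=64GB
swap=16GB
```

Restart WSL (e.g., `wsl --shutdown`) for the new limits to take effect.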
The remaining scripts implement the algorithms described in this manuscript, which require several months of runtime, high-performance computing clusters, and terabytes of input data; they are therefore excluded from the reproduction process due to time and resource limitations. Brief descriptions of each file are provided below, with detailed usage instructions included within the files themselves.
- `global_lake_area/` (folder)
  - `unetgee.py`: Provides functions for GEE authentication, U-Net sample generation, training, validation, MODIS and GSW raster export, and U-Net prediction. Many functions in this file are deprecated due to manuscript iterations but are kept for reference.
  - `unet_train.py`: Serves as a command-line interface for U-Net training by calling the `unet_train` function in `unetgee.py`.
  - `UNET_TRAIN_CONFIG.py`: Contains configuration settings for a single U-Net training session.
  - `update_config_unet_train_run.py`: Facilitates batch U-Net training by updating `UNET_TRAIN_CONFIG.py` and executing `unet_train.py`.
  - `training_records.csv`: Stores metadata for U-Net models, including sample sizes and model metrics.
  - `update_training_record.py`: Updates the `training_records.csv` file.
  - `unet_samples_generate_per_basin.ipynb`: Jupyter notebook for generating U-Net training samples for each basin.
  - `unet_sample_size_count.ipynb`: Computes sample sizes for U-Net training, evaluation, and validation; updates the `training_records.csv` file.
  - `unet_evaluation.py`: Functions similarly to `unet_train.py`, calculating performance metrics for each U-Net model and updating the `training_records.csv` file.
  - `UNET_EVALUATION_CONFIG.py`: Contains configuration settings for a single U-Net evaluation session.
  - `unet_evaluation_update_config_and_run.py`: Similar to `update_config_unet_train_run.py`; handles batch performance metric calculation.
  - `selfee.py`: Implements service-account methods for automated authentication to resolve network issues.
  - `projection_wkt_generation.ipynb`: Creates customized Lambert Azimuthal Equal-Area (LAEA) projections for each basin in the BasinATLAS lev02 product.
  - `hydrolakes_filter_by_bas.ipynb`: Generates lake boundaries for U-Net sample generation (not used for final area calculation).
  - `gsw_export.ipynb`: Handles the export of GSW occurrence and recurrence data.
  - `gsw_occurrence_and_recurrence_mosaic.py`: Creates mosaics from tiled GSW occurrence and recurrence maps.
  - `export_modis_and_gsw_image.ipynb`: Exports MODIS and GSW images in LAEA projections with correct resolutions.
  - `draw_unet_train_history.py`: Generates training and validation curves for each U-Net model.
  - `add_final_decision_to_records.py`: Records the manually selected optimal epoch in the `training_records.csv` file.
- `global_lake_area/.devcontainer` (folder)
  Contains the `devcontainer.json` file, which defines the container-based runtime environment required to reproduce the results in this manuscript.
- `global_lake_area/my_unet_definition` (folder)
  - `__init__.py`: Defines this folder as a Python module.
  - `model.py`: Contains implementations of U-Net models (specifically, the `attentionunet` variant was used).
  - `evaluation_metrics.py`: Contains performance metrics and loss functions, including Intersection over Union (IoU).
- `global_lake_area/my_unet_gdal` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `reproject_to_target.py`: Deprecated.
  - `combined.py`: Deprecated.
  - `zonal_statistics.py`: Deprecated.
  - `reproject_to_target_tile.py`: Functions for clipping, reprojecting, and creating mosaics from large GeoTIFF files.
  - `generate_tfrecord_from_tile.py`: Functions for reprojecting, resampling, and converting GeoTIFF files to TFRecord format.
  - `align_to_target_tile.py`: Functions for geographically aligning and merging two raster datasets.
  - `unet_predictions.py`: Functions for processing converted TFRecords using trained U-Net models.
  - `reconstruct_tile_from_prediction.py`: Functions for converting serialized TFRecord files back to GeoTIFF tiles.
  - `area_calculation.py`: Functions for calculating areas from raster data using vector boundaries.
  - `quick_plotting.py`: Functions for generating PNG images and GIF animations from GeoTIFF files.
  - `quick_plotting_runner.py`: Command-line interface for `quick_plotting.py` that accepts LAEA coordinates as input.
- `global_lake_area/batch_processing` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `batch_tfrecord_generation.py`: Command-line interface for generating batches of MODIS-converted TFRecord files.
  - `batch_unet_prediction.py`: Command-line interface for batch predictions using U-Net.
  - `batch_prediction_reconstruction.py`: Command-line interface for reconstructing water mask maps in batches.
  - `batch_mosaic.py`: Creates mosaics from water mask tiles into a single large GeoTIFF file.
  - `batch_full.py`: Integrates multiple batch processing steps into a single command-line interface.
  - `asynchronous_batch.py`: Executes `batch_full.py` asynchronously to optimize computing resource utilization.
  - `BATCH_CONFIG.py`: Configuration settings for `batch_full.py` and `asynchronous_batch.py`.
  - `batch_area_calculation.py`: Command-line interface for calculating areas from mosaicked water mask maps in batches.
  - `AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (monthly lake surface extent results).
  - `MISSING_DATA_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (monthly cloud contamination ratio results).
  - `MASKED_MY_WATER_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (GSW-masked water mask map results).
  - `GSWR_AREA_CALCULATION_CONFIG.py`: Configuration settings for `batch_area_calculation.py` (GSW image results for validation).
  - `area_calculation_update_and_run.py`: Automatically updates configuration files and executes `batch_area_calculation.py`.
  - `load_config_module.py`: Provides functionality for reading configuration files in `.py` format.
- `global_lake_area/my_plotting` (folder)
  Contains scripts for visualizing the performance metrics of U-Net models trained in this study.
- `global_lake_area/my_spatial_analyze` (folder)
  - `__init__.py`: Initializes this folder as a Python module.
  - `area_postprocessing.py`: Functions for post-processing lake surface water extracted from U-Net-generated water mask maps.
  - `lake_wise_area_postprocessor.py`: Command-line interface for lake-wise postprocessing of lake surface extent time series.
  - `LAKE_WISE_AREA_POSTPROCESSING_CONFIG.py`: Configuration settings for `lake_wise_area_postprocessor.py`.
  - `lake_wise_area_postprocess_update_and_run.py`: Automatically updates config files and runs `lake_wise_area_postprocessor.py`.
  - `lake_wise_lse_analyze.py`: Functions for lake-wise plotting.
  - `lake_wise_plotting.ipynb`: Generates plots (deprecated).
  - `visualization.py`: Functions for grid-wise plotting.
  - `main_grid.py`: Exploratory grid-wise plotting (deprecated).
  - `lake_concatenator.py`: Combines lake-wise surface extent time series from each basin into one file covering the 1.4 million lakes globally.
  - `glake_update_hydrolakes.py`: Functions and command-line interface for updating HydroLAKES using GLAKES.
  - `hylak_buffering.py`: Removes duplicate lakes and creates buffer zones for GLAKES-updated HydroLAKES.
  - `gsw_image_mosaic.py`: Command-line interface for mosaicking tiled GSW images into one large GeoTIFF file for validation.
  - `grid_concatenator.py`: Deprecated.
  - `grid_analyze.py`: Functions for performing grid-level analysis.
  - `cloud_cover_ratio_calculater.py`: Command-line interface for calculating cloud cover ratios based on boundary size and monthly MODIS cloud contamination area.
  - `basin_lse_calculation.py`: Deprecated.
  - `attach_geometry_and_generate_grid.py`: Functions for creating grids from global (or regional) lakes and calculating corresponding statistics.
  - `area_to_volume.py`: Deprecated.
  - `area_to_level.py`: Deprecated.
  - `AREA_TO_LEVEL_CONFIG.py`: Deprecated.
  - `area_to_level_batch_converter.py`: Deprecated.
  - `./data_analyze` (folder)
    - `./basin_wise_analysis` (folder)
      - `basin_wise_analysis.py`: Functions for basin-wise analysis and plotting.
      - `basin_wise_plotting.ipynb`: Generates basin-wise figures (including reservoir contribution).
      - `basinatlas_statistics_calculator.py`: Command-line interface for calculating BasinATLAS statistics.
      - `hydrobasins_merger.py`: Merges multiple HydroBASINS shapefiles.
      - `hydrobasins_statistics_calculator.py`: Command-line interface for calculating HydroBASINS statistics.
    - `./climate_analysis` (folder)
      - `attach_aridity_index.py`: Adds the LakeATLAS aridity index to lake surface extent time series.
    - `./correlation_analysis` (folder)
      - `plotting.ipynb`: Generates plots of median relative changes in seasonality by lake size.
      - `correlation_plots.py`: Functions for plotting relationships between multiple variables.
    - `./extreme_analysis` (folder)
      - `area_extreme_analysis.py`: Functions for identifying seasonality-induced low-water extremes and other analyses.
      - `low_water_extreme_analysis.ipynb`: Adds extreme-related columns to lake surface extent time series.
      - `low_water_extreme_plotting.ipynb`: Generates plots of seasonality-induced low-water extremes and seasonality dominance.
    - `./grid_wise_analysis` (folder): Contains plots of seasonality changes.
    - `./permafrost_analysis` (folder): Adds permafrost type information to lake surface extent time series.
    - `./time_series_analysis` (folder): Generates plots of long-term trends.
  - `./data_validation` (folder): Contains validation using GSW estimates and altimetry-based water levels.
- `global_lake_area/projection_wkt` (folder)
  Contains Lambert Azimuthal Equal-Area (LAEA) projection definitions used in this manuscript.
- `global_lake_area/revision_codes` (folder)
  - `./accuracy_assessment` (folder)
    - `sample_generation.py`: Generates basin-wise samples for calculating user's and producer's accuracies and F1 scores.
    - `sample_generation_runner.ipynb`: Performs batch generation of samples.
    - `metric_calculation.py`: Contains utility functions for calculating user's and producer's accuracies and F1 scores.
    - `metric_calculation.ipynb`: Calculates user's and producer's accuracies and F1 scores.
    - `metric_plotting.ipynb`: Plots user's and producer's accuracies and F1 scores.
  - `./relative_to_total_area` (folder)
    - `relative_to_total_area_calculation.ipynb`: Calculates the ratio between total variation in lake surface extent and total lake area.
    - `relative_to_total_area_plotting.ipynb`: Plots the ratio between total variation in lake surface extent and total lake area (Extended Data Fig. 4).
  - `./population_density_analysis` (folder)
    - `population_density_analysis_calculation.ipynb`: Adds population density data from BasinATLAS level-06 basins to lake surface extent time series.
    - `population_density_analysis_plotting.ipynb`: Plots relationships between seasonality dominance, changes, and population density (Extended Data Fig. 5).
  - `./high_water_extreme_analysis` (folder)
    - `high_water_extreme_calculate.ipynb`: Detects seasonality-induced high-water extremes and analyzes their magnitude and relative importance.
    - `high_water_extreme_plotting.ipynb`: Plots the relative importance of seasonality-induced high-water extremes compared to 23-year changes and regular seasonality (Extended Data Fig. 6).
  - `./extreme_changes` (folder)
    - `extreme_changes_calculation.ipynb`: Calculates frequency changes of seasonality-induced high- and low-water extremes.
    - `extreme_changes_plotting.ipynb`: Plots frequency changes of seasonality-induced high- and low-water extremes (Extended Data Fig. 7).
  - `./comparison_with_gsw` (folder)
    - `comparison_with_gsw_calculation.ipynb`: Calculates missing areas and counts in the GSW monthly history product and our maps (uses multiple iterations to avoid RAM overflow).
    - `comparison_with_gsw_plotting.ipynb`: Plots missing areas and counts in the GSW monthly history product and our maps (Extended Data Fig. 10).
  - `./basin_seasonality_change` (folder)
    - `basin_seasonality_change_plotting.ipynb`: Plots median seasonality changes aggregated by BasinATLAS level-06 basins (Fig. 3c).
Please note that many details in this section are omitted. For complete information, please refer to the corresponding scripts and manuscript paragraphs.
- Export Training, Validation, and Test Samples
  Use `unet_samples_generate_per_basin.ipynb` to generate training, validation, and test samples for each basin (BasinATLAS level-02 basins). This step uses the Python API of Google Earth Engine (GEE) and basin-specific customized Lambert Azimuthal Equal-Area projections located in the directory `global_lake_area/projection_wkt/Lambert_Azimuthal_Equal_Area`. The samples are saved in TFRecord format to a Google Drive folder, which can be configured by modifying the `drive_folder` parameter. All preprocessing of MODIS images and the GSW monthly history product is handled by the sample generation function defined in `unetgee.py`.
- Train U-Net Models
  Use `update_config_unet_train_run.py` to update the `UNET_TRAIN_CONFIG.py` file and run `unet_train.py` sequentially for each basin. Preprocessing of training and validation samples is performed by the `unet_train` function in `unetgee.py`. The training status flag for each basin is automatically updated in the training record file `training_records.csv`.
- Select the Optimal Epoch for Each U-Net Model Manually
  Use `draw_unet_train_history.py` to plot the training and validation history of each U-Net model, then manually select the optimal epoch. Record the optimal epoch in the `training_records.csv` file using the `add_final_decision_to_records.py` script.
- Evaluate U-Net Models on Test Sets
  Use `unet_evaluation_update_config_and_run.py` to update the `UNET_EVALUATION_CONFIG.py` file and run `unet_evaluation.py` sequentially for each basin. The performance metrics (IoU and binary accuracy) are recorded in the `training_records.csv` file.
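For reference, the Intersection over Union (IoU) metric recorded in `training_records.csv` can be computed from a predicted and a reference binary water mask as in the following minimal sketch (an illustration only, not the implementation in `evaluation_metrics.py`):

```python
def binary_iou(pred, truth):
    """IoU between two binary masks given as flat sequences of 0/1 values."""
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, truth) if p == 1 or t == 1)
    return intersection / union if union else 1.0  # two empty masks agree perfectly

# Toy 2x2 masks, flattened: 1 overlapping water pixel out of 3 in the union
pred  = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
print(binary_iou(pred, truth))  # 0.333...
```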
- Export MODIS Monthly Median Composites
  Use `export_modis_and_gsw_image.ipynb` to export MODIS monthly median composites in Lambert Azimuthal Equal-Area (LAEA) projections. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 500 m. All preprocessing is handled automatically by the function.
- Export GSW Occurrence and Recurrence Maps
  Use `gsw_export.ipynb` to export Global Surface Water (GSW) occurrence and recurrence maps. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 30 m.
- Export GSW Monthly History Product
  Use `export_modis_and_gsw_image.ipynb` to export the GSW monthly history product in LAEA projections. Related functions are defined in `unetgee.py`. The exported images are saved to a Google Drive folder, which can be configured by modifying the corresponding parameter. The spatial resolution of these exports is 30 m.
- Create Mosaics of GSW Products
  Use `gsw_occurrence_and_recurrence_mosaic.py` to create mosaics of GSW occurrence and recurrence maps. Use `gsw_image_mosaic.py` to create mosaics of the GSW monthly history product. Basic utility functions for these operations are defined in the `my_unet_gdal` directory.
Batch processing of MODIS images using trained U-Net models to generate raw lake surface extent maps
Note: This section is the most computationally intensive part of this manuscript. It requires high-performance computing clusters and multiple GPUs. This section uses all basic utility functions defined in the my_unet_gdal directory. A brief description of each step is provided below. Detailed usage and definitions of functions and workflows can be found in the corresponding scripts.
Two running modes are provided in this section: asynchronous batch processing (asynchronous_batch.py) and synchronous batch processing (batch_full.py). The former is recommended for high-performance computing clusters, while the latter is recommended for personal computers.
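Conceptually, the asynchronous mode launches several basin-level jobs concurrently and collects them as they finish. The generic pattern can be sketched with the standard library as below; this is an illustration only (the hypothetical `process_basin` stands in for one basin's full pipeline, and the real `asynchronous_batch.py` drives GPU-bound processes rather than threads):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_basin(basin_id):
    """Placeholder for one basin's full batch pipeline (hypothetical)."""
    return f"basin {basin_id} done"

def run_async(basin_ids, max_workers=4):
    """Run basin jobs concurrently, collecting results as they complete."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(process_basin, b): b for b in basin_ids}
        for fut in as_completed(futures):
            results.append(fut.result())
    return sorted(results)  # completion order is nondeterministic

if __name__ == "__main__":
    print(run_async(range(3)))
```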
- Convert MODIS images to TFRecord format
  This step reprojects and resamples MODIS monthly median composites, merges them with GSW occurrence and recurrence, and converts them into 128×128 kernels in TFRecord format that can be fed into U-Net models. Use `batch_tfrecord_generation.py` to generate TFRecord files for different months in one basin.
- Predict using U-Net models
  This step uses trained U-Net models to process the TFRecord files generated in the previous step. Use `batch_unet_prediction.py` to generate water mask maps.
- Reconstruct water mask maps
  This step converts serialized TFRecord files (the output of trained U-Net models with MODIS images as input) back to GeoTIFF tiles. Use `batch_prediction_reconstruction.py` to generate water mask maps.
- Mosaic water mask maps
  This step combines water mask tiles into one large GeoTIFF file. Use `batch_mosaic.py` to mosaic water mask maps.
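Conceptually, the kernel-generation step covers each raster with 128×128 windows. A simplified sketch of computing the tile origins (illustrative only, not the repository's implementation, which also handles reprojection and band merging):

```python
def tile_origins(height, width, kernel=128):
    """Upper-left (row, col) origins of kernel-sized tiles covering a raster.

    Assumes height and width are at least `kernel`. The last row/column of
    tiles is shifted back so no tile runs past the raster edge; edge tiles
    therefore overlap their neighbours instead of requiring padding.
    """
    ys = list(range(0, height - kernel, kernel)) + [height - kernel]
    xs = list(range(0, width - kernel, kernel)) + [width - kernel]
    return [(y, x) for y in ys for x in xs]

# A 300x200 raster is covered by a 3x2 grid of 128x128 tiles
print(tile_origins(300, 200))
# [(0, 0), (0, 72), (128, 0), (128, 72), (172, 0), (172, 72)]
```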
- Calculate Raw Monthly Lake Surface Extent for Each Lake
  Use `area_calculation_update_and_run.py` to update the `AREA_CALCULATION_CONFIG.py` file and run `batch_area_calculation.py` sequentially for different basins. For details of this calculation, please refer to the `batch_area_calculation.py` file and the `calculate_lake_area_grid_parallel` function defined in `./my_unet_gdal/area_calculation.py`. (A raster-based calculation approach is used.)
- Frozen Period Determination
  Use the Simstrat model provided in LakeEnsemblR to determine the frozen period for each lake.
- Cloud-Contaminated Area Calculation
  Use `area_calculation_update_and_run.py` to update the `MISSING_DATA_AREA_CALCULATION_CONFIG.py` file and run `batch_area_calculation.py` sequentially for different basins. This calculates the areas identified as clouds by the MODIS QA band.
- Cloud Contamination Ratio Calculation
  Use `./my_spatial_analyze/cloud_cover_ratio_calculator.py` to calculate the cloud contamination ratio for each lake. Basic utility functions are defined in the `./my_spatial_analyze/area_postprocessing.py` file.
- Process and Analyze Lake Data
  Use `./my_spatial_analyze/lake_wise_area_postprocess_update_and_run.py` to update the `LAKE_WISE_AREA_POSTPROCESSING_CONFIG.py` file and run `lake_wise_area_postprocessor.py` sequentially for different basins. This step filters out cloud-contaminated and frozen data points and calculates lake-wise statistics, including intra-annual standard deviations, for further analysis. Basic utility functions are defined in the `./my_spatial_analyze/area_postprocessing.py` and `./my_spatial_analyze/attach_geometry_and_generate_grid.py` files.
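The intra-annual standard deviation mentioned above is, for each calendar year, the standard deviation of that year's monthly extents. A minimal sketch using the standard library (illustrative only, not the code in `area_postprocessing.py`):

```python
from collections import defaultdict
from statistics import pstdev

def intra_annual_std(monthly):
    """monthly: dict mapping (year, month) -> lake surface extent (km^2).

    Returns a dict mapping year -> population standard deviation of that
    year's monthly extents. Months removed by the cloud/ice filtering are
    simply absent from the input; years with fewer than two valid months
    are skipped.
    """
    by_year = defaultdict(list)
    for (year, _month), area in monthly.items():
        by_year[year].append(area)
    return {y: pstdev(v) for y, v in by_year.items() if len(v) > 1}

# One toy year with four valid monthly extents
series = {(2001, m): a for m, a in enumerate([10, 12, 14, 12], start=1)}
print(intra_annual_std(series))  # {2001: 1.414...}
```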
- Attach Additional Properties from LakeATLAS to the Time Series DataFrame
  Use `./my_spatial_analyze/data_analyze/climate_analysis/attach_aridity_index.py` for the aridity index, `./my_spatial_analyze/data_analyze/permafrost_analysis/attach_permafrost_type.py` for permafrost type, and `./revision_codes/population_density_analysis/population_density_analysis_calculation.ipynb` for population density.
- Create Grid Cells
  Use `./my_spatial_analyze/grid_analyze.py` to create grid cells. Multiple modes and grid sizes are available (1.0 for coarse, 0.5 for medium, 0.25 for fine). All grid-wise figures are generated from lake-wise statistics using this approach.
- Create Fig. 1
  This illustrative figure is created manually in Adobe Illustrator.
- Create Fig. 2
  The lake-wise statistic `seasonality_dominance_percentage` is calculated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_analysis.ipynb` following the equations in the manuscript. The raw figure is generated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 3
  For subfigures a, b, and d, statistics are calculated in the "Filter out cloud-contaminated and frozen data points and calculate lake-wise statistics" step and aggregated to grid level by taking the median. For subfigure c, basin-wise statistics are calculated in `./my_spatial_analyze/data_analyze/basin_wise_analysis/basinatlas_statistics_calculator.py`. The raw figure is generated in `./my_spatial_analyze/data_analyze/grid_wise_analysis/grid_wise_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 4
  For subfigures a, b, and c, time series are calculated in `./my_spatial_analyze/data_analyze/time_series_analysis/time_series_plotting.ipynb` (results are saved for quick reproduction in the Code Ocean capsule). For subfigure d, basin-wise statistics are calculated in `./my_spatial_analyze/data_analyze/basin_wise_analysis/hydrobasins_statistics_calculator.py`. For subfigure e, summed linear trends of STL trend terms are calculated and aggregated to grid level in previous steps. The raw figure is generated in `./my_spatial_analyze/data_analyze/time_series_analysis/time_series_plotting.ipynb`. The final figure is polished in Adobe Illustrator by adjusting the layout, fonts, and annotations.
- Create Fig. 5
  For subfigures a, c, and d, seasonality-induced low-water extremes are detected in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_analysis.ipynb`. The relative importance of seasonality-induced low-water extremes compared to 23-year changes and regular seasonality is calculated in `./my_spatial_analyze/data_analyze/extreme_analysis/low_water_extreme_plotting.ipynb`, where raw versions of subfigures a, c, and d are also generated. The final figure is polished in Adobe Illustrator, with subfigure b drawn manually.
- Create Extended Data Figs. 1 and 2
  For this comparison, monthly lake surface extents from the GSW monthly history product and from our maps masked by GSW's mask are calculated using `./batch_processing/batch_area_calculation.py` with `GSW_AREA_CALCULATION_CONFIG.py` and `MASKED_MY_WATER_AREA_CALCULATION_CONFIG.py`. The raw figure is generated in `./my_spatial_analyze/data_validation/plot_compare_with_gsw.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
- Create Extended Data Fig. 3
  For subfigures a and b, basin-wise binary classification accuracy and IoU of the U-Net models are calculated in the U-Net training, evaluation, and test section, and the raw figure is generated in `./my_plotting/plotting.ipynb`. For subfigures c, d, and e, sample generation and metric calculation are performed in the `./revision_codes/accuracy_assessment` directory, and the raw figure is generated in `./revision_codes/accuracy_assessment/metric_plotting.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
- Create Extended Data Fig. 4
  Statistics are calculated in `./revision_codes/relative_to_total_area/relative_to_total_area_calculation.ipynb`. The raw figure is generated in `./revision_codes/relative_to_total_area/relative_to_total_area_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 5
  Statistics are attached from BasinATLAS level-06 basins in `./revision_codes/population_density_analysis/population_density_analysis_calculation.ipynb`. The raw figure is generated in `./revision_codes/population_density_analysis/population_density_analysis_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 6
  Seasonality-induced high-water extremes and their magnitudes are calculated in `./revision_codes/high_water_extreme_analysis/high_water_extreme_calculate.ipynb`. The raw figure is generated in `./revision_codes/high_water_extreme_analysis/high_water_extreme_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 7
  Frequency changes of seasonality-induced high- and low-water extremes are calculated in `./revision_codes/extreme_changes/extreme_changes_calculation.ipynb`. The raw figure is generated in `./revision_codes/extreme_changes/extreme_changes_plotting.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 8
  Subfigure a is created using draw.io, and subfigure b is created using LaTeX code. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 9
  This figure is generated in `./my_spatial_analyze/data_validation/data_validation_nb.ipynb`. The final figure is polished in Adobe Illustrator.
- Create Extended Data Fig. 10
  Missing areas in the GSW monthly history product are calculated using `./batch_processing/batch_area_calculation.py` with `GSW_MISSING_DATA_AREA_CALCULATION_CONFIG.py`. Statistics are calculated in `./revision_codes/comparison_with_gsw/comparison_with_gsw_calculation.ipynb`. The raw figure is generated in `./revision_codes/comparison_with_gsw/comparison_with_gsw_plotting.ipynb`. The final figure is polished and assembled in Adobe Illustrator.
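The grid-level aggregation used throughout the figures amounts to binning lake-wise statistics into regular longitude/latitude cells and taking the median per cell. A minimal sketch with lake centroids (illustrative only; the repository's `grid_analyze.py` and `attach_geometry_and_generate_grid.py` work with real geometries), where `cell_deg` corresponds to the 1.0/0.5/0.25-degree grid sizes described above:

```python
from collections import defaultdict
from statistics import median
import math

def grid_median(lakes, cell_deg=1.0):
    """lakes: iterable of (lon, lat, value) centroids.

    Returns a dict mapping (col, row) grid-cell indices to the median
    value of the lakes whose centroids fall inside that cell.
    """
    cells = defaultdict(list)
    for lon, lat, value in lakes:
        col = math.floor(lon / cell_deg)
        row = math.floor(lat / cell_deg)
        cells[(col, row)].append(value)
    return {cell: median(vals) for cell, vals in cells.items()}

# Three toy lakes: two share the 1-degree cell (10, 45), one falls in (11, 45)
lakes = [(10.2, 45.7, 1.0), (10.8, 45.1, 3.0), (11.5, 45.5, 5.0)]
print(grid_median(lakes, cell_deg=1.0))  # {(10, 45): 2.0, (11, 45): 5.0}
```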