Skip to content

Conversation

@chengzhuzhang
Copy link
Collaborator

@chengzhuzhang chengzhuzhang commented Apr 2, 2025

Summary

Objectives: performance improvement by:

  • simplify plotting and save to pdf function
  • load data once for all regions.

When analyses the 30 year log that including all land variables: /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250331_try2/v3.LR.historical_0051/post/scripts/global_time_series_1985-2014.o719959

I tried to understand and analyze the zppy-interface results and code, and I found that there are 2 hours spent each time before processing a new region (n,s,glb). This PR attempted to reduce time by loading all regions at the same time. The total time for zi-interface to produce land vars can reduce about 50% (more than 6 hours -> 3hours)

@forsyth2 please review and see if this is acceptable. I did a full tests on Chrysalis, but not including the original plots, following the below commandline:

zi-global-time-series --use_ocn False --input /lcrc/group/e3sm2/ac.wlin/E3SMv3/v3.LR.historical_0051 --input_subdir archive/ocn/hist --moc_file mocTimeSeries_1985-2014.nc --case_dir /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250331_try2/v3.LR.historical_0051 --experiment_name v3.LR.historical_0051 --figstr v3.LR.historical_0051 --color Blue --ts_num_years 30 --plots_original None --plots_atm TREFHT --plots_ice None --plots_lnd all --plots_ocn None --nrows 4 --ncols 2 --results_dir ./zi --regions glb,n,s --make_viewer True --start_yr 1985 --end_yr 2014

There should be more room for improvement of the code and performance.

Issue resolution:

  • Closes #<ISSUE_NUMBER_HERE>

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • a new feature: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

Please fill out either the "Small Change" or "Big Change" section (the latter includes the numbered subsections), and delete the other.

Small Change

  • To merge, I will use "Squash and merge". That is, this change should be a single commit.
  • Logic: I have visually inspected the entire pull request myself.
  • Pre-commit checks: All the pre-commits checks have passed.

Big Change

  • To merge, I will use "Create a merge commit". That is, this change is large enough to require multiple units of work (i.e., it should be multiple commits).

1. Does this do what we want it to do?

Required:

  • Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
  • Testing: I have considered likely and/or severe edge cases and have included them in testing.

2. Are the implementation details accurate & efficient?

Required:

  • Logic: I have visually inspected the entire pull request myself.
  • Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

  • Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zppy-interfaces/conda, not just an import statement.

3. Is this well documented?

Required:

  • Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.

4. Is this code clean?

Required:

  • Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
  • Pre-commit checks: All the pre-commits checks have passed.

If applicable:

  • Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

- Only create multi-figure PDFs for 'original' component
- For non-original components, create individual PNGs for each plot
- This removes the bottleneck of creating large multi-page PDFs for components
  that are primarily used for individual image display in the viewer
- Reduce DPI from 150 to 100 for faster rendering and smaller files
- Improve code organization and readability
This significant optimization restructures the data loading pipeline to load
NetCDF files only once for all regions, rather than multiple times (once per
region). Since each NetCDF file already contains data for all regions (global,
northern hemisphere, southern hemisphere), this optimization:

1. Reduces I/O operations dramatically - files are read just once
2. Improves memory efficiency - data is loaded once and reused
3. Decreases processing time - particularly for multi-region analyses

Implementation:
- Added new load_all_region_data function to load data for all regions at once
- Created extract_region_data function to extract specific region data
- Restructured the data flow in coupled_global to process regions after data load
- Maintains backward compatibility with existing API
- Better error handling and reporting for problematic variables

This is a significant performance enhancement for use cases that process
multiple regions from the same dataset.
Copy link
Collaborator

@forsyth2 forsyth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chengzhuzhang. Visual review looks mostly ok. This is a significant refactor, so I want to test the integration tests + the min cases from E3SM-Project/zppy#697 (comment) + the example cfg in E3SM-Project/zppy#694.

Comment on lines +639 to +645
# For non-original components: Create individual PNGs (one plot per file)
else:
# Process each plot individually
for i, plot_name in enumerate(plot_list):
# Create single-plot figure
fig = plt.figure(figsize=[13.5 / 2, 16.5 / 4])
ax = fig.add_subplot(111)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different than the design spec as described in E3SM-Project/zppy#697. In the current code, users can specify make_viewer or not. If they don't, then they can specify nrows and ncols to make custom m x n plot PDFs. The change in this PR appears to dictate that for the user. Original = multi-plot PDF, component = individual plot.

Copy link
Collaborator

@forsyth2 forsyth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chengzhuzhang I ran the min-case cfg files from E3SM-Project/zppy#697, and a reduced version of the example cfg in E3SM-Project/zppy#694. The latter is still running (so far, for 2h12m). The results of the former I have summarized below.

Testing steps
cd ~/ez/zppy-interfaces
git status # Make sure no changes will persist
git fetch upstream
git checkout -b load-data-once-for-all-regions upstream/load-data-once-for-all-regions
git rebase upstream/main # Make sure these commits are off latest `main`
git log
conda clean --all --y
conda env create -f conda/dev.yml -n zi_pr19_20250402
conda activate zi_pr19_20250402 && pip install . # Make sure to `pip install` only after activating the correct env!
pytest tests/unit/global_time_series/test_*.py
# 10 passed in 24.25s

cd ~/ez/zppy
git status # Make sure no changes will persist
git fetch upstream main
git checkout -b test_zppy_zi_pr19_20250402 upstream/main
git log
conda clean --all --y
conda env create -f conda/dev.yml -n zppy_dev_zi_pr19_20250402
conda activate zppy_dev_zi_pr19_20250402 && pip install . # Make sure to `pip install` only after activating the correct env!
pytest tests/test_*.py
# 25 passed in 0.27s


# Edit tests/integration/utils.py
# UNIQUE_ID = "test_zppy_zi_pr19_20250402"
#        "diags_environment_commands": "",
#        "environment_commands_test": "source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.11.0rc13_chrysalis.sh",
#        "global_time_series_environment_commands": "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_pr19_20250402",
# Set: generate_cfgs(unified_testing=True, dry_run=False)
# Keep:
# "user_input_v2": "/lcrc/group/e3sm/ac.forsyth2/",
# "user_input_v3": "/lcrc/group/e3sm2/ac.wlin/",

python tests/integration/utils.py
git add -A # To make it clear if there are later changes

# Run zppy
# Same display order as in https://github.com/E3SM-Project/zppy/pull/697#issuecomment-2762855613
zppy -c tests/integration/generated/test_min_case_global_time_series_viewers_chrysalis.cfg
zppy -c tests/integration/generated/test_min_case_global_time_series_viewers_all_land_variables_chrysalis.cfg
zppy -c tests/integration/generated/test_min_case_global_time_series_viewers_undefined_variables_chrysalis.cfg
zppy -c tests/integration/generated/test_min_case_global_time_series_original_8_missing_ocn_chrysalis.cfg
zppy -c tests/integration/generated/test_min_case_global_time_series_viewers_original_8_chrysalis.cfg
zppy -c tests/integration/generated/test_min_case_global_time_series_viewers_original_atm_plus_land_chrysalis.cfg
emacs examples/post.v3.LR.historical.zppy_v3.cfg
# Copy cfg from https://github.com/E3SM-Project/zppy/pull/694/files
# Update output, www
# Remove irrelevant tasks/subtasks
zppy -c examples/post.v3.LR.historical.zppy_v3.cfg

# Examine output
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_all_land_variables_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_undefined_variables_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_original_8_missing_ocn_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_original_8_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_original_atm_plus_land_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250402_try1/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
# Still running

So far, only viewers_original_8 appears to have problems. The behavior on main is to generate a 8-plot PDF and link it under "Output Sets". The behavior on this PR is to generate an 8-plot PDF (2 of which are blank) and link it under a directory listing.

Full min-case cfg results

I copied the table from E3SM-Project/zppy#697 (comment) and added a column for any changes in behavior:

Test cfg ts:lnd_monthly_glb vars= plots_original= plots_lnd= Viewer link Original Behavior New Behavior
viewers 20+ vars "" 20+ vars viewer atm, lnd link to tables. No original link. No change
viewers_all_land_variables 20+ vars "" "all" viewer atm, lnd link to tables. No original link. That is, with the available variables constricted to the same 20+ vars as in the viewers test cfg, the same plot set generated. No change
viewers_undefined_variables "" "" "all" viewer atm, lnd link to tables. Very large number of lnd variables listed. No change
original_8_missing_ocn N/A default 8 plots default "" N/A Viewer not requested. No results page either, because mpas_analysis task is missing in the cfg and thus global_time_series gets skipped. This is expected behavior. No change
viewers_original_8 N/A default 8 plots default "" viewer In this case, make_viewer=True but the only requested component is the original 8 plots. This ends up looking a little weird, but it works. Clicking the link brings us to an 8-plot single-page PDF There is now a directory list instead of a Viewer-type link to a PDF. Only 6/8 plots appear on the single-page PDF now
viewers_original_atm_plus_land "" 5 atm plots 4 vars viewer atm, lnd link to tables. original links to a 5-plot PDF No Change

@chengzhuzhang
Copy link
Collaborator Author

@chengzhuzhang I ran the min-case cfg files from E3SM-Project/zppy#697, and a reduced version of the example cfg in E3SM-Project/zppy#694. The latter is still running (so far, for 2h12m). The results of the former I have summarized below.
Testing steps

So far, only viewers_original_8 appears to have problems. The behavior on main is to generate a 8-plot PDF and link it under "Output Sets". The behavior on this PR is to generate an 8-plot PDF (2 of which are blank) and link it under a directory listing.
Full min-case cfg results

I think both behaviors are acceptable, the behavior of the this PR preserves what is in the version in current e3sm-unified? The only potential concern is where are the other single plots (png) will be saved? if they are in the same directory, than it can be very crowded to find the original 8-panel plots.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

I think both behaviors are acceptable

Well, we do need to resolve the 2 missing plots (this PR skips "Change in ocean heat content" and "Change in sea level").

The only potential concern is where are the other single plots (png) will be saved? if they are in the same directory,

Yes, they get stored in the same directory. I recall getting the viewers to correctly find the images in the first place was a major difficulty, so if we did change the directory, we'd need to make sure the Viewers can still find the source images.

This is different than the design spec

Per #19 (comment), I have confirmed this behavior change does come up in testing.
On main we'd get a directory listing of multi-plot PNGs and PDFs that group together those PNGs, but on this PR is a directory listing of single-plots PNGs, and no PDF at all.

Additional test
# This tests plots_lnd without the viewer (4x2 PDF)
zppy -c tests/integration/generated/test_min_case_global_time_series_custom_chrysalis.cfg
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_custom_output/test_zppy_zi_pr19_20250402/v3.LR.historical_0051/post/scripts
grep -v "OK" *status 

The www is a directory listing of PNGs, no PDF.

Let's take a look at behavior on main. We can make a few edits to tests/integration/generated/test_min_case_global_time_series_custom_chrysalis.cfg directly:

# Comment out environment_commands for `global_time_series`, so that Unified rc13 is used
#environment_commands = "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_pr19_20250402"

# Set output and www:
output = "/lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_custom_output/test_zppy_zi_pr19_20250402_change_off/v3.LR.historical_0051"
www = "/lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_min_case_global_time_series_custom_www/test_zppy_zi_pr19_20250402_change_off"

Then run:

zppy -c tests/integration/generated/test_min_case_global_time_series_custom_chrysalis.cfg
# RUNNING
cd /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_custom_output/test_zppy_zi_pr19_20250402_change_off/v3.LR.historical_0051/post/scripts
grep -v "OK" *status

The www is a directory listing of multi-plot PNGs and PDFs that group together those PNGs.

@chengzhuzhang
Copy link
Collaborator Author

In this case, we can consider remove this commit d92f093 that changes plotting behavior. My test shows the other commit (load all regions) primarily contribute to the performance gain.

@chengzhuzhang
Copy link
Collaborator Author

@forsyth2 Another option we can propose is to not fix the performance issue for now to not further delay e3sm-unified. In this case, we provide a separate cfg for enable all land variables. This is a fallback, but is perhaps more reasonable in this situation?

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

This is different than the design spec

We need to have a clear decision on the design specification that we can refer back to.

As of main, the behavior is:

Plotting request make_viewer=False make_viewer=True
Only original Directory listing > nrows x ncols PDF Viewer > Original > nrows x ncols PDF
Only component Directory listing > nrows x ncols PDF per component Viewer > component table > links to individual plots
Both untested, presumably combination of above Viewer > Original > nrows x ncols PDF + Viewer > component table > links to individual plots

As of this PR, the behavior is:

Plotting request make_viewer=False make_viewer=True
Only original Directory listing > nrows x ncols PDF Directory listing > nrows x ncols PDF (Different behavior!)
Only component Directory listing > individual plot PNGs (Different behavior!) Viewer > component table > links to individual plots
Both untested, presumably combination of above Viewer > Original > nrows x ncols PDF + Viewer > component table > links to individual plots

@chengzhuzhang Are either of those the desired design specification?

Which "min-case" testing cfg corresponds with which cell in the above tables?

Plotting request make_viewer=False make_viewer=True
Only original original_8_missing_ocn viewers_original_8
Only component custom viewers, viewers_all_land_variables, viewers_undefined_variables
Both N/A viewers_original_atm_plus_land

@chengzhuzhang
Copy link
Collaborator Author

We need to have a clear decision on the design specification that we can refer back to.

Thanks for the summary. Sounds good. We can probably get more input from science groups once this feature is released and exercised more in evaluation.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

not fix the performance issue for now to not further delay e3sm-unified
We can probably get more input from science groups once this feature is released and exercised more in evaluation.

@chengzhuzhang I think this makes sense. For now, we'll have the expected output from the first table (main) as our official design specification.

reduced version of the example cfg in E3SM-Project/zppy#694. The latter is still running (so far, for 2h12m).

I am going to let this finish running to see if any other issues pop up. Currently at 3h38m

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

I am going to let this finish running

Never mind, it wasn't even running in the right environment. I forgot that had to be updated manually, unlike the auto-generated min-case cfg's...

> cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250402_try1/v3.LR.historical_0051/post/scripts
> emacs global_time_series_1985-2014.settings 
  'environment_commands': 'source '
                          '/gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; '
                          'conda activate zi_plots_lnd_20250326',

Which is obviously not the environment source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_pr19_20250402 used in my min-case tests...

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

I do want to see the actual results though:

cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250402_try1/v3.LR.historical_0051/post/scripts
emacs global_time_series_1985-2014.bash
# Manually change the environment call
sbatch global_time_series_1985-2014.bash

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

Ok actually using the correct environment, we do indeed see a faster runtime:

cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250402_try1/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
# No errors
grep -in "Elapsed time" global_time_series_1985-2014.o*
# global_time_series_1985-2014.o721350:3479:Elapsed time: 8292 seconds

That's 8,292 seconds = 138.2 minutes = 2.3 hours to run:

results_dir=global_time_series_1985-2014_results

zi-global-time-series --use_ocn True --input /lcrc/group/e3sm2/ac.wlin/E3SMv3/v3.LR.historical_0051 --input_subdir archive/ocn/hist --moc_file mocTimeSeries_1985-2014.nc --case_dir /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250402_try1/v3.LR.historical_0051 --experiment_name v3.LR.historical_0051 --figstr v3.LR.historical_0051 --color Blue --ts_num_years 30 --plots_original net_toa_flux_restom,global_surface_air_temperature,toa_radiation,net_atm_energy_imbalance,change_ohc,max_moc,change_sea_level,net_atm_water_imbalance --plots_atm TREFHT,AODDUST --plots_ice None --plots_lnd all --plots_ocn None --nrows 4 --ncols 2 --results_dir ${results_dir} --regions glb,n,s --make_viewer True --start_yr 1985 --end_yr 2014

@chengzhuzhang It appears your changes do make the code faster. I think it was taking 6.7 hours before, so this PR is a 2.9x speedup.

www results however suggest there is no viewer created despite make_viewer=True. There is a PDF for each: atm, lnd, and original (again missing 2 plots)

So, I think we're definitely on the right track here with the performance, but there are a few issues to work out first.

chengzhuzhang and others added 2 commits April 2, 2025 20:28
1. Fix issue with missing plots in original component for non-global regions
   - Handle global-only plots in non-global regions by creating empty plots
   - Ensures all plots appear in multi-plot pages

2. Fix viewer creation when make_viewer=True
   - Create viewers for all components with requested plots
   - Properly link to generated PNGs in the viewer

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@chengzhuzhang
Copy link
Collaborator Author

Good to know this PR is likely on the right track to improve performance. I tried to fix the changed behavior for the original plots and viewer last night but I think it needs clear separation of original component from other realms. Given that I'm not very familiar with the code base and I'm not confident to make deep refactoring with the time constraint. I think this should be a future development item after this zppy/zppy-interfaces release.

@chengzhuzhang
Copy link
Collaborator Author

completed by #26

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants