Description
We observed a significant slowdown in RRMxx tests related to output writing (machine: dane). Two tests, O34 and O38, differ only in their yaml files, yet their throughput in SYPD (simulated years per day) is 0.066 for O34 vs. 0.24 for O38, a factor of about 3.6.
Figures 1 and 2 show the timers for the O34 and O38 tests, respectively. Note that the walltimes must be scaled before they are compared, since O34 is an 8-day test and O38 is a 5-day test. The yamls with notably high walltime are highlighted in red. The summaries for "write_total", "run_output_streams", and "horiz_remap" are shown in bold. The slowest streams in O34:
- the "Betts" yamls include horiz_remapper (pg2 -> 1x1 highres domain) + high-frequency 2D/3D outputs + conditional sampling
- the "more" yaml includes horiz_remapper (pg2 -> 1x1 highres domain) + high-frequency 2D/3D outputs
- the "coarse" yaml includes horiz_remapper (pg2 -> ne30 global) + 2D/3D outputs
It seems that the combination of horiz_remapper and high-frequency outputs is the most likely cause of the slowdown. Conditional sampling could be another contributing factor, as 1hI more is much faster than 1hI Betts in O38, even though the former includes more 2D and 3D variables.
Also worth noting: the 1hA ("hourly average") Betts output in O34 has a wallclock comparable to that of the 1step Betts output, and both are much slower than the 1hI ("hourly instant") Betts output in O38. These Betts yamls share the same variable list and differ only in averaging_type and frequency.
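Because the two runs differ in length, the timer comparison is easier to read once each timer is expressed per simulated day. The snippet below is a minimal sketch of that normalization and of the SYPD ratio. The helper names and the 365-day year are assumptions made for illustration, not something taken from the model's timing tools, and the wallclock it derives from SYPD may not match the timer files exactly if SYPD excludes initialization.

```python
# Sketch only: helper names and the 365-day year are assumptions.
DAYS_PER_YEAR = 365.0  # assumed no-leap calendar


def wallclock_hours(sypd, simulated_days):
    """Wallclock hours implied by a throughput in simulated years per day."""
    simulated_years = simulated_days / DAYS_PER_YEAR
    return simulated_years / sypd * 24.0


def per_simulated_day(timer_seconds, simulated_days):
    """Scale a raw timer value to seconds per simulated day."""
    return timer_seconds / simulated_days


# SYPD values reported above for the two tests.
sypd_o34 = 0.066
sypd_o38 = 0.24
slowdown = sypd_o38 / sypd_o34

print(f"O38 throughput is {slowdown:.1f}x that of O34")
```

With this scaling, a timer such as "write_total" from the 8-day O34 run would be divided by 8 and the same timer from the 5-day O38 run by 5 before the two are compared.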
The yaml files can be found here:
- O34: https://portal.nersc.gov/cfs/e3sm/zhang73/yaml/debug/2240x1_ndaysx8_E3SMv1SSP585-UVTQ2d-s20151001-O34Betts/data/
- O38: https://portal.nersc.gov/cfs/e3sm/zhang73/yaml/debug/2240x1_ndaysx5_E3SMv1SSP585-UVTQ2d-s20151001-O38-5minAsite-rad3/data/
@AaronDonahue @bartgol Do you have any ideas regarding this behavior? It may not be a high-priority issue, since 1hI output is sufficient for current needs, but it seems worth noting.
=============

