-
Notifications
You must be signed in to change notification settings - Fork 640
remove unneeded fields for dev/ufs-weather-model #1317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove unneeded fields for dev/ufs-weather-model #1317
Conversation
* remove wav_grdout routine, now moved into wav_history_mod
* remove more cap stuff from w3iorsmd. only ww3 native filenaming is possible w/ binary restarts * remove ifdef w3_ascii from w3wavemd, since the ascii commit is not yet present in mesh cap branch * clean up config variable logic for filenaming
* nrqrs can be non-zero if also using the restart from binary option
* flpart needs to be set for either historync or not
* move restart and history nc blocks outside of ww3 time testing block. * add log print messages for reading and writing restarts
* get logging working correctly for ufs * fix noclobber variable and file name in wav_history * clarify some comments
* make binary history files match when restartnc=true * rework wav_restart_mod, which originally was designed to be able to read and write restarts for testing purposes from inside wav_comp_nuopc.
* verboselog is true by default, but can be set false by config * the header text for ww3 logging as it steps through time is now turned off in w3init and placed into the mesh cap. this allows the mesh cap to order the logging correctly
* move block where addrstflds was set to before call to w3init since restarts are read in w3init * ensure that if nml file lacks a specification of extra fields, the default value of "unset" will not be returned as a field name
* only ice is added for now * tab cleanup in w3grid
* need to send explicit array bounds for ice array since it is 0:nsea * all baselines b4b against f9531d0
* intialize floutg and floutg2 which are can be unintialized when waves are in slow loop and historync is true
* local arrays dimension nx or greater are locally allocated * deallocations added, even though not required for user clarity
…o feature/pio4ww3
* global_input and _output can be allocatable
This reverts commit 7244b76.
6ffebef to
a64f2dd
Compare
* minimally allocate fields used in w3updtmd when using cmeps * use fields provided by cmeps w/o fake time-interpolation
5f370a4 to
525e749
Compare
…nto feature/nodupfields
|
An update on this PR; I've known for quite some time that I was seeing non-debug gnu test failures with this branch. GNU +debug itself passes, as do all the INTEL,INTEL+DEBUG. I've just tested on Hera w/ the newly re-enabled GNU tests and I get the same result. On Hera, these two tests fail comparison. The same two tests pass on Hercules. EDIT: The Hera+GNU tests do pass against a generated baselines. Since the DEBUG tests pass against the existing baseline, I think the Hera failures are due to a combination of compiler, along w/ order-of-operation differences in the PR branch. |
|
@DeniseWorthen - I'll run these again when it's closer to time, so I'm not putting the full logs but the standalone regtest diffs: |
|
@JessicaMeixner-NOAA One thing I noticed (but did not make a change for) is that there seem to be some missing OMP statements in w3updtmd.F90 Lines 999 to 1022 in f27fe03
and Lines 2697 to 2707 in f27fe03
|
|
Thanks for pointing that out. I made an issue: #1444 so we can make sure to go back to that. |
JessicaMeixner-NOAA
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR1_MPI_e (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e_c (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e (1 files differ)
mww3_test_03/./work_PR2_UQ_MPI_e (1 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2 (15 files differ)
mww3_test_03/./work_PR1_MPI_d2 (10 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c (16 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c (15 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2 (15 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2 (15 files differ)
mww3_test_03/./work_PR3_UQ_MPI_e (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2 (15 files differ)
mww3_test_09/./work_MPI_ASCII (0 files differ)
ww3_tp2.10/./work_MPI_OMPH (7 files differ)
ww3_tp2.16/./work_MPI_OMPH (4 files differ)
ww3_tp2.6/./work_ST4_ASCII (0 files differ)
ww3_ufs1.3/./work_a (2 files differ)
matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt
Thank you @DeniseWorthen !!
9939046
into
NOAA-EMC:dev/ufs-weather-model


Pull Request Summary
Removes un-needed allocations and time-interpolation of import fields when CMEPS is used. Note this PR is built on top of the PIO PR. Changes are minimal when compared to that PR.
Description
Introduces a logical flag (
use_cmeps) which is false by default. When true, arrays that are not needed at the global domain resolution are minimally allocated and time-interpolation w/in the update routines is skipped.To test the impact, three cases were run using the same DATM-S2SW configuration used for testing the PIO work on Gaea. Each case was run out 48 hours on 816 WAV tasks, with WW3 restarts written every 2 hours.
The
ProfileMemoryconfiguration flag was turned on and memory statistics for the WW3 PETs were pulled out for every 50th PE (18 total traces). Three values were plotted:VmPeak,VmRSSandCommittedPercent. The first is the max resident size, the second is the mean resident size and the third is (as defined here) an interesting number I happened to notice in the output. I'm not sure of this last value, but it seems relevant---for example, if I use too few tasks for WAV, the model crashes and I also see values likeCommittedPercent: 166.939 %.Each of the following figures show the current dev/ufs-weather-model branch (top), the feature/pio4ww3 branch (middle) and this feature branch (bottom):
VmPeak:
VmRss:
Note here that the highest trace on the top figure shows the values for the last (
naprst) PET.CommittedPercent:
Note here that the lowest trace on all figures is the first PET.
My interpretation of these results is that a) the PIO feature itself reduces memory usage by reducing the high-memory cost of the
naprsttask and b) this feature has a measurable impact on the memory footprint of the model. However, no consistent impact on model execution time was found in my limited testing.Issue(s) addressed
Commit Message
Check list
Testing
I ran this PR against a baseline created with PR #1303. All baselines passed.