Commit 299ce0c

Merge pull request #9 from E3SM-Project/ambrad/update-4seasons-paper-fig
Update 4seasons paper fig.
2 parents 97b9b69 + ca656f0 commit 299ce0c

10 files changed: 2339 additions & 14 deletions

screamv1-frontier-feb2023/data/frontier-v1-scream-gb-o3-ne1024-nnodes1024.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389463.230729-075018-model_timing_stats

Lines changed: 461 additions & 0 deletions
Large diffs are not rendered by default.

screamv1-frontier-feb2023/data/frontier-v1-scream-gb-o3-ne1024-nnodes2048.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389462.230729-021333-model_timing_stats

Lines changed: 461 additions & 0 deletions
Large diffs are not rendered by default.

screamv1-frontier-feb2023/data/frontier-v1-scream-gb-o3-ne1024-nnodes4096.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389464.230729-022636-model_timing_stats

Lines changed: 461 additions & 0 deletions
Large diffs are not rendered by default.

screamv1-frontier-feb2023/data/frontier-v1-scream-gb-o3-ne1024-nnodes512.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389465.230729-055449-model_timing_stats

Lines changed: 461 additions & 0 deletions
Large diffs are not rendered by default.

screamv1-frontier-feb2023/data/frontier-v1-scream-gb-o3-ne1024-nnodes8192.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389460.230730-003528-model_timing_stats

Lines changed: 461 additions & 0 deletions
Large diffs are not rendered by default.

screamv1-frontier-feb2023/figs/figs.hy

Lines changed: 5 additions & 5 deletions
@@ -7,7 +7,7 @@
 (assoc matplotlib.rcParams "savefig.dpi" 300))

 (defn get-context [&optional [eul False] [threaded True]]
-(sv prefix (+ "frontier-v1-scaling1-rocm5?-"
+(sv prefix (+ "frontier-v1-*-nnodes"
 (if eul "eul-" "")
 (if threaded "" "nothrd-"))
 timers (, "CPL:RUN_LOOP" "CPL:ATM_RUN" "a:tl-sc prim_run_subcycle_c"
@@ -20,7 +20,7 @@
 {:prefix prefix
 :machine-name "Frontier"
 :compset "ne1024pg2_ne1024pg2.F2010-SCREAMv1"
-:glob-data (+ "../data/" prefix "nnodes*-model_timing_stats")
+:glob-data (+ "../data/" prefix "*-model_timing_stats")
 :timers0 (cut timers 0 1)
 :timers1 (cut timers 0 2)
 :timers2 (cut timers 0 5)
@@ -195,14 +195,14 @@
 (if cpu-sizing
 (cond [(= timer-set :timersa)
 (sv y [20 30 40 50 60 70 80 90 100 125 150 175 200 250 300 350
-400 450 500 550 600]
+400 450 500 550 600 700]
 yscale 90)
-(pl.ylim (, 20 600))]
+(pl.ylim (, 20 750))]
 [(= timer-set :timersbcut)
 (sv y [80 90 100 125 150 175 200 250 300 350 400 450 500 600 700 800 900
 1000 1200 1400 1600 1800 2000 2500 3000 3500]
 yscale 500)
-(pl.ylim (, 80 3500))])
+(pl.ylim (, 80 3800))])
 (cond [(= timer-set :timers2)
 (sv y [50 100 150 200 300 400 500 600 700 800 900 1000
 1200 1400 1600 1800]
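
The prefix and :glob-data changes in figs.hy combine into one shell-style wildcard pattern. The following is a minimal Python sketch (Python rather than Hy, so it is easy to check) of what that pattern looks like and why the old one could not pick up the new data sets. It assumes the glob follows standard shell wildcard rules; glob_pattern and the constant names are hypothetical and not part of figs.hy. The two file names are taken from readme.txt.

```python
from fnmatch import fnmatchcase

# Pattern produced by the pre-change code with default arguments.
OLD_PATTERN = "../data/frontier-v1-scaling1-rocm5?-nnodes*-model_timing_stats"

def glob_pattern(eul=False, threaded=True):
    """Rebuild the :glob-data string as the updated get-context concatenates it."""
    prefix = ("frontier-v1-*-nnodes"
              + ("eul-" if eul else "")
              + ("" if threaded else "nothrd-"))
    return "../data/" + prefix + "*-model_timing_stats"

# One file from the Feb 2023 window and one from the later runs.
OLD_RUN = ("../data/frontier-v1-scaling1-rocm51-nnodes512.ne1024pg2_ne1024pg2."
           "F2010-SCREAMv1-timing.1271946.230210-212030-model_timing_stats")
NEW_RUN = ("../data/frontier-v1-scream-gb-o3-ne1024-nnodes8192.ne1024pg2_ne1024pg2."
           "F2010-SCREAMv1-1389460.230730-003528-model_timing_stats")

new_pattern = glob_pattern()
new_matches = [fnmatchcase(name, new_pattern) for name in (OLD_RUN, NEW_RUN)]
old_matches = [fnmatchcase(name, OLD_PATTERN) for name in (OLD_RUN, NEW_RUN)]
```

With the default arguments the new pattern accepts both naming schemes, while the old pattern accepts only the scaling1-rocm5? files. Note that with eul set to True the updated code appends "eul-" after "nnodes", giving "../data/frontier-v1-*-nnodeseul-*-model_timing_stats"; whether any data files are named that way is not shown in this commit.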

screamv1-frontier-feb2023/figs/perf-table.tex

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@
 Frontier & 512 & 58.3 & 64.8 & 82.3 & 288.8 & 306.0 & 337.6 \\
 & 1024 & 103.7 & 118.3 & 149.9 & 562.0 & 596.0 & 620.5 \\
 & 2048 & 175.6 & 197.6 & 252.1 & 1012.0 & 1069.2 & 1043.6 \\
-& 4096 & 261.5 & 311.2 & 395.4 & 1776.0 & 1568.6 & 1786.4 \\
-& 8192 & 297.4 & 384.1 & 550.6 & 3171.3 & 2820.8 & 2991.9 \\
+& 4096 & 283.2 & 332.8 & 424.0 & 1760.2 & 1886.0 & 1693.5 \\
+& 8192 & 419.5 & 514.5 & 645.3 & 3173.7 & 3349.6 & 2911.1 \\
 \hline
 Summit & 1024 & 70.0 & 78.2 & 100.8 & 401.4 & 465.9 & 382.3 \\
 & 2048 & 109.4 & 132.9 & 169.4 & 744.1 & 870.9 & 679.3 \\
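
To see how far the two updated Frontier rows moved, each new entry can be divided by the entry it replaces. The sketch below does only that arithmetic. The column headings are not part of this hunk, so the values are treated as opaque numbers, and the names OLD_ROWS, NEW_ROWS, and relative_change are illustrative only.

```python
# Rows removed and added by the perf-table.tex hunk, keyed by node count.
OLD_ROWS = {
    4096: [261.5, 311.2, 395.4, 1776.0, 1568.6, 1786.4],
    8192: [297.4, 384.1, 550.6, 3171.3, 2820.8, 2991.9],
}
NEW_ROWS = {
    4096: [283.2, 332.8, 424.0, 1760.2, 1886.0, 1693.5],
    8192: [419.5, 514.5, 645.3, 3173.7, 3349.6, 2911.1],
}

def relative_change(old, new):
    """Return new/old for each column; 1.0 means the entry did not change."""
    return [n / o for o, n in zip(old, new)]

ratios = {nodes: relative_change(OLD_ROWS[nodes], NEW_ROWS[nodes])
          for nodes in OLD_ROWS}

for nodes, row in sorted(ratios.items()):
    print(nodes, " ".join(f"{r:.3f}" for r in row))
```

For instance, the first entry of the 8192-node row comes out at roughly 1.41 times its previous value, while the fourth entry of that row is essentially unchanged.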
46 Bytes
Binary file not shown.
9.3 KB

screamv1-frontier-feb2023/readme.txt

Lines changed: 27 additions & 7 deletions
@@ -1,4 +1,5 @@
-This directory contains data and code for the Frontier window 10-20 Feb 2023.
+This directory contains data and code for the Frontier window 10-20 Feb 2023,
+plus later additional Frontier runs.

 The run*sh scripts are one-off run scripts. They take as an argument the number
 of nodes to run on. run-rocm54.sh and run-rocm54-eul.sh use the tag
@@ -7,7 +8,11 @@ run-rocm51.sh uses the tag
 https://github.com/E3SM-Project/scream/releases/tag/archive%2Fscreamv1-frontier-feb2023-rocm51
 The rocm/5.4, cce/15.0.0 configuration has issues with BLAS in the LND
 component. The rocm/5.1, cce/14.0.2 configuration seems fine. As a result, our
-figures use the rocm51-annotated data.
+figures use the rocm51-annotated data. Later, we were able to redo the
+large-scale simulations. These use the branch
+https://github.com/E3SM-Project/scream/tree/sarats/frontier-gb which is
+essentially the same rocm/5.1 configuration. These are the
+frontier-v1-scream-gb-o3-ne1024 data sets.

 jobmonitor.py is a tool to monitor a single job. If e3sm.exe terminates but the
 job hangs, jobmonitor.py will kill it, thus minimizing hanging time.
@@ -28,8 +33,23 @@ The figs/ directory contains hy (version 0.20 running on any python3) code to
 summarize and plot the data.

 The figures use the following subset of the data:
-frontier-v1-scaling1-rocm51-nnodes512.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1271946.230210-212030-model_timing_stats
-frontier-v1-scaling1-rocm51-nnodes1024.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1273705.230216-000516-model_timing_stats
-frontier-v1-scaling1-rocm51-nnodes2048.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1273525.230215-230716-model_timing_stats
-frontier-v1-scaling1-rocm54-nnodes4096.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1272650.230213-074922-model_timing_stats
-frontier-v1-scaling1-rocm54-nnodes8192.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1272541.230212-204538-model_timing_stats
+Frontier:
+frontier-v1-scaling1-rocm51-nnodes512.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1271946.230210-212030-model_timing_stats
+frontier-v1-scaling1-rocm51-nnodes1024.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1273705.230216-000516-model_timing_stats
+frontier-v1-scaling1-rocm51-nnodes2048.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.1273525.230215-230716-model_timing_stats
+frontier-v1-scream-gb-o3-ne1024-nnodes4096.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389464.230729-022636-model_timing_stats
+frontier-v1-scream-gb-o3-ne1024-nnodes8192.ne1024pg2_ne1024pg2.F2010-SCREAMv1-1389460.230730-003528-model_timing_stats
+Summit:
+screamv1-summit-oct2022/data/scream-v1-scaling2-nnodes1024.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.2495303.221008-023937-model_timing_stats
+screamv1-summit-oct2022/data/scream-v1-scaling2-nnodes2048.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.2495304.221009-093803-model_timing_stats
+screamv1-summit-oct2022/data/scream-v1-scaling2-nnodes3072.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.2495590.221008-072336-model_timing_stats
+screamv1-summit-oct2022/data/scream-v1-scaling2-nnodes4096.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.2497059.221010-173053-model_timing_stats
+screamv1-summit-oct2022/data/scream-v1-scaling2-nnodes4608.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.2495935.221009-090433-model_timing_stats
+Perlmutter CPU
+screamv1-pm-cpu-mar2023/data/pm-cpu-v1-scaling1-nnodes1536.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.6746630.230401-112731-model_timing_stats
+screamv1-pm-cpu-mar2023/data/pm-cpu-v1-scaling1-nnodes2048.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.6907074.230404-030231-model_timing_stats
+Perlmutter GPU
+screamv1-pm-gpu-mar2023/data/pm-gpu-v1-scaling1-nnodes384.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.8132452.230425-101318-model_timing_stats
+screamv1-pm-gpu-mar2023/data/pm-gpu-v1-scaling1-nnodes512.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.5993462.230309-205142-model_timing_stats
+screamv1-pm-gpu-mar2023/data/pm-gpu-v1-scaling1-nnodes1024.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.8421021.230505-110231-model_timing_stats
+screamv1-pm-gpu-mar2023/data/pm-gpu-v1-scaling1-nnodes1536.ne1024pg2_ne1024pg2.F2010-SCREAMv1-timing.8168038.230430-014707-model_timing_stats
