File tree 2 files changed +61
-0
lines changed
@@ -6,3 +6,53 @@ communication, so can easily be overloaded.
6
6
7
7
The application is installed as a python module with a shell
8
8
script wrapper. The only requirement is MPI4PY.
9
+
10
+ ## Background
11
+
12
+ Amdahl's law posits that some unit of work comprises a proportion *p* that
13
+ benefits from parallel resources, and a proportion *s* that is constrained to
14
+ execute in serial. The theoretical maximum speedup achievable for such a
15
+ workload is
16
+
17
+ ``` output
18
+ 1
19
+ S = -------
20
+ s + p/N
21
+ ```
22
+
23
+ where *S* is the speedup relative to performing all of the work in serial and
24
+ *N* is the number of parallel workers. A plot of *S* vs. *N* ought to look like
25
+ this, for *p* = 0.8:
26
+
27
+ ``` output
28
+ 5┬─────────────────────────────────────·──────────────────┐
29
+ │ · │
30
+ │ · │
31
+ │ · │
32
+ 4┤ · │
33
+ │ · │
34
+ S │ · *
35
+ p │ · * * │
36
+ e │ · * │
37
+ e 3┤ · * │
38
+ d │ · * │
39
+ u │ · * │
40
+ p │ · │
41
+ │ ·* │
42
+ 2┤ · │
43
+ │ * · │
44
+ │ · │
45
+ │ · │
46
+ │ · │
47
+ 1*─────┬──────┬─────┬─────┬──────┬─────┬─────┬──────┬─────┤
48
+ 1 2 3 4 5 6 7 8 9 10
49
+ Workers
50
+ ```
51
+
52
+ "Ideal scaling" (*p* = 1) would be the line *y* = *x* (or *S* = *N*),
53
+ represented here by the dotted line.
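The points in the plot can be reproduced numerically. The following is a minimal sketch, not part of this change: the function names `amdahl_speedup` and `speedup_limit` are hypothetical, and the only inputs are the formula above and *p* = 0.8 with *s* = 1 - *p*.

```python
def amdahl_speedup(n_workers, parallel_proportion=0.8):
    """Theoretical speedup S = 1 / (s + p/N), with s = 1 - p."""
    serial_proportion = 1.0 - parallel_proportion
    return 1.0 / (serial_proportion + parallel_proportion / n_workers)


def speedup_limit(parallel_proportion=0.8):
    """Upper bound on S as N grows without limit, which is 1/s."""
    serial_proportion = 1.0 - parallel_proportion
    if serial_proportion == 0:
        return float("inf")  # ideal scaling (p = 1) has no upper bound
    return 1.0 / serial_proportion


if __name__ == "__main__":
    # Tabulate the same range of workers as the plot's horizontal axis.
    for n in range(1, 11):
        print(f"N = {n:2d}  S = {amdahl_speedup(n):.2f}")
    print(f"limit as N grows: {speedup_limit():.2f}")
```

With *p* = 0.8 the serial proportion is 0.2, so the speedup can never exceed 1/0.2 = 5 regardless of how many workers are added, which is why the starred curve falls away from the dotted line.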
54
+
55
+ This graph shows there is a speed limit for every workload, and diminishing
56
+ returns on throwing more parallel processors at a problem. It is worth running
57
+ a "scaling study" to assess how far away that speed limit might be for the
58
+ given task.
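One way a scaling study could be analyzed is to invert Amdahl's law for *p* from each measured run. The sketch below assumes that approach; the helper name `estimate_parallel_proportion` is hypothetical, and the timings are made-up values generated from the formula itself (30 s of work, *p* = 0.8), not measurements from this application.

```python
def estimate_parallel_proportion(serial_time, parallel_time, n_workers):
    """Invert S = 1 / ((1 - p) + p/N) for p, given one run with N > 1 workers."""
    if n_workers < 2:
        raise ValueError("need at least 2 workers to estimate p")
    speedup = serial_time / parallel_time
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_workers)


# Hypothetical wall-clock times in seconds, keyed by number of workers.
timings = {1: 30.0, 2: 18.0, 4: 12.0, 8: 9.0}

if __name__ == "__main__":
    serial_time = timings[1]
    for n_workers, elapsed in sorted(timings.items()):
        if n_workers == 1:
            continue  # the single-worker run is the baseline
        speedup = serial_time / elapsed
        p = estimate_parallel_proportion(serial_time, elapsed, n_workers)
        print(f"N = {n_workers}  S = {speedup:.2f}  estimated p = {p:.2f}")
```

On real timings the estimates of *p* will generally differ from run to run; a drift downward as *N* grows would suggest overheads, such as communication, that the simple model does not capture.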
4
4
5
5
from mpi4py import MPI
6
6
7
+ """
8
+ Gather timing data in order to plot speedup *S* vs. number of cores *N*,
9
+ which should follow Amdahl's Law:
10
+
11
+ 1
12
+ S = -------
13
+ s + p/N
14
+
15
+ where *s* is the serial proportion of the total work and *p* the
16
+ parallelizable proportion.
17
+ """
7
18
8
19
def do_work(work_time=30, parallel_proportion=0.8, comm=MPI.COMM_WORLD):
9
20
# How many MPI ranks (cores) are we?