|
10 | 10 | * DOC: teo-description
|
11 | 11 | *
|
12 | 12 | * The idea of this governor is based on the observation that on many systems
|
13 |
| - * timer events are two or more orders of magnitude more frequent than any |
14 |
| - * other interrupts, so they are likely to be the most significant cause of CPU |
15 |
| - * wakeups from idle states. Moreover, information about what happened in the |
16 |
| - * (relatively recent) past can be used to estimate whether or not the deepest |
17 |
| - * idle state with target residency within the (known) time till the closest |
18 |
| - * timer event, referred to as the sleep length, is likely to be suitable for |
19 |
| - * the upcoming CPU idle period and, if not, then which of the shallower idle |
20 |
| - * states to choose instead of it. |
| 13 | + * timer interrupts are two or more orders of magnitude more frequent than any |
| 14 | + * other interrupt types, so they are likely to dominate CPU wakeup patterns. |
| 15 | + * Moreover, in principle, the time when the next timer event is going to occur |
| 16 | + * can be determined at the idle state selection time, although doing that may |
| 17 | + * be costly, so it can be regarded as the most reliable source of information |
| 18 | + * for idle state selection. |
21 | 19 | *
|
22 |
| - * Of course, non-timer wakeup sources are more important in some use cases |
23 |
| - * which can be covered by taking a few most recent idle time intervals of the |
24 |
| - * CPU into account. However, even in that context it is not necessary to |
25 |
| - * consider idle duration values greater than the sleep length, because the |
26 |
| - * closest timer will ultimately wake up the CPU anyway unless it is woken up |
27 |
| - * earlier. |
| 20 | + * Of course, non-timer wakeup sources are more important in some use cases, |
| 21 | + * but even then it is generally unnecessary to consider idle duration values |
| 22 | + * greater than the time till the next timer event, referred to as the sleep |
| 23 | + * length in what follows, because the closest timer will ultimately wake up the |
| 24 | + * CPU anyway unless it is woken up earlier. |
28 | 25 | *
|
29 |
| - * Thus this governor estimates whether or not the prospective idle duration of |
30 |
| - * a CPU is likely to be significantly shorter than the sleep length and selects |
31 |
| - * an idle state for it accordingly. |
| 26 | + * However, since obtaining the sleep length may be costly, the governor first |
| 27 | + * checks if it can select a shallow idle state using wakeup pattern information |
| 28 | + * from recent times, in which case it can do without knowing the sleep length |
| 29 | + * at all. For this purpose, it counts CPU wakeup events and looks for an idle |
| 30 | + * state whose target residency has not exceeded the idle duration (measured |
| 31 | + * after wakeup) in the majority of relevant recent cases. If the target |
| 32 | + * residency of that state is small enough, it may be used right away and the |
| 33 | + * sleep length need not be determined. |
32 | 34 | *
|
33 | 35 | * The computations carried out by this governor are based on using bins whose
|
34 | 36 | * boundaries are aligned with the target residency parameter values of the CPU
|
|
39 | 41 | * idle state 2, the third bin spans from the target residency of idle state 2
|
40 | 42 | * up to, but not including, the target residency of idle state 3 and so on.
|
41 | 43 | * The last bin spans from the target residency of the deepest idle state
|
42 |
| - * supplied by the driver to infinity. |
| 44 | + * supplied by the driver to the scheduler tick period length or to infinity if |
| 45 | + * the tick period length is less than the target residency of that state. In |
| 46 | + * the latter case, the governor also counts events with the measured idle |
| 47 | + * duration between the tick period length and the target residency of the |
| 48 | + * deepest idle state. |
43 | 49 | *
|
44 | 50 | * Two metrics called "hits" and "intercepts" are associated with each bin.
|
45 | 51 | * They are updated every time before selecting an idle state for the given CPU
|
|
49 | 55 | * sleep length and the idle duration measured after CPU wakeup fall into the
|
50 | 56 | * same bin (that is, the CPU appears to wake up "on time" relative to the sleep
|
51 | 57 | * length). In turn, the "intercepts" metric reflects the relative frequency of
|
52 |
| - * situations in which the measured idle duration is so much shorter than the |
53 |
| - * sleep length that the bin it falls into corresponds to an idle state |
54 |
| - * shallower than the one whose bin is fallen into by the sleep length (these |
55 |
| - * situations are referred to as "intercepts" below). |
| 58 | + * non-timer wakeup events for which the measured idle duration falls into a bin |
| 59 | + * that corresponds to an idle state shallower than the one whose bin the sleep |
| 60 | + * length falls into (these events are also referred to as "intercepts" |
| 61 | + * below). |
56 | 62 | *
|
57 | 63 | * In order to select an idle state for a CPU, the governor takes the following
|
58 | 64 | * steps (modulo the possible latency constraint that must be taken into account
|
59 | 65 | * too):
|
60 | 66 | *
|
61 |
| - * 1. Find the deepest CPU idle state whose target residency does not exceed |
62 |
| - * the current sleep length (the candidate idle state) and compute 2 sums as |
63 |
| - * follows: |
| 67 | + * 1. Find the deepest enabled CPU idle state (the candidate idle state) and |
| 68 | + * compute 2 sums as follows: |
64 | 69 | *
|
65 |
| - * - The sum of the "hits" and "intercepts" metrics for the candidate state |
66 |
| - * and all of the deeper idle states (it represents the cases in which the |
67 |
| - * CPU was idle long enough to avoid being intercepted if the sleep length |
68 |
| - * had been equal to the current one). |
| 70 | + * - The sum of the "hits" metric for all of the idle states shallower than |
| 71 | + * the candidate one (it represents the cases in which the CPU was likely |
| 72 | + * woken up by a timer). |
69 | 73 | *
|
70 |
| - * - The sum of the "intercepts" metrics for all of the idle states shallower |
71 |
| - * than the candidate one (it represents the cases in which the CPU was not |
72 |
| - * idle long enough to avoid being intercepted if the sleep length had been |
73 |
| - * equal to the current one). |
| 74 | + * - The sum of the "intercepts" metric for all of the idle states shallower |
| 75 | + * than the candidate one (it represents the cases in which the CPU was |
| 76 | + * likely woken up by a non-timer wakeup source). |
74 | 77 | *
|
75 |
| - * 2. If the second sum is greater than the first one the CPU is likely to wake |
76 |
| - * up early, so look for an alternative idle state to select. |
| 78 | + * 2. If the second sum computed in step 1 is greater than a half of the sum of |
| 79 | + * both metrics for the candidate state bin and all subsequent bins (if any), |
| 80 | + * a shallower idle state is likely to be more suitable, so look for it. |
77 | 81 | *
|
78 |
| - * - Traverse the idle states shallower than the candidate one in the |
| 82 | + * - Traverse the enabled idle states shallower than the candidate one in the |
79 | 83 | * descending order.
|
80 | 84 | *
|
81 | 85 | * - For each of them compute the sum of the "intercepts" metrics over all
|
82 | 86 | * of the idle states between it and the candidate one (including the
|
83 | 87 | * former and excluding the latter).
|
84 | 88 | *
|
85 |
| - * - If each of these sums that needs to be taken into account (because the |
86 |
| - * check related to it has indicated that the CPU is likely to wake up |
87 |
| - * early) is greater than a half of the corresponding sum computed in step |
88 |
| - * 1 (which means that the target residency of the state in question had |
89 |
| - * not exceeded the idle duration in over a half of the relevant cases), |
90 |
| - * select the given idle state instead of the candidate one. |
| 89 | + * - If this sum is greater than a half of the second sum computed in step 1, |
| 90 | + * use the given idle state as the new candidate one. |
91 | 91 | *
|
92 |
| - * 3. By default, select the candidate state. |
| 92 | + * 3. If the current candidate state is state 0 or its target residency is short |
| 93 | + * enough, return it and prevent the scheduler tick from being stopped. |
| 94 | + * |
| 95 | + * 4. Obtain the sleep length value and check if it is below the target |
| 96 | + * residency of the current candidate state, in which case a new shallower |
| 97 | + * candidate state needs to be found, so look for it. |
93 | 98 | */
|
94 | 99 |
|
95 | 100 | #include <linux/cpuidle.h>
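The bin layout and steps 1 and 2 of the updated description can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the governor's code: `struct bin`, `target_residency_us`, `duration_to_bin()` and `select_candidate()` are hypothetical names, the residency values are made up, all idle states are assumed to be enabled, the last bin is treated as unbounded, and the counters are plain integers. The sum of the "hits" metric from step 1 is left out because the steps sketched here do not use it, and stopping at the first shallower state that passes the check in step 2 is one reading of the description. Steps 3 and 4 (the short target residency shortcut and the sleep length check) are not covered.

```c
#define NR_STATES 4

/* Hypothetical per-bin counters (one bin per idle state). */
struct bin {
	unsigned int hits;
	unsigned int intercepts;
};

/* Made-up target residencies in microseconds, shallowest state first. */
static const unsigned int target_residency_us[NR_STATES] = { 0, 20, 200, 2000 };

/*
 * Return the index of the bin a duration falls into: bin i spans from the
 * target residency of state i up to, but not including, that of state i + 1.
 * The last bin is treated as unbounded in this sketch.
 */
static int duration_to_bin(unsigned int duration_us)
{
	int i, bin = 0;

	for (i = 1; i < NR_STATES; i++) {
		if (target_residency_us[i] > duration_us)
			break;
		bin = i;
	}
	return bin;
}

/* Steps 1 and 2: pick a candidate state from the recent wakeup pattern. */
static int select_candidate(const struct bin bins[NR_STATES])
{
	/* Step 1: start from the deepest state (all assumed enabled). */
	int candidate = NR_STATES - 1;
	unsigned int intercept_sum = 0, deep_sum = 0, partial = 0;
	int i;

	/* "Intercepts" for all states shallower than the candidate. */
	for (i = 0; i < candidate; i++)
		intercept_sum += bins[i].intercepts;

	/* Both metrics for the candidate bin and all subsequent bins. */
	for (i = candidate; i < NR_STATES; i++)
		deep_sum += bins[i].hits + bins[i].intercepts;

	/* Step 2: keep the candidate unless intercepts clearly dominate. */
	if (2 * intercept_sum <= deep_sum)
		return candidate;

	/* Walk the shallower states in descending order. */
	for (i = candidate - 1; i >= 0; i--) {
		partial += bins[i].intercepts;
		/* More than a half of the shallower intercepts are covered. */
		if (2 * partial > intercept_sum)
			return i;
	}
	return candidate;
}
```

With the made-up residencies above, a measured idle duration of 199 us lands in bin 1 and one of 200 us lands in bin 2. If most recent wakeups were intercepts in the bin of state 2, `select_candidate()` moves the candidate from state 3 to state 2 without needing the sleep length.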
|
|