Commit 7471f96

Merge pull request #874 from marquiz/backports/release-0.7
[release-0.7] backport fixes from master
2 parents 7e481e2 + b37d638 commit 7471f96

3 files changed, 147 additions and 8 deletions

docs/policy/balloons.md

Lines changed: 144 additions & 6 deletions
@@ -3,11 +3,47 @@
 ## Overview
 
 The balloons policy implements workload placement into "balloons" that
-are pools of exclusive CPUs. A balloon can be inflated and deflated,
-that is CPUs added and removed, based on the CPU resource requests of
-the workloads in the balloon. The policy supports both static and
-dynamically created and popped balloons. The balloons policy enables
-configuring balloon-specific CPU classes.
+are disjoint CPU pools. Balloons can be inflated and deflated, that is,
+CPUs added and removed, based on the CPU resource requests of
+containers. Balloons can be static or dynamically created and
+destroyed. CPUs in balloons can be configured, for example, by setting
+min and max frequencies on CPU cores and uncore.
+
+## How It Works
+
+1. The user configures balloon types from which the policy instantiates
+   balloons.
+
+2. A balloon has a set of CPUs and a set of containers that run on the
+   CPUs.
+
+3. Every container is assigned to exactly one balloon. A container is
+   allowed to use all CPUs of its balloon and no other CPUs.
+
+4. Every logical CPU belongs to at most one balloon. There can be CPUs
+   that do not belong to any balloon.
+
+5. The number of CPUs in a balloon can change during the lifetime of
+   the balloon. If a balloon inflates, that is, CPUs are added to it,
+   all containers in the balloon are allowed to use more CPUs. If a
+   balloon deflates, the opposite is true.
+
+6. When a new container is created on a Kubernetes node, the policy
+   first decides the type of the balloon that will run the
+   container. The decision is based on annotations of the pod, or the
+   namespace if annotations are not given.
+
+7. Next the policy decides which balloon of the decided type will run
+   the container. Options are:
+   - an existing balloon that already has enough CPUs to run its
+     current and new containers
+   - an existing balloon that can be inflated to fit its current and
+     new containers
+   - a new balloon.
+
+8. When a CPU is added to a balloon or removed from it, the CPU is
+   reconfigured based on the balloon's CPU class attributes, or the
+   idle CPU class attributes.
 
 ## Deployment
 
@@ -23,25 +59,127 @@ system of CRI-RM. See [setup and
 usage](../setup.md#setting-up-cri-resource-manager) for more details
 on managing the configuration.
 
+### Parameters
+
+Balloons policy parameters:
+
+- `PinCPU` controls pinning a container to CPUs of its balloon. The
+  default is `true`: the container cannot use other CPUs.
+- `PinMemory` controls pinning a container to the memories that are
+  closest to the CPUs of its balloon. Pinning memory disallows using
+  memory from other NUMA nodes.
+- `IdleCPUClass` specifies the CPU class of those CPUs that do not
+  belong to any balloon.
+- `ReservedPoolNamespaces` is a list of namespaces (wildcards allowed)
+  that are assigned to the special reserved balloon, that is, will run
+  on reserved CPUs. This always includes the `kube-system` namespace.
+- `BalloonTypes` is a list of balloon type definitions. Each type can
+  be configured with the following parameters:
+  - `Name` of the balloon type. This is used in pod annotations to
+    assign containers to balloons of this type.
+  - `Namespaces` is a list of namespaces (wildcards allowed) whose
+    pods should be assigned to this balloon type, unless overridden by
+    pod annotations.
+  - `MinBalloons` is the minimum number of balloons of this type that
+    are always present, even if the balloons do not have any
+    containers. The default is 0: if a balloon has no containers, it
+    can be destroyed.
+  - `MaxCPUs` specifies the maximum number of CPUs in any balloon of
+    this type. Balloons will not be inflated larger than this. 0 means
+    unlimited.
+  - `MinCPUs` specifies the minimum number of CPUs in any balloon of
+    this type. When a balloon is created or deflated, it will always
+    have at least this many CPUs, even if containers in the balloon
+    request fewer.
+  - `CpuClass` specifies the name of the CPU class according to which
+    CPUs of balloons are configured.
+  - `PreferSpreadingPods`: if `true`, containers of the same pod
+    should be spread to different balloons of this type. The default
+    is `false`: prefer placing containers of the same pod in the same
+    balloon(s).
+  - `PreferPerNamespaceBalloon`: if `true`, containers in the same
+    namespace will be placed in the same balloon(s). On the other
+    hand, containers in different namespaces are preferably placed in
+    different balloons. The default is `false`: namespace has no
+    effect on choosing the balloon of this type.
+  - `PreferNewBalloons`: if `true`, prefer creating new balloons over
+    placing containers in existing balloons. This results in
+    preferring exclusive CPUs, as long as there are enough free
+    CPUs. The default is `false`: prefer filling and inflating
+    existing balloons over creating new ones.
+  - `AllocatorPriority` (0: High, 1: Normal, 2: Low, 3: None). CPU
+    allocator parameter, used when creating new or resizing existing
+    balloons. If there are balloon types with pre-created balloons
+    (`MinBalloons` > 0), balloons of the type with the highest
+    `AllocatorPriority` are created first.
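
As a rough sketch of how the per-type parameters above might combine, the fragment below defines two balloon types using only keys from the list. The type names (`latency`, `batch`), the namespace patterns, and the chosen values are made-up placeholders, and the CPU class names are assumed to be defined under `cpu.classes`.

```yaml
policy:
  Active: balloons
  balloons:
    BalloonTypes:
      # Hypothetical type: two balloons always exist, each with 2-8 CPUs.
      # New containers prefer a fresh balloon while free CPUs remain.
      - Name: "latency"
        Namespaces:
          - "prod-*"            # hypothetical namespace pattern
        MinBalloons: 2
        MinCPUs: 2
        MaxCPUs: 8
        PreferNewBalloons: true
        AllocatorPriority: 0    # 0: High
        CpuClass: turbo         # assumed to be defined in cpu.classes
      # Hypothetical type: no pre-created balloons, unlimited size,
      # one balloon per namespace where possible.
      - Name: "batch"
        Namespaces:
          - "batch-*"           # hypothetical namespace pattern
        MinBalloons: 0
        MaxCPUs: 0              # 0 means unlimited
        PreferPerNamespaceBalloon: true
        AllocatorPriority: 2    # 2: Low
        CpuClass: lowpower      # assumed to be defined in cpu.classes
```

With values like these, the `latency` balloons would be created before any `batch` balloon, because `latency` has both `MinBalloons` > 0 and the higher `AllocatorPriority`.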
+
+Related configuration parameters:
+- `policy.ReservedResources.CPU` specifies the (number of) CPUs in the
+  special `reserved` balloon. By default all containers in the
+  `kube-system` namespace are assigned to the reserved balloon.
+- `cpu.classes` defines CPU classes and their parameters (such as
+  `minFreq`, `maxFreq`, `uncoreMinFreq` and `uncoreMaxFreq`).
+
+### Example
+
 Example configuration that runs all pods in balloons of 1-4 CPUs.
 ```yaml
 policy:
   Active: balloons
   ReservedResources:
     CPU: 1
   balloons:
+    PinCPU: true
+    PinMemory: true
+    IdleCPUClass: lowpower
     BalloonTypes:
       - Name: "quad"
         MinCpus: 1
         MaxCPUs: 4
+        CPUClass: dynamic
         Namespaces:
           - "*"
+cpu:
+  classes:
+    lowpower:
+      minFreq: 800
+      maxFreq: 800
+    dynamic:
+      minFreq: 800
+      maxFreq: 3600
+    turbo:
+      minFreq: 3000
+      maxFreq: 3600
+      uncoreMinFreq: 2000
+      uncoreMaxFreq: 2400
 ```
 
 See the [sample configmap](/sample-configs/balloons-policy.cfg) for a
 complete example.
 
-### Debugging
+## Assigning a Container to a Balloon
+
+The balloon type of a container can be defined in pod annotations. In
+the example below, the first annotation sets the balloon type (`BT`)
+of a single container (`CONTAINER_NAME`). The last two annotations set
+the default balloon type for all containers in the pod.
+
+```yaml
+balloon.balloons.cri-resource-manager.intel.com/container.CONTAINER_NAME: BT
+balloon.balloons.cri-resource-manager.intel.com/pod: BT
+balloon.balloons.cri-resource-manager.intel.com: BT
+```
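
To show where these annotations would sit in practice, here is a minimal Pod manifest sketch. The pod name, container names, image, and the `other` balloon type are hypothetical; `quad` is the type from the example configuration. It assumes the container-specific annotation takes precedence over the pod-level one, which the text implies by calling the pod-level annotation a default.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                 # hypothetical pod name
  annotations:
    # Default balloon type for every container in this pod.
    balloon.balloons.cri-resource-manager.intel.com/pod: quad
    # Balloon type for the "sidecar" container only ("other" is a
    # hypothetical balloon type that would need to be configured).
    balloon.balloons.cri-resource-manager.intel.com/container.sidecar: other
spec:
  containers:
    - name: main
      image: registry.example.com/app:1.0      # placeholder image
      resources:
        requests:
          cpu: "2"
    - name: sidecar
      image: registry.example.com/sidecar:1.0  # placeholder image
      resources:
        requests:
          cpu: "500m"
```

Under these assumptions, `main` would land in a `quad` balloon and `sidecar` in an `other` balloon, each sized according to the CPU requests of the containers it holds.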
+
+If a pod has no annotations, its namespace is matched to the
+`Namespaces` of balloon types. The first matching balloon type is
+used.
+
+If the namespace does not match, the container is assigned to the
+special `default` balloon, which means reserved CPUs unless `MinCPUs`
+or `MaxCPUs` of the `default` balloon type are explicitly defined in
+the `BalloonTypes` configuration.
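
A minimal sketch of the explicit case described above, assuming the `default` type is declared like any other entry in `BalloonTypes`. The CPU counts are illustrative only.

```yaml
policy:
  Active: balloons
  balloons:
    BalloonTypes:
      # Explicit limits for the special "default" balloon type, so that
      # unmatched containers no longer fall back to the reserved CPUs.
      - Name: "default"
        MinCPUs: 1      # illustrative value
        MaxCPUs: 2      # illustrative value
```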
+
+## Metrics and Debugging
 
 In order to enable more verbose logging and metrics exporting from the
 balloons policy, enable instrumentation and policy debugging from the

go.mod

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ require (
 	github.com/davecgh/go-spew v1.1.1 // indirect
 	github.com/docker/distribution v2.8.1+incompatible // indirect
 	github.com/docker/go-units v0.4.0 // indirect
-	github.com/emicklei/go-restful v2.9.5+incompatible // indirect
+	github.com/emicklei/go-restful v2.16.0+incompatible // indirect
 	github.com/euank/go-kmsg-parser v2.0.0+incompatible // indirect
 	github.com/fsnotify/fsnotify v1.5.1 // indirect
 	github.com/go-logr/logr v1.2.3 // indirect

go.sum

Lines changed: 2 additions & 1 deletion
@@ -196,8 +196,9 @@ github.com/eapache/queue v1.1.0/go.mod h1:6eCeP0CKFpHLu8blIFXhExK/dRa7WDZfr6jVFP
 github.com/elazarl/goproxy v0.0.0-20180725130230-947c36da3153 h1:yUdfgN0XgIJw7foRItutHYUIhlcKzcSf5vDpdhQAKTc=
 github.com/elazarl/goproxy v0.0.0-20180725130230-947c36da3153/go.mod h1:/Zj4wYkgs4iZTTu3o/KG3Itv/qCCa8VVMlb3i9OVuzc=
 github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs=
-github.com/emicklei/go-restful v2.9.5+incompatible h1:spTtZBk5DYEvbxMVutUuTyh1Ao2r4iyvLdACqsl/Ljk=
 github.com/emicklei/go-restful v2.9.5+incompatible/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs=
+github.com/emicklei/go-restful v2.16.0+incompatible h1:rgqiKNjTnFQA6kkhFe16D8epTksy9HQ1MyrbDXSdYhM=
+github.com/emicklei/go-restful v2.16.0+incompatible/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs=
 github.com/envoyproxy/go-control-plane v0.9.9-0.20210217033140-668b12f5399d/go.mod h1:cXg6YxExXjJnVBQHBLXeUAgxn2UodCpnH306RInaBQk=
 github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c=
 github.com/euank/go-kmsg-parser v2.0.0+incompatible h1:cHD53+PLQuuQyLZeriD1V/esuG4MuU0Pjs5y6iknohY=
