* Recover updates to docs and release notes
* Add more features to release notes
Change-Id: I1783a3e890da9a0599b83452e77d956e58d83ec6
* Better wording
Change-Id: Idb8737e5ba5f55863438d601743a15e5967f6ea1
docs/concepts/cluster_queue.md
40 additions & 29 deletions
@@ -1,9 +1,9 @@
 # Cluster Queue
 
 A ClusterQueue is a cluster-scoped object that governs a pool of resources
-such as CPU, memory and hardware accelerators. A `ClusterQueue` defines:
-- The [resource _flavors_](#resourceflavor-object) that it manages, with usage
-  limits and order of consumption.
+such as CPU, memory, and hardware accelerators. A ClusterQueue defines:
+- The [resource _flavors_](#resourceflavor-object) that the ClusterQueue manages,
+  with usage limits and order of consumption.
 - Fair sharing rules across the tenants of the cluster.
 
 Only [cluster administrators](/docs/tasks#batch-administrator) should create `ClusterQueue` objects.
@@ -39,29 +39,29 @@ You can specify the quota as a [quantity](https://kubernetes.io/docs/reference/k
 ## Resources
 
 In a ClusterQueue, you can define quotas for multiple [compute resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-types)
-(cpu, memory, GPUs, etc.).
+(CPU, memory, GPUs, etc.).
 
-For each resource, you can define quotas for multiple _flavors_. A
-flavor represents different variations of a resource. The variations can be
-defined in a [ResourceFlavor object](#resourceflavor-object).
+For each resource, you can define quotas for multiple _flavors_.
+Flavors represent different variations of a resource (for example, different GPU
+models). A flavor is defined using a [ResourceFlavor object](#resourceflavor-object).
 
-In a process called [admission](.#admission), Kueue assigns
-[Workload pod sets](workload.md#pod-sets) a flavor for each resource it requests.
+In a process called [admission](.#admission), Kueue assigns to the
+[Workload pod sets](workload.md#pod-sets) a flavor for each resource the pod set
+requests.
 Kueue assigns the first flavor in the ClusterQueue's `.spec.resources[*].flavors`
 list that has enough unused `min` quota in the ClusterQueue or the
 ClusterQueue's [cohort](#cohort).
 
 ### Codependent resources
 
-It is possible that multiple resources are tied to the same flavors. This is
-typical for `cpu` and `memory`, where the flavors are generally tied to a
-machine family or availability guarantees.
+It is possible that multiple resources in a ClusterQueue have the same flavors.
+This is typical for `cpu` and `memory`, where the flavors are generally tied to
+a machine family or VM availability policies. When two or more resources in a
+ClusterQueue match their flavors, they are said to be codependent resources.
 
-If this is the case, the resources in the ClusterQueue must list the same
-flavors in the same order. When two or more resources match their flavors,
-they are said to be codependent. During admission, for each pod set in a
-Workload, Kueue assigns the same flavor to the codependent resources that the
-pod set requests.
+To manage codependent resources, you should list the flavors in the ClusterQueue
+resources in the same order. During admission, for each pod set in a Workload,
+Kueue assigns the same flavor to the codependent resources that the pod set requests.
 
 An example of a ClusterQueue with codependent resources looks like the following:
 
@@ -150,8 +150,8 @@ Resources in a cluster are typically not homogeneous. Resources could differ in:
 - architecture (ex: x86 vs ARM CPUs)
 - brands and models (ex: Radeon 7000 vs Nvidia A100 vs T4 GPUs)
 
-A ResourceFlavor is an object that represents these variations and allows you
-to associate them with node labels and taints.
+A ResourceFlavor is an object that represents these resource variations and
+allows you to associate them with node labels and taints.
 
 **Note**: If your cluster is homogeneous, you can use an [empty ResourceFlavor](#empty-resourceflavor)
 instead of adding labels to custom ResourceFlavors.
@@ -197,8 +197,8 @@ steps:
 
 For example, for a [batch/v1.Job](https://kubernetes.io/docs/concepts/workloads/controllers/job/),
 Kueue adds the labels to the `.spec.template.spec.nodeSelector` field. This
-guarantees that the workload Pods run on the nodes associated to the flavor
-that Kueue decided that the workload should use.
+guarantees that the Workload's Pods can only be scheduled on the nodes
+targeted by the flavor that Kueue assigned to the Workload.
 
 ### ResourceFlavor taints
 
@@ -208,7 +208,7 @@ with taints.
 Taints on the ResourceFlavor work similarly to [node taints](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/).
 For Kueue to admit a workload to use the ResourceFlavor, the PodSpecs in the
 workload should have a toleration for it. As opposed to the behavior for
-[ResourceFlavor labels](#resourceflavor-labels), Kueue will not add tolerations
+[ResourceFlavor labels](#resourceflavor-labels), Kueue does not add tolerations
 for the flavor taints.
 
 ### Empty ResourceFlavor
@@ -238,16 +238,27 @@ ClusterQueue.
 
 ### Flavors and borrowing semantics
 
-When borrowing, Kueue satisfies the following admission semantics:
+When a ClusterQueue is part of a cohort, Kueue satisfies the following admission
+semantics:
 
 - When assigning flavors, Kueue goes through the list of flavors in the
   ClusterQueue's `.spec.resources[*].flavors`. For each flavor, Kueue attempts
-  to fit a Workload's pod set using the `min` quota of the ClusterQueue or the
-  unused `min` quota of other ClusterQueues in the cohort, up to the `max` quota
-  of the ClusterQueue. If the workload doesn't fit, Kueue proceeds evaluating the next
-  flavor in the list.
-- A ClusterQueue can only borrow quota of flavors it defines and it can only
-  borrow quota for one flavor.
+  to fit a Workload's pod set according to the quota defined in the
+  ClusterQueue for the flavor and the unused quota in the cohort.
+  If the workload doesn't fit, Kueue evaluates the next flavor in the list.
+- A Workload's pod set resource fits in a flavor defined for a ClusterQueue
+  resource if the sum of requests for the resource:
+  1. Is less than or equal to the unused `.quota.min` for the flavor in the
+     ClusterQueue; or
+  2. Is less than or equal to the sum of unused `.quota.min` for the flavor in
+     the ClusterQueues in the cohort, and
+  3. Is less than or equal to the unused `.quota.max` for the flavor in the
+     ClusterQueue.
+  In Kueue, when (2) and (3) are satisfied, but not (1), this is called
+  _borrowing quota_.
+- A ClusterQueue can only borrow quota for flavors that the ClusterQueue defines.
+- For each pod set resource in a Workload, a ClusterQueue can only borrow quota
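
Reviewer note: the fit rule added in the last hunk can be read as "(1), or both (2) and (3)", which matches the sentence defining _borrowing quota_. The Python sketch below illustrates that reading only; it is not Kueue's implementation. The `FlavorQuota` type, its field names, the helper functions, and the flavor names `spot` and `on-demand` are all hypothetical, and the unused quantities are assumed to be precomputed numbers for a single resource.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class FlavorQuota:
    """Hypothetical snapshot of one flavor's quota for a single resource."""
    name: str
    unused_min: float         # unused `.quota.min` in this ClusterQueue
    unused_max: float         # unused `.quota.max` in this ClusterQueue
    cohort_unused_min: float  # sum of unused `.quota.min` across the cohort


def fits(request: float, flavor: FlavorQuota) -> Tuple[bool, bool]:
    """Return (fits, borrows) for the summed request of one pod set resource."""
    within_own_min = request <= flavor.unused_min            # condition (1)
    within_cohort_min = request <= flavor.cohort_unused_min  # condition (2)
    within_own_max = request <= flavor.unused_max            # condition (3)
    ok = within_own_min or (within_cohort_min and within_own_max)
    # Borrowing: (2) and (3) hold, but (1) does not.
    return ok, ok and not within_own_min


def assign_flavor(
    request: float, flavors: Sequence[FlavorQuota]
) -> Optional[Tuple[str, bool]]:
    """Walk the flavors in list order and return the first one that fits."""
    for flavor in flavors:
        ok, borrows = fits(request, flavor)
        if ok:
            return flavor.name, borrows
    return None  # no flavor fits, so the pod set is not assigned a flavor


# Illustrative numbers only.
flavors = [
    FlavorQuota("spot", unused_min=2, unused_max=6, cohort_unused_min=4),
    FlavorQuota("on-demand", unused_min=8, unused_max=8, cohort_unused_min=8),
]
print(assign_flavor(1, flavors))  # fits in spot without borrowing
print(assign_flavor(3, flavors))  # fits in spot by borrowing
print(assign_flavor(5, flavors))  # spot does not fit, falls through to on-demand
print(assign_flavor(9, flavors))  # fits nowhere
```

If the intended grouping is instead "((1) or (2)) and (3)", the docs text may benefit from explicit parentheses, since the two readings differ when a request fits the unused `min` but exceeds the unused `max`.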