@@ -76,11 +76,11 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

  Extended resource provides a simple, concise approach to describe resource
  capacity, and resource consumption. In contrast, Dynamic Resource
- Allocation (DRA) provides a more expressive, flexible, powerful approach, yet
+ Allocation (DRA) provides a more expressive, flexible approach, yet
  more complicated, and harder to use.

  This KEP provides a solution to enable cluster administrators to advertise the
- dynamic resources (in `ResourceSlice`) as extended resource in `DeviceClass`,
+ dynamic resources (in `ResourceSlice`) as extended resource via `DeviceClass`,
  and enables the application developers, and operators to continue using
  extended resource to request for such resources.

@@ -211,8 +211,8 @@ non-goals of this KEP.

  ### Goals

- * Introduce the ability for DRA to advertise extended resources, and for the
- scheduler to consider them for allocation.
+ * Introduce the ability to advertise DRA resources as extended resources, and
+ for the scheduler to consider them for allocation.

  * Enable application operators to use the existing extended resource request in
  pod spec to request for DRA resources.
@@ -224,16 +224,15 @@ non-goals of this KEP.
  * Device plugin API must not change. The existing device plugin drivers must
  continue working without change.

- * DRA driver API change must be minimal, if there is any. Core kubernetes
- (kube-scheduler, kubelet) is preferred over DRA driver for any change needed
- to support the feature.
+ * DRA driver API must not change. Core Kubernetes (kube-scheduler, kubelet) is
+ preferred over DRA driver for any change needed to support the feature.

  ### Non-Goals

  * Minimize kubelet or kube-scheduler changes. The feature requires necessary
  changes in both scheduling and actuation.

- * Keep advertising pod.status.Capacity for extended resources backed by DRA.
+ * Keep advertising `node.status.Capacity` for extended resources backed by DRA.
  It is used for extended resources backed by device plugin only.

  ## Proposal
@@ -242,51 +241,48 @@ The basic idea is the following:

  1. Introduce `extended resource backed by DRA`. It is like the current extended
  resource backed by device plugin, in that, it has a string name, and a
- discrete countable quantity. Its capacity is provided through dynamic
- resource `ResourceSlice`, its consumption is specified through pod's extended
+ discrete countable quantity. Its capacity is provided through DRA
+ `ResourceSlice`, its consumption is specified through pod's extended
  resource request.
  1. Introduce a field `ExtendedResourceName` to `DeviceClass` to allow cluster
  administrators to advertise certain class of devices as extended resource.
- 1. Alternatively, introduce a field `ExtendedResourceName` to `ResourceSlice`
- and `Device` to allow cluster administrators to configure DRA device driver
- to advertise certain devices as extended resource.
  1. Introduce a special `ResourceClaim` object to keep track of device allocations
  for all extended resource requests backed by DRA for a pod. kube-scheduler
  uses DRA scheduling algorithm to fit pod's extended resource request to a
- node that advertises the extended resource in DRA `ResorceSlice` or traditional
- extended resources. When using DRA devices, it creates a special `ResourceClaim`
- for the pod with the allocation result recording which devices were picked. More
- details on this special `ResourceClaim` follow below. When using extended
- resources advertised for a node by device plugin, the existing resource
- tracking reserves them.
+ node that advertises the extended resource in DRA `ResourceSlice` or extended
+ resources backed by device plugin. When using DRA devices, it creates a
+ special `ResourceClaim` for the pod with the allocation result recording
+ which devices were picked. More details on this special `ResourceClaim`
+ follow below. When using extended resources advertised for a node by device
+ plugin, the existing resource tracking reserves them.
  1. kubelet asks DRA driver to prepare devices in the special `ResourceClaim`,
- and pass the devices to containers with the extended resource requests.
+ and passes the devices to containers in a pod with the extended resource requests.

  Some quick clarifications around the basic concepts: extended resource backed by
  device plugin, extended resource backed by DRA, and dynamic resource.

  * extended resource backed by device plugin uses pod's
  spec.containers[].resources.requests to request for resources, it consumes the capacity
- from node's status.capacity. It is of type : string, int64
+ from node's status.capacity. It is of type (string, int64)
  * dynamic resource uses `ResourceClaim` to request for resources, and
  `ResourceSlice` to provide resource capacity. A pod asks for resources through
  resource claim requests in pod's spec.resources.claims. Dynamic resource type
  is described in resource slice, simply speaking, it is a list of devices, with
  each device being described as structured parameters.
  * extended resource backed by DRA is a combination of the two above. It uses pod's
  spec.containers[].resources.requests to request for resources, and uses
- `ResourceSlice` to provide resource capacity. Hence, it is of type : string,
- int64 on the consumption side, and list of devices with a common
+ `ResourceSlice` to provide resource capacity. Hence, it is of type (string, int64)
+ on the consumption side, and list of devices with a common
  `ExtendedResourceName` on the capacity side.

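As an illustration of the two sides described above, the following Go sketch treats a request as a (string, int64) pair and the available capacity as the number of unallocated devices that share one extended resource name. The `Device` struct and the helpers `availableByName` and `fits` are hypothetical simplifications made up for this example. In the proposal the name is set on `DeviceClass`, and the actual matching is done by the DRA scheduling algorithm, which this sketch does not model.

```go
package main

import "fmt"

// Device is a hypothetical, flattened view of one DRA device after it has
// been matched to a DeviceClass that carries an extended resource name.
type Device struct {
	Name                 string
	ExtendedResourceName string // empty: not advertised as an extended resource
	Allocated            bool
}

// availableByName counts unallocated devices per extended resource name.
func availableByName(devices []Device) map[string]int64 {
	available := make(map[string]int64)
	for _, d := range devices {
		if d.ExtendedResourceName == "" || d.Allocated {
			continue
		}
		available[d.ExtendedResourceName]++
	}
	return available
}

// fits reports whether every (name, quantity) request can be satisfied
// by the available device counts.
func fits(requests, available map[string]int64) bool {
	for name, quantity := range requests {
		if available[name] < quantity {
			return false
		}
	}
	return true
}

func main() {
	devices := []Device{
		{Name: "gpu-0", ExtendedResourceName: "foo.domain/bar"},
		{Name: "gpu-1", ExtendedResourceName: "foo.domain/bar", Allocated: true},
		{Name: "nic-0"},
	}
	available := availableByName(devices)
	fmt.Println(available["foo.domain/bar"])                            // 1
	fmt.Println(fits(map[string]int64{"foo.domain/bar": 1}, available)) // true
	fmt.Println(fits(map[string]int64{"foo.domain/bar": 2}, available)) // false
}
```

Counting devices per name is only meant to show why no static partition is needed: the same device list can serve both request styles, and an allocation made through either one simply reduces what is left for the other.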
  With these additions in place, the DRA devices can be consumed by extended resource
  requests, or by DRA resource claims. The scheduler has everything it needs to support
  the dynamic allocation of devices to requests made through extended resource and
  resource claims. No static partition of resources between extended resources and
  resource claims is needed. The kubelet and DRA driver have everything they need
- to admit and pass the allocated devices to the pod to run.
+ to admit a pod and pass the allocated devices to the containers in the pod to run.

- Note the following cluster setup requirement and constraint:
+ Note the following cluster setup configuration and constraint:

  * One node in cluster has an extended resource backed by DRA, and another node in the
  cluster has the same named extended resource backed by device plugin.
@@ -339,53 +335,6 @@ type DeviceClassSpec struct {
  }
  ```

- ### Resource Slice API (Alternative to Device Class API)
- The exact set of proposed API changes on Resource Slice can be seen below:
- ```go
- // ResourceSliceSpec contains the information published by the driver in one ResourceSlice.
- type ResourceSliceSpec struct {
- ...
-
- // The extended resource name for all the devices in the ResourceSlice
- // advertised as
- //
- // +optional
- ExtendedResourceName *string
- }
-
- // Device represents one individual hardware instance that can be selected based
- // on its attributes. Besides the name, exactly one field must be set.
- // +k8s:deepcopy-gen=true
- type Device struct {
- // Name is unique identifier among all devices managed by
- // the driver in the pool. It must be a DNS label.
- //
- // +required
- Name string `json:"name"`
- ...
-
- // ExtendedResourceName is the extended resource name
- // the device is advertised as. It must be a DNS label.
- // It overrides the ExtendedResourceName at ResourceSlice if both are
- // present.
- //
- // +optional
- ExtendedResourceName *string
- }
- ```
-
- The devices can be advertised with an extended resource name. The extended
- resource name can be specified on each individual device. Different
- devices can be advertised as different extended resource name, or not
- advertised as extended resource at all.
-
- Alternatively, the extended resource name can be specified at the
- `ResourceSlice` level, then all the devices in the resource slice are
- advertised as the given extended resource name. If a device has a different
- extended resource name than that given in the `ResoureSlice`, the device's
- extended resource name is used for that device.
-
-
  ### Resource Claim API

  A special resource claim object is created to keep track of device allocations for
@@ -415,8 +364,8 @@ garbage collector.
  preBind phase. The in-memory one in the assumed cache is created earlier
  during Reserve phase.
  * It is *deleted*
- * together with the owning pod's deletion.
- * by the scheduler dynamic resource plugin during unReserve phase.
+ * either together with the owning pod's deletion.
+ * or by the scheduler dynamic resource plugin during unReserve phase.
  * It is *read* by the kubelet DRA device driver to prepare the devices listed
  therein when preparing to run the pod.

@@ -446,7 +395,8 @@ then the name of the `DeviceRequest` is "c0-e2".

  A new field `extendedResourceClaimStatus` is added to Pod's status to track
  the special `ResourceClaim` object created for the extended resource requests
- in the pod.
+ in the pod. This is needed for kubelet to pass the devices allocated by the driver
+ to the containers in the pod.

  ```go
  // PodExtendedResourceClaimStatus is stored in the PodStatus for each extended
@@ -506,11 +456,11 @@ status:
  - names:
  - container-name
  - foo.domain/bar
- - c1-e2
+ - c0-e2
  resourceClaimName: ccc-gpu-57999b9c4c-vpq68-gpu-8s27z
  ```
- where `deviceRequest` name is "c1-e2", and container-name is the 2nd container
- in the pod, foo.domain/bar is the 3rd extended resource in the container.
+ where `deviceRequest` name is "c0-e2", and container-name is the first container
+ in the pod, foo.domain/bar is the 3rd extended resource in the container's requests.

  Note the validations for extendedResourceClaimStatus are different from the
  validations for resourceClaimStatuses.
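
The request naming used in the example above ("c0-e2" for the first container and the third extended resource in its requests) can be sketched as a small helper. This is inferred from that single example only: the function name `deviceRequestName` is hypothetical, and both indexes are assumed to be zero-based.

```go
package main

import "fmt"

// deviceRequestName builds a request name of the form "c<i>-e<j>", where i is
// the container's zero-based index in the pod and j is the zero-based index of
// the extended resource within that container's requests.
// Hypothetical helper; the format is inferred from the "c0-e2" example.
func deviceRequestName(containerIndex, resourceIndex int) string {
	return fmt.Sprintf("c%d-e%d", containerIndex, resourceIndex)
}

func main() {
	// First container, third extended resource in its requests.
	fmt.Println(deviceRequestName(0, 2)) // c0-e2
}
```

Container requests are a map in the Kubernetes API, so a real implementation would need a stable ordering of resource names for the second index to be deterministic. How that ordering is defined is not shown in this excerpt.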