Commit 41ca602

keep device class for advertising extended resource, removed the alternatives
1 parent f29984f commit 41ca602

1 file changed: +28 -79 lines changed

keps/sig-scheduling/5004-dra-extended-resource/README.md

@@ -9,7 +9,6 @@
 - [Proposal](#proposal)
 - [Design Details](#design-details)
 - [Device Class API](#device-class-api)
-- [Resource Slice API (Alternative to Device Class API)](#resource-slice-api-alternative-to-device-class-api)
 - [Resource Claim API](#resource-claim-api)
 - [Pod API](#pod-api)
 - [Scheduling for Extended Resource backed by DRA](#scheduling-for-extended-resource-backed-by-dra)
@@ -76,11 +75,11 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
 
 Extended resource provides a simple, concise approach to describe resource
 capacity, and resource consumption. In contrast, Dynamic Resource
-Allocation (DRA) provides a more expressive, flexible, powerful approach, yet
+Allocation (DRA) provides a more expressive, flexible approach, yet
 more complicated, and harder to use.
 
 This KEP provides a solution to enable cluster administrators to advertise the
-dynamic resources (in `ResourceSlice`) as extended resource in `DeviceClass`,
+dynamic resources (in `ResourceSlice`) as extended resource via `DeviceClass`,
 and enables the application developers, and operators to continue using
 extended resource to request for such resources.

@@ -211,8 +210,8 @@ non-goals of this KEP.
 
 ### Goals
 
-* Introduce the ability for DRA to advertise extended resources, and for the
-scheduler to consider them for allocation.
+* Introduce the ability to advertise DRA resources as extended resources, and
+for the scheduler to consider them for allocation.
 
 * Enable application operators to use the existing extended resource request in
 pod spec to request for DRA resources.
@@ -224,16 +223,15 @@ non-goals of this KEP.
 * Device plugin API must not change. The existing device plugin drivers must
 continue working without change.
 
-* DRA driver API change must be minimal, if there is any. Core kubernetes
-(kube-scheduler, kubelet) is preferred over DRA driver for any change needed
-to support the feature.
+* DRA driver API must not change. Core kubernetes (kube-scheduler, kubelet) is
+preferred over DRA driver for any change needed to support the feature.
 
 ### Non-Goals
 
 * Minimize kubelet or kube-scheduler changes. The feature requires necessary
 changes in both scheduling and actuation.
 
-* Keep advertising pod.status.Capacity for extended resources backed by DRA.
+* Keep advertising `node.status.Capacity` for extended resources backed by DRA.
 It is used for extended resources backed by device plugin only.
 
 ## Proposal
@@ -242,51 +240,48 @@ The basic idea is the following:
 
 1. Introduce `extended resource backed by DRA`. It is like the current extended
 resource backed by device plugin, in that, it has a string name, and a
-discrete countable quantity. Its capacity is provided through dynamic
-resource `ResourceSlice`, its consumption is specified through pod's extended
+discrete countable quantity. Its capacity is provided through DRA
+`ResourceSlice`, its consumption is specified through pod's extended
 resource request.
 1. Introduce a field `ExtendedResourceName` to `DeviceClass` to allow cluster
 administrators to advertise certain class of devices as extended resource.
-1. Alternatively, introduce a field `ExtendedResourceName` to `ResourceSlice`
-and `Device` to allow cluster administrators to configure DRA device driver
-to advertise certain devices as extended resource.
 1. Introduce a special `ResourceClaim` object to keep track of device allocations
 for all extended resource requests backed by DRA for a pod. kube-scheduler
 uses DRA scheduling algorithm to fit pod's extended resource request to a
-node that advertises the extended resource in DRA `ResourceSlice` or traditional
-extended resources. When using DRA devices, it creates a special `ResourceClaim`
-for the pod with the allocation result recording which devices were picked. More
-details on this special `ResourceClaim` follow below. When using extended
-resources advertised for a node by device plugin, the existing resource
-tracking reserves them.
+node that advertises the extended resource in DRA `ResourceSlice` or extended
+resources backed by device plugin. When using DRA devices, it creates a
+special `ResourceClaim` for the pod with the allocation result recording
+which devices were picked. More details on this special `ResourceClaim`
+follow below. When using extended resources advertised for a node by device
+plugin, the existing resource tracking reserves them.
 1. kubelet asks DRA driver to prepare devices in the special `ResourceClaim`,
-and pass the devices to containers with the extended resource requests.
+and pass the devices to containers in a pod with the extended resource requests.
 
 Some quick clarifications around the basic concepts: extended resource backed by
 device plugin, extended resource backed by DRA, and dynamic resource.
 
 * extended resource backed by device plugin uses pod's
 spec.containers[].resources.requests to request for resources, it consumes the capacity
-from node's status.capacity. It is of type: string, int64
+from node's status.capacity. It is of type (string, int64)
 * dynamic resource uses `ResourceClaim` to request for resources, and
 `ResourceSlice` to provide resource capacity. A pod asks for resources through
 resource claim requests in pod's spec.resources.claims. Dynamic resource type
 is described in resource slice, simply speaking, it is a list of devices, with
 each device being described as structured parameters.
 * extended resource backed by DRA is a combination of the two above. It uses pods'
 spec.containers[].resources.requests to request for resources, and uses
-`ResourceSlice` to provide resource capacity. Hence, it is of type: string,
-int64 on the consumption side, and list of devices with a common
+`ResourceSlice` to provide resource capacity. Hence, it is of type (string, int64)
+on the consumption side, and list of devices with a common
 `ExtendedResourceName` on the capacity side.
 
 With these additions in place, the DRA devices can be consumed by extended resource
 requests, or by DRA resource claims. The scheduler has everything it needs to support
 the dynamic allocation of devices to requests made through extended resource and
 resource claims. No static partition of resources between extended resources and
 resource claims is needed. The kubelet and DRA driver have everything they need
-to admit and pass the allocated devices to the pod to run.
+to admit a pod and pass the allocated devices to the containers in the pod to run.
 
-Note the following cluster setup requirement and constraint:
+Note the following cluster setup configuration and constraint:
 
 * One node in cluster has an extended resource backed by DRA, and another node in the
 cluster has the same named extended resource backed by device plugin.
@@ -339,53 +334,6 @@ type DeviceClassSpec struct {
 }
 ```
 
-### Resource Slice API (Alternative to Device Class API)
-The exact set of proposed API changes on Resource Slice can be seen below:
-```go
-// ResourceSliceSpec contains the information published by the driver in one ResourceSlice.
-type ResourceSliceSpec struct {
-    ...
-
-    // The extended resource name for all the devices in the ResourceSlice
-    // advertised as
-    //
-    // +optional
-    ExtendedResourceName *string
-}
-
-// Device represents one individual hardware instance that can be selected based
-// on its attributes. Besides the name, exactly one field must be set.
-// +k8s:deepcopy-gen=true
-type Device struct {
-    // Name is unique identifier among all devices managed by
-    // the driver in the pool. It must be a DNS label.
-    //
-    // +required
-    Name string `json:"name"`
-    ...
-
-    // ExtendedResourceName is the extended resource name
-    // the device is advertised as. It must be a DNS label.
-    // It overrides the ExtendedResourceName at ResourceSlice if both are
-    // present.
-    //
-    // +optional
-    ExtendedResourceName *string
-}
-```
-
-The devices can be advertised with an extended resource name. The extended
-resource name can be specified on each individual device. Different
-devices can be advertised as different extended resource name, or not
-advertised as extended resource at all.
-
-Alternatively, the extended resource name can be specified at the
-`ResourceSlice` level, then all the devices in the resource slice are
-advertised as the given extended resource name. If a device has a different
-extended resource name than that given in the `ResourceSlice`, the device's
-extended resource name is used for that device.
-
-
 ### Resource Claim API
 
 A special resource claim object is created to keep track of device allocations for
@@ -415,8 +363,8 @@ garbage collector.
 preBind phase. The in-memory one in the assumed cache is created earlier
 during Reserve phase.
 * It is *deleted*
-* together with the owning pod's deletion.
-* by the scheduler dynamic resource plugin during unReserve phase.
+* either together with the owning pod's deletion.
+* or by the scheduler dynamic resource plugin during unReserve phase.
 * It is *read* by the kubelet DRA device driver to prepare the devices listed
 therein when preparing to run the pod.

@@ -446,7 +394,8 @@ then the name of the `DeviceRequest` is "c0-e2".
 
 A new field `extendedResourceClaimStatus` is added to Pod's status to track
 the special resource claim object created for the extended resource requests
-in the pod.
+in the pod. This is needed for kubelet to pass the devices allocated by driver
+to the containers in the pod.
 
 ```go
 // PodExtendedResourceClaimStatus is stored in the PodStatus for each extended
@@ -506,11 +455,11 @@ status:
 - names:
 - container-name
 - foo.domain/bar
-- c1-e2
+- c0-e2
 resourceClaimName: ccc-gpu-57999b9c4c-vpq68-gpu-8s27z
 ```
-where `deviceRequest` name is "c1-e2", and container-name is the 2nd container
-in the pod, foo.domain/bar is the 3rd extended resource in the container.
+where `deviceRequest` name is "c0-e2", and container-name is the first container
+in the pod, foo.domain/bar is the 3rd extended resource in the container's requests.
 
 Note the validations for extendedResourceClaimStatus are different from the
 validations for resourceClaimStatuses.
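
The corrected example in the last hunk implies that the `deviceRequest` name combines a zero-based container index with a zero-based index into that container's extended resource requests ("c0-e2" for the first container's third extended resource). A minimal Go sketch of that naming, assuming this reading is right: the helper `deviceRequestName` is hypothetical and is not taken from the KEP or from the Kubernetes code base.

```go
package main

import "fmt"

// deviceRequestName is a hypothetical helper (not part of the KEP).
// It assumes both indices are zero-based: containerIndex is the position of
// the container in the pod spec, and resourceIndex is the position of the
// extended resource within that container's requests.
func deviceRequestName(containerIndex, resourceIndex int) string {
	return fmt.Sprintf("c%d-e%d", containerIndex, resourceIndex)
}

func main() {
	// First container, third extended resource, as in the status example.
	fmt.Println(deviceRequestName(0, 2)) // prints "c0-e2"
}
```

Under the same assumption, the pre-commit value "c1-e2" would have pointed at the second container, which is why the commit changes both the name and the accompanying sentence.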
