Commit ea149df

keep device class for advertising extended resource, removed the alternatives
1 parent f29984f commit ea149df

1 file changed, +28 -78 lines changed
keps/sig-scheduling/5004-dra-extended-resource/README.md

@@ -76,11 +76,11 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

 Extended resource provides a simple, concise approach to describe resource
 capacity, and resource consumption. In constrast, Dynamic Resource
-Allocation (DRA) provides a more expressive, flexible, powerful approach, yet
+Allocation (DRA) provides a more expressive, flexible approach, yet
 more complicated, and harder to use.

 This KEP provides a solution to enable cluster administrators to advertise the
-dynamic resources (in `ResourceSlice`) as extended resource in `DeviceClass`,
+dynamic resources (in `ResourceSlice`) as extended resource via `DeviceClass`.
 and enables the application developers, and operators to continue using
 extended resource to request for such resources.

@@ -211,8 +211,8 @@ non-goals of this KEP.

 ### Goals

-* Introduce the ability for DRA to advertise extended resources, and for the
-scheduler to consider them for allocation.
+* Introduce the ability to advertise DRA resources as extended resources, and
+for the scheduler to consider them for allocation.

 * Enable application operators to use the existing extended resource request in
 pod spec to request for DRA resources.
@@ -224,16 +224,15 @@ non-goals of this KEP.
 * Device plugin API must not change. The existing device plugin drivers must
 continue working without change.

-* DRA driver API change must be minimal, if there is any. Core kubernetes
-(kube-scheduler, kubelet) is preferred over DRA driver for any change needed
-to support the feature.
+* DRA driver API must not change. Core kubernetes (kube-scheduler, kubelet) is
+preferred over DRA driver for any change needed to support the feature.

 ### Non-Goals

 * Minimize kubelet or kube-scheduler changes. The feature requires necessary
 changes in both scheduling and actuation.

-* Keep advertising pod.status.Capacity for extended resources backed by DRA.
+* Keep advertising `node.status.Capacity` for extended resources backed by DRA.
 It is used for extended resources backed by device plugin only.

 ## Proposal
@@ -242,51 +241,48 @@ The basic idea is the following:

 1. Introduce `extended resource backed by DRA`. It is like the current extended
 resource backed by device plugin, in that, it has a string name, and a
-discrete countable quantity. Its capacity is provided through dynamic
-resource `ResourceSlice`, its consumption is specified through pod's extended
+discrete countable quantity. Its capacity is provided through DRA
+`ResourceSlice`, its consumption is specified through pod's extended
 resource request.
 1. Introduce a field `ExtendedResourceName` to `DeviceClass` to allow cluster
 administrators to advertise certain class of devices as extended resource.
-1. Alternatively, introduce a field `ExtendedResourceName` to `ResourceSlice`
-and `Device` to allow cluster administrators to configure DRA device driver
-to advertise certain devices as extended resource.
 1. Introduce a special `ResourceClaim` object to keep track of device allocations
 for all extended resource requests backed by DRA for a pod. kube-scheduler
 uses DRA scheduling algorithm to fit pod's extended resource request to a
-node that advertises the extended resource in DRA `ResorceSlice` or traditional
-extended resources. When using DRA devices, it creates a special `ResourceClaim`
-for the pod with the allocation result recording which devices were picked. More
-details on this special `ResourceClaim` follow below. When using extended
-resources advertised for a node by device plugin, the existing resource
-tracking reserves them.
+node that advertises the extended resource in DRA `ResorceSlice` or extended
+resources backed by device plugin. When using DRA devices, it creates a
+special `ResourceClaim` for the pod with the allocation result recording
+which devices were picked. More details on this special `ResourceClaim`
+follow below. When using extended resources advertised for a node by device
+plugin, the existing resource tracking reserves them.
 1. kubelet asks DRA driver to prepare devices in the special `ResourceClaim`,
-and pass the devices to containers with the extended resource requests.
+and pass the devices to containers in a pod with the extended resource requests.

 Some quick clarifications around the basic concepts: extended resource backed by
 device plugin, extended resource backed by DRA, and dynamic resource.

 * extended resource backed by device plugin uses pod's
 spec.containers[].resources.requests to request for resources, it consumes the capacity
-from node's status.capacity. It is of type: string, int64
+from node's status.capacity. It is of type (string, int64)
 * dynamic resource uses `ResourceClaim` to request for resources, and
 `ResourceSlice` to provide resource capacity. A pod asks for resources through
 resource claim requests in pod's spec.resources.claims. Dynamic resource type
 is described in resource slice, simply speaking, it is a list of devices, with
 each device being described as structured parameters.
 * extended resource backend by DRA is a combination of the two above. It uses pods'
 spec.containers[].resources.requests to request for resources, and uses
-`ResourceSlice` to provide resource capacity. Hence, it is of type: string,
-int64 on the consumption side, and list of devices with a common
+`ResourceSlice` to provide resource capacity. Hence, it is of type (string, int64)
+on the consumption side, and list of devices with a common
 `ExtendedResourceName` on the capacity side.

 With these additions in place, the DRA devices can be consumed by extended resource
 requests, or by DRA resouce claims. The scheduler has everything it needs to support
 the dynamic allocation of devices to requests made through extended resource and
 resource claims. No static partition of resources between extended resources and
 resource claims is needed. The kubelet and DRA driver has everything they need
-to admit and pass the allocated devices to the pod to run.
+to admit a pod and pass the allocated devices to the containers in the pod to run.

-Note the following cluster setup requirement and constraint:
+Note the following cluster setup configuration and constraint:

 * One node in cluster has a extended resource backed by DRA, and another node in the
 cluster has the same named extended resource backend by device plugin.
@@ -339,53 +335,6 @@ type DeviceClassSpec struct {
 }
 ```

-### Resource Slice API (Alternative to Device Class API)
-The exact set of proposed API changes on Resource Slice can be seen below:
-```go
-// ResourceSliceSpec contains the information published by the driver in one ResourceSlice.
-type ResourceSliceSpec struct {
-...
-
-// The extended resource name for all the devices in the ResourceSlice
-// advertised as
-//
-// +optional
-ExtendedResourceName *string
-}
-
-// Device represents one individual hardware instance that can be selected based
-// on its attributes. Besides the name, exactly one field must be set.
-// +k8s:deepcopy-gen=true
-type Device struct {
-// Name is unique identifier among all devices managed by
-// the driver in the pool. It must be a DNS label.
-//
-// +required
-Name string `json:"name"`
-...
-
-// ExtendedResourceName is the extended resource name
-// the device is advertised as. It must be a DNS label.
-// It overrides the ExtendedResourceName at ResourceSlice if both are
-// present.
-//
-// +optional
-ExtendedResourceName *string
-}
-```
-
-The devices can be advertised with an extended resource name. The extended
-resource name can be specified on each individual device. Different
-devices can be advertised as different extended resource name, or not
-advertised as extended resource at all.
-
-Alternatively, the extended resource name can be specified at the
-`ResourceSlice` level, then all the devices in the resource slice are
-advertised as the given extended resource name. If a device has a different
-extended resource name than that given in the `ResoureSlice`, the device's
-extended resource name is used for that device.
-
-
 ### Resource Claim API

 A special resource claim object is created to keep track of device allocations for
@@ -415,8 +364,8 @@ garbage collector.
 preBind phase. The in-memory one in the assumed cache is created earlier
 during Reserve phase.
 * It is *deleted*
-* together with the owning pod's deletion.
-* by the scheduler dynamic resource plugin during unReserve phase.
+* either together with the owning pod's deletion.
+* or by the scheduler dynamic resource plugin during unReserve phase.
 * It is *read* by the kubelet DRA device driver to prepare the devices listed
 therein when preparing to run the pod.

@@ -446,7 +395,8 @@ then the name of the `DeviceRequest` is "c0-e2".

 A new field `extendedResourceClaimStatus` is added to Pod's status to track
 the special resouceclaim object created for the extended resource requests
-in the pod.
+in the pod. This is needed for kublet to pass the devices allocated by driver
+to the containers in the pod.

 ```go
 // PodExtendedResourceClaimStatus is stored in the PodStatus for each extended
@@ -506,11 +456,11 @@ status:
 - names:
 - container-name
 - foo.domain/bar
-- c1-e2
+- c0-e2
 resourceClaimName: ccc-gpu-57999b9c4c-vpq68-gpu-8s27z
 ```
-where `deviceRequest` name is "c1-e2", and container-name is the 2nd container
-in the pod, foo.domain/bar is the 3rd extended resource in the container.
+where `deviceRequest` name is "c0-e2", and container-name is the first container
+in the pod, foo.domain/bar is the 3rd extended resource in the container's requests.

 Note the validations for extendedResourceClaimStatus are different from the
 validations for resourceClaimStatuses.
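
The last hunk changes the example request name from "c1-e2" to "c0-e2" and describes it as the first container's 3rd extended resource, which suggests both indices are zero-based. A minimal sketch of that naming convention follows, assuming the name is simply `c<container index>-e<resource index>`. The helper `deviceRequestName` is hypothetical and is not taken from the KEP or from Kubernetes source.

```go
package main

import "fmt"

// deviceRequestName is a hypothetical helper (not part of the KEP or of
// Kubernetes). It assumes the DeviceRequest name is built from the zero-based
// index of the container in the pod and the zero-based index of the extended
// resource within that container's requests.
func deviceRequestName(containerIndex, resourceIndex int) string {
	return fmt.Sprintf("c%d-e%d", containerIndex, resourceIndex)
}

func main() {
	// First container in the pod, third extended resource in its requests.
	fmt.Println(deviceRequestName(0, 2)) // prints "c0-e2"
}
```

Under this reading, the old example's "c1-e2" would have pointed at the second container, which matches the wording the commit removes.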
