Commit c01d38c

fix: (GH-#340) Add RuntimeClass for GPU sharing (#360)
Signed-off-by: Saurabh Kumar Singh <[email protected]>
1 parent 548af88 commit c01d38c

2 files changed: +8 -0 lines changed

docs/gpu-sharing/README.md

Lines changed: 5 additions & 0 deletions
@@ -18,6 +18,11 @@ GPU sharing is disabled by default. To enable it, add the following flag to the
 --set "global.gpuSharing=true"
 ```
 
+### RuntimeClass Requirement
+KAI Scheduler requires the use of a specific RuntimeClass for GPU sharing. The recommended RuntimeClass is `nvidia`.
+
+KAI explicitly sets `runtimeClassName: "nvidia"` in the resource reservation pod spec. Ensure that your cluster has the `nvidia` RuntimeClass configured — this is typically provided by the NVIDIA device plugin.
+
 ### GPU Sharing Pod
 To submit a pod that can share a GPU device, run this command:
 ```

pkg/binder/binding/resourcereservation/resource_reservation.go

Lines changed: 3 additions & 0 deletions
@@ -42,6 +42,8 @@ const (
 	unknownGpuIndicator = "-1"
 )
 
+var runtimeClassName = "nvidia"
+
 type service struct {
 	fakeGPuNodes bool
 	kubeClient   client.WithWatch
@@ -449,6 +451,7 @@ func (rsc *service) createResourceReservationPod(
 		},
 		Spec: v1.PodSpec{
 			NodeName:           nodeName,
+			RuntimeClassName:   &runtimeClassName,
 			ServiceAccountName: rsc.serviceAccountName,
 			Containers: []v1.Container{
 				{
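
Reviewer note: `RuntimeClassName` in `v1.PodSpec` is a `*string`, which is why the change takes the address of a package-level variable. One consequence is that every reservation pod spec built this way points at the same string. The sketch below illustrates that aliasing with a trimmed stand-in type; `podSpec`, `sharedSpec`, and `copiedSpec` are hypothetical names introduced here, not code from the repository.

```go
package main

import "fmt"

// podSpec is a trimmed, hypothetical stand-in for v1.PodSpec.
// Only the fields relevant to this commit are modeled.
type podSpec struct {
	NodeName         string
	RuntimeClassName *string
}

// runtimeClassName mirrors the package-level variable added by the commit.
var runtimeClassName = "nvidia"

// sharedSpec takes the address of the package-level variable, as the diff does.
func sharedSpec(node string) podSpec {
	return podSpec{NodeName: node, RuntimeClassName: &runtimeClassName}
}

// copiedSpec gives each spec its own copy of the name, so a later write
// through one spec's pointer cannot affect another spec.
func copiedSpec(node string) podSpec {
	name := runtimeClassName
	return podSpec{NodeName: node, RuntimeClassName: &name}
}

func main() {
	a, b := sharedSpec("node-a"), sharedSpec("node-b")
	c := copiedSpec("node-c")

	// Writing through a's pointer changes the package-level variable,
	// which b also points at; c holds an independent copy.
	*a.RuntimeClassName = "other"

	fmt.Println(*b.RuntimeClassName) // prints "other"
	fmt.Println(*c.RuntimeClassName) // prints "nvidia"
}
```

The shared pointer is harmless as long as nothing mutates the spec in place before it is sent to the API server. If that cannot be guaranteed, a per-pod copy (or a helper such as `ptr.To` from `k8s.io/utils/ptr`) would avoid the aliasing.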
