KAI scheduler not working with AWS EKS Auto Mode #378

@msaeedevops

Hello,
I am running an EKS cluster and tried to deploy KAI Scheduler on it.
I deployed the NVIDIA device plugin and the GPU Operator.
I spun up GPU nodes via Karpenter and created a queue as well.

After the nodes came up, I checked the DaemonSets and the device plugin pods and got this:

kubectl get daemonset -n kube-system | grep nvidia
nvidia-device-plugin                      0         0         0       0            0           accelerator=nvidia-gpu,karpenter.sh/nodepool=ml-pool                               161m
nvidia-device-plugin-mps-control-daemon   0         0         0       0            0           accelerator=nvidia-gpu,karpenter.sh

kubectl get pods -n kube-system -l app.kubernetes.io/name=nvidia-device-plugin -o wide
No resources found in kube-system namespace.

Does this mean KAI Scheduler is not supported on EKS Auto Mode, or am I missing something?

I tried using the same labels and taints in the EKS node pool that provisions the nodes, but no luck.
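
A DESIRED count of 0 on a DaemonSet typically means that no node currently satisfies its node selector (or its tolerations), so one thing worth ruling out is a label mismatch. The sketch below shows the comparison in plain shell. The helper name `selector_matches` and both sample label strings are hypothetical and only illustrate the check; the selector itself is copied from the DaemonSet output above.

```shell
# Hypothetical helper: succeeds when every key=value pair in the
# comma-separated selector ($1) also appears in the comma-separated
# node label list ($2), e.g. one line of `kubectl get nodes --show-labels`.
selector_matches() (
  labels=",$2,"
  IFS=','
  set -f                      # no globbing while splitting on commas
  for pair in $1; do
    case "$labels" in
      *",$pair,"*) ;;         # this pair is present on the node
      *) exit 1 ;;            # pair is missing or has a different value
    esac
  done
  exit 0
)

# Selector taken from the nvidia-device-plugin DaemonSet output above.
SELECTOR='accelerator=nvidia-gpu,karpenter.sh/nodepool=ml-pool'

# Made-up label sets for illustration only, not taken from a real node.
NODE_WITH_LABELS='accelerator=nvidia-gpu,karpenter.sh/nodepool=ml-pool,kubernetes.io/arch=amd64'
NODE_WITHOUT_LABELS='karpenter.sh/nodepool=ml-pool,kubernetes.io/arch=amd64'

selector_matches "$SELECTOR" "$NODE_WITH_LABELS" && echo "match"
selector_matches "$SELECTOR" "$NODE_WITHOUT_LABELS" || echo "no match"
```

Against a live cluster, the same question can be asked directly with `kubectl get nodes -l "$SELECTOR"`: if that returns no nodes, the DaemonSet has nothing to schedule onto.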
