Allow custom scheduler to be specified in NIMService and NIMPipeline by shengnuo · Pull Request #489 · NVIDIA/k8s-nim-operator

shengnuo · 2025-05-12T16:46:20Z

This PR allows NIMs to be scheduled with a custom scheduler.

A custom scheduler can be specified in .spec.services[].spec.schedulerName for a NIMPipeline, and in .spec.schedulerName for a NIMService.
If schedulerName is unspecified, default-scheduler is used.
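
For illustration, the two field paths described above might look like this in a manifest. This is a minimal sketch, not taken from the PR: the `apps.nvidia.com/v1alpha1` API group/version and the `name` key on each pipeline service are assumptions, the service name is hypothetical, and all other fields a real NIMService needs are omitted.

```yaml
# Hypothetical manifests; only the schedulerName field paths come from this PR.
apiVersion: apps.nvidia.com/v1alpha1   # assumed API group/version
kind: NIMService
metadata:
  name: meta-llama3-8b-instruct
  namespace: nemo
spec:
  schedulerName: volcano   # omit this field to use default-scheduler
  # other NIMService fields omitted for brevity
---
apiVersion: apps.nvidia.com/v1alpha1   # assumed API group/version
kind: NIMPipeline
metadata:
  name: llama3-1b-pipeline
  namespace: nemo
spec:
  services:
    - name: meta-llama3-1b-instruct   # hypothetical service name
      spec:
        schedulerName: volcano        # set per service in the pipeline
        # remaining NIMService spec fields omitted for brevity
```

The named scheduler (Volcano in this sketch) has to be installed in the cluster already; otherwise the pods would be expected to stay Pending, since no scheduler would pick them up.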

Scheduled with Volcano

$ kubectl get nimpipeline llama3-1b-pipeline
NAME                 STATUS   AGE
llama3-1b-pipeline   Ready    31m
$ kubectl get nimpipeline llama3-1b-pipeline -o json | jq '.spec.services[].spec.schedulerName'
{
  "type": "volcano"
}
$ kubectl get pods meta-llama3-1b-instruct-74cc4c5c9b-dtkdn -o json | jq '.spec.schedulerName'
"volcano"

Describing the created NIM pod shows that it was scheduled by Volcano:

Events:
  Type     Reason     Age                 From     Message
  ----     ------     ----                ----     -------
  Normal   Scheduled  32m                 volcano  Successfully assigned nemo/meta-llama3-1b-instruct-74cc4c5c9b-dtkdn to nim-operator-9vg0z43

Scheduled with default-scheduler

If the scheduler is unspecified, the NIM Operator uses default-scheduler.

$ kubectl get nimservice meta-llama3-8b-instruct -o json | jq '.spec.scheduler'
null

Describing the NIM pod shows that it was scheduled by default-scheduler:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  12m                   default-scheduler  Successfully assigned nemo/meta-llama3-8b-instruct-6d95cd589b-qp96s to nim-operator-9vg0z43

copy-pr-bot · 2025-05-12T16:46:23Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

visheshtanksale

Is there any value in having a global config that sets the scheduler for all pods created by the NIM Operator?
Having the scheduler on the CRDs is necessary, but it is not a value that is going to change across the objects created by the NIM Operator, so a global config would make it easier to use.

shivamerla

LGTM

shengnuo requested review from ArangoGutierrez, shivamerla, slu2011, varunrsekar and visheshtanksale as code owners May 12, 2025 16:46

shengnuo force-pushed the nim-custom-scheduler branch 5 times, most recently from ba431f6 to 9156d33 Compare May 12, 2025 19:33

varunrsekar reviewed May 12, 2025

Comment thread api/apps/v1alpha1/nimservice_types.go Outdated

visheshtanksale reviewed May 13, 2025

shengnuo force-pushed the nim-custom-scheduler branch 3 times, most recently from 2cf6f6e to ff6291f Compare May 16, 2025 15:10

shengnuo added 2 commits May 16, 2025 11:32

Allow NIMService and NIMPipeline to be scheduled with a custom scheduler

acde573

Signed-off-by: Sheng Lin <shelin@nvidia.com>

Update manifests

827617b

Signed-off-by: Sheng Lin <shelin@nvidia.com>

shengnuo force-pushed the nim-custom-scheduler branch from ff6291f to 827617b Compare May 16, 2025 15:32

shivamerla approved these changes May 16, 2025

Merge branch 'main' into nim-custom-scheduler

9444e6b

shengnuo merged commit 927e686 into NVIDIA:main May 16, 2025
9 checks passed

varunrsekar mentioned this pull request May 20, 2025

Backport changes from main #501

Merged

varunrsekar pushed a commit that referenced this pull request May 20, 2025

Allow NIMService and NIMPipeline to be scheduled with a custom schedu…

5a41954

…ler (#489) Signed-off-by: Sheng Lin <shelin@nvidia.com> Update manifests Signed-off-by: Sheng Lin <shelin@nvidia.com>

shengnuo merged 3 commits into NVIDIA:main from shengnuo:nim-custom-scheduler

4 participants
