Support both CDI and Legacy NVIDIA Container Runtime modes #459
Changes from all commits
3f0cbac
ece33ef
972c58c
9a3ba69
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,39 @@ | ||
[required-extensions] | ||
nvidia-container-runtime = "v1" | ||
std = { version = "v1", helpers = ["default"] } | ||
kubelet-device-plugins = "v1" | ||
std = { version = "v1", helpers = ["default", "is_array"] } | ||
|
||
+++ | ||
### generated from the template file ### | ||
accept-nvidia-visible-devices-as-volume-mounts = {{default true settings.nvidia-container-runtime.visible-devices-as-volume-mounts}} | ||
accept-nvidia-visible-devices-envvar-when-unprivileged = {{default false settings.nvidia-container-runtime.visible-devices-envvar-when-unprivileged}} | ||
|
||
[nvidia-container-cli] | ||
root = "/" | ||
path = "/usr/bin/nvidia-container-cli" | ||
environment = [] | ||
ldconfig = "@/sbin/ldconfig" | ||
|
||
[nvidia-container-runtime] | ||
{{#if settings.kubelet-device-plugins.nvidia.device-list-strategy}} | ||
{{~#if (is_array settings.kubelet-device-plugins.nvidia.device-list-strategy) ~}} | ||
{{~#if (eq settings.kubelet-device-plugins.nvidia.device-list-strategy.[0] "cdi-cri") ~}} | ||
mode="cdi" | ||
Comment on lines +19 to +20
Is "cdi-cri" expected to always be the first item of the array? What is the behavior when the array has multiple items, and what if "cdi-cri" is the second item? This may not be blocking, but it may be good to implement a helper like "has/contains", similar to what the device plugin does - https://github.com/NVIDIA/k8s-device-plugin/blob/6f41f70c43f8da1357f51f64cf60431acc74141f/deployments/helm/nvidia-device-plugin/templates/_helpers.tpl#L178. Also a note here - I was going to raise an index-out-of-bounds concern, but looks like the
Regarding the conversation about a custom helper: we had one, but we were advised not to add it, at least for this case: |
||
{{~else~}} | ||
mode="legacy" | ||
{{~/if~}} | ||
{{~else~}} | ||
{{~#if (eq settings.kubelet-device-plugins.nvidia.device-list-strategy "cdi-cri") ~}} | ||
mode="cdi" | ||
{{~else~}} | ||
mode="legacy" | ||
{{~/if~}} | ||
{{/if}} | ||
{{else}} | ||
mode="legacy" | ||
{{/if}} | ||
Comment on lines +21 to +33
I don't love the nested if/else block here and the repeated `mode="legacy"`. It looks like the only case where we want to set "cdi" is when device-list-strategy is "cdi-cri" (in string or list form). Could we clean up the if/else logic here? On another note, together with my comment above, it might be worth considering a dedicated "device-list-strategy" helper. The logic would be simpler in Rust code. Our custom helpers - https://github.com/bottlerocket-os/bottlerocket-core-kit/blob/develop/sources/api/schnauzer/src/v2/import/helpers.rs#L36-L88 |
||
|
||
[nvidia-container-runtime-hook] | ||
# For the legacy NVIDIA runtime, skip detecting the mode used in the | ||
# NVIDIA Container Runtime. This prevents failures in the legacy NVIDIA runtime | ||
# when the selected mode is 'cdi'. | ||
skip-mode-detection = true |
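
The two review comments above ask what happens when "cdi-cri" is not the first list item, and whether the mode selection could live in Rust. A minimal sketch of that decision as plain Rust functions follows. It is not the schnauzer helper API: the `DeviceListStrategy` enum and both function names are hypothetical, invented here for illustration, and "volume-mounts" in the usage note is only an example value. `runtime_mode_first` mirrors the template as written (only the first list entry is checked); `runtime_mode_any` shows the "contains" semantics the reviewer raises.

```rust
/// Hypothetical model of the `device-list-strategy` setting, which the
/// template accepts either as a single string or as a list of strings.
#[derive(Debug, Clone, PartialEq)]
pub enum DeviceListStrategy {
    Single(String),
    List(Vec<String>),
}

const CDI_CRI: &str = "cdi-cri";

/// Mirrors the template in this diff: for a list, only the first entry
/// is compared against "cdi-cri". An unset setting yields "legacy".
pub fn runtime_mode_first(strategy: Option<&DeviceListStrategy>) -> &'static str {
    let is_cdi = match strategy {
        Some(DeviceListStrategy::Single(s)) => s.as_str() == CDI_CRI,
        Some(DeviceListStrategy::List(items)) => {
            items.first().map_or(false, |s| s.as_str() == CDI_CRI)
        }
        None => false,
    };
    if is_cdi { "cdi" } else { "legacy" }
}

/// Alternative with "contains" semantics: "cdi" is chosen when any
/// entry of the list equals "cdi-cri", regardless of its position.
pub fn runtime_mode_any(strategy: Option<&DeviceListStrategy>) -> &'static str {
    let is_cdi = match strategy {
        Some(DeviceListStrategy::Single(s)) => s.as_str() == CDI_CRI,
        Some(DeviceListStrategy::List(items)) => {
            items.iter().any(|s| s.as_str() == CDI_CRI)
        }
        None => false,
    };
    if is_cdi { "cdi" } else { "legacy" }
}
```

With a list such as `["volume-mounts", "cdi-cri"]`, the first function returns "legacy" and the second returns "cdi", which is exactly the difference the first review comment asks about. Either way, a single boolean decides the mode, so `mode="legacy"` would appear only once.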