Migrate ECS to CDI #482
Conversation
The cdi-specs package provides additional files required for CDI to work in other orchestrators like ECS, or any other downstream builder that supports CDI. Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
Force the CDI mode for the ECS configuration Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
Enable the CDI feature and provide the CDI directories that should be used to look for CDI specifications Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
The architecture flag for aarch64 is currently missing from the supported architecture flags list. This omission causes the getEntries function to exclude all libraries found on aarch64 hosts. As a result, helper programs like nvidia-ctk are unable to generate CDI specifications for the aarch64 architecture. This fix adds the missing aarch64 architecture flag, using the same value as defined in libnvidia-container[1], which maintains a more comprehensive list of supported architectures. [1]: https://github.com/NVIDIA/libnvidia-container/blob/a198166e1c1166f4847598438115ea97dacc7a92/src/ldcache.h#L21 Signed-off-by: Arnaldo Garcia Rincon <[email protected]>
```diff
@@ -3,3 +3,6 @@ root = "/"
 path = "/usr/bin/nvidia-container-cli"
 environment = []
 ldconfig = "@/sbin/ldconfig"
+
+[nvidia-container-runtime]
```
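The excerpt above only shows the new `[nvidia-container-runtime]` section header being added. As a hedged sketch of what forcing CDI mode in the toolkit's `config.toml` typically looks like (based on the toolkit's documented `mode` option, not on lines visible in this diff):

```toml
# Sketch only: force the NVIDIA runtime into CDI mode instead of "auto"/"legacy".
[nvidia-container-runtime]
mode = "cdi"
```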
As a general question: Would using native support in some other runtime (e.g. containerd) be an option for Bottlerocket at some point? Or would we need to build CDI support into Bottlerocket directly?
Hey @elezar, thanks for the interest in this PR!

We do have another PR (#459) to set up containerd as well for Kubernetes. There are four use cases which require the usage of `nvidia-container-runtime`:

1. The ECS agent passes down the devices through the env variable `NVIDIA_VISIBLE_DEVICES`, which containerd doesn't know about. Luckily, `nvidia-container-runtime` is aware of this environment variable and performs the CDI modifications accordingly.
2. For k8s, we have seen users rely on setting `NVIDIA_VISIBLE_DEVICES` to `all` to bypass the resource restrictions and get access to all the GPUs. This use case is similar to 1), and as with it, containerd doesn't know about the environment variable. We do advise against this; see Security Guidance.
3. Bottlerocket supports k8s 1.28, 1.29, and 1.30, which don't support CDI, so we need the `nvidia-container-toolkit` to inject the prestart hook. We were doing something else to inject the prestart hook, but it was unnecessarily complicated. Using `nvidia-container-runtime` is much simpler. See Packages: Drop shimpei oci add hook #458 if you are curious about what we used to do.
4. Support "management" containers, e.g. `dcgm-exporter`, that require access to all GPUs by way of `NVIDIA_VISIBLE_DEVICES=all`. For this use case, I found the CDI specifications generated by the k8s device plugin were missing the `all` device (see: Generate CDI specifications for the "all" device NVIDIA/k8s-device-plugin#1203), which caused failures in said containers.

For the normal, happy-path use case, where users just use normal Kubernetes directives to configure their resources, we do rely on containerd's native support for CDI (as you can see in #459). It's the `NVIDIA_VISIBLE_DEVICES` support that we need `nvidia-container-runtime` for.

You all don't have to add any CDI support for Bottlerocket (although contributions are welcome!). We use the tools you all provide to generate the CDI specifications: `nvidia-ctk` for ECS and the Device Plugin for k8s. Besides that, we try to stay as close as possible to what you all do for other distros, and we try not to deviate too much unless strictly required.
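For illustration, a hypothetical manual invocation of the env-variable flow described above (the image tag is an assumption; on Bottlerocket the ECS agent or kubelet drives this rather than a user running `docker` directly):

```sh
# nvidia-container-runtime reads NVIDIA_VISIBLE_DEVICES and applies the matching
# CDI edits; containerd/Docker themselves do not interpret this variable.
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```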
This LGTM - has the changes I expected from #471 and fixes the ARM issue.
"features": { | ||
"cdi": true | ||
}, | ||
"cdi-spec-dirs": ["/etc/cdi/"] |
Question: is there a reason that we ignore `/var/run/cdi`? Is this folder not applicable on Bottlerocket?
For ECS, yeah, there is no need to read CDI specs from `/var/run/cdi`, as the only CDI specs provider will be `nvidia-ctk`.
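For reference, a sketch of the kind of command such a service would run to populate `/etc/cdi` (based on `nvidia-ctk`'s `cdi generate` subcommand; the exact flags and output path used by the Bottlerocket service are not shown in this excerpt):

```sh
# Generate a CDI specification for the GPUs present on the host and write it to
# /etc/cdi, the directory configured in cdi-spec-dirs above.
nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```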
@ArangoGutierrez this service definition (and the repo in general) might be interesting from the point of view of your current work.
Issue number:
Related to #470
Description of changes:
This series of commits enables CDI for ECS.
A new package, `nvidia-container-toolkit-cdi-specs`, provides an additional service to generate CDI specifications. This new service can be used by downstream builders that support CDI. The new package is automatically installed for ECS variants.

The `nvidia-container-toolkit` configuration for ECS now forces the CDI mode. The reason for this is twofold:

- the `NVIDIA_VISIBLE_DEVICES` environment variable is what the ECS agent uses to "pass down" the devices that must be configured in the container

The last commit in the series adds a patch for `nvidia-container-toolkit`. The changes in #471 didn't work because `nvidia-ctk` failed to generate the CDI specifications for aarch64 hosts. This is because the module that parses the `ldcache` is missing support for aarch64 hosts (see NVIDIA/nvidia-container-toolkit#1045). This is already being fixed upstream (see NVIDIA/nvidia-container-toolkit#1046).

Testing done:
With `aws-ecs-2-nvidia`, on both x86_64 and aarch64 instances (g4dn.2xlarge and g5g.2xlarge):

- `nvidia-smoke` test defined in testsys:

As an extra precaution, I diffed the libraries and binaries mounted on the containers, to make sure we aren't missing any library with the migration. I confirmed that the libraries are the same. Left CDI, Right Legacy:
Reviewers will notice the missing `/driver/nvidia/gpus/0000:00:1f.0`, `/nvidia0`, `/nvidiactl`, `/nvidia-modeset`, `/nvidia-uvm`, and `/nvidia-uvm-tools` mounts in CDI. This is because, instead of being mounted, in CDI the devices are passed as Linux devices configured in the OCI specification. So instead of mounts, the devices are created by the container runtime under `/dev`:
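An abridged sketch of what such a CDI device entry looks like (field names follow the CDI specification format; the device names and node paths here are illustrative, not copied from the generated spec):

```yaml
cdiVersion: "0.5.0"
kind: nvidia.com/gpu
devices:
  - name: "0"
    containerEdits:
      deviceNodes:
        - path: /dev/nvidia0        # created inside the container by the runtime
containerEdits:
  deviceNodes:
    - path: /dev/nvidiactl
    - path: /dev/nvidia-uvm
```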
Without these devices, workloads that depend on `cuda` wouldn't run (see this comment).

Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.