nvidia-k8s-device-plugin: add ldcache parsing for aarch64 patch #501

Merged
@@ -0,0 +1,49 @@
From be4ba83b821eea9050eefdb7e67df2d757c3795a Mon Sep 17 00:00:00 2001
@ytsssun ytsssun May 14, 2025

Nice work on this one! Q: Do we have a plan to upstream this?

They are already working on fixes 🎉 !

NVIDIA/nvidia-container-toolkit#1046

From: Jingwei Wang <[email protected]>
Date: Wed, 23 Apr 2025 17:17:35 +0000
Subject: [PATCH] fix ldcache parsing for aarch64

k8s-device-plugin carries its own nvidia-container-toolkit and uses
nvidia-ctk to generate the CDI specifications.

The architecture flag for aarch64 is currently missing from the
supported architecture flags list. This omission causes the getEntries
function to exclude all libraries found on aarch64 hosts. As a result,
helper programs like nvidia-ctk are unable to generate CDI
specifications for the aarch64 architecture.

This fix adds the missing aarch64 architecture flag, using the same
value as defined in libnvidia-container[1], which maintains a more
comprehensive list of supported architectures.

[1]: https://github.com/NVIDIA/libnvidia-container/blob/a198166e1c1166f4847598438115ea97dacc7a92/src/ldcache.h#L21

Signed-off-by: Jingwei Wang <[email protected]>
---
.../nvidia-container-toolkit/internal/ldcache/ldcache.go | 3 +++
1 file changed, 3 insertions(+)

diff --git a/vendor/github.com/NVIDIA/nvidia-container-toolkit/internal/ldcache/ldcache.go b/vendor/github.com/NVIDIA/nvidia-container-toolkit/internal/ldcache/ldcache.go
index 4daf95b..455048c 100644
--- a/vendor/github.com/NVIDIA/nvidia-container-toolkit/internal/ldcache/ldcache.go
+++ b/vendor/github.com/NVIDIA/nvidia-container-toolkit/internal/ldcache/ldcache.go
@@ -47,6 +47,7 @@ const (
flagArchX8664 = 0x0300
flagArchX32 = 0x0800
flagArchPpc64le = 0x0500
+ flagArchAarch64 = 0x0a00
)

var errInvalidCache = errors.New("invalid ld.so.cache file")
@@ -195,6 +196,8 @@ func (c *ldcache) getEntries() []entry {
switch e.Flags & flagArchMask {
case flagArchX8664:
fallthrough
+ case flagArchAarch64:
+ fallthrough
case flagArchPpc64le:
bits = 64
case flagArchX32:
--
2.47.0
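
For readers following the patch, here is a minimal Go sketch of the flag check it changes. It is not the vendored code: `archBits` and its `withAarch64` switch are hypothetical names introduced only to contrast the behavior before and after the patch. The values of `flagArchMask` and `flagArchI386`, the ELF type bit in the sample flags word, and the grouping of x32 with i386 as 32-bit are assumptions, since the diff does not show them.

```go
package main

import "fmt"

// Architecture flags. The x86-64, x32, ppc64le and aarch64 values are taken
// from the patch; the mask and i386 values are assumptions (not in the diff).
const (
	flagArchMask    = 0xff00 // assumption
	flagArchI386    = 0x0000 // assumption
	flagArchX8664   = 0x0300
	flagArchX32     = 0x0800
	flagArchPpc64le = 0x0500
	flagArchAarch64 = 0x0a00
)

// archBits is a hypothetical helper modeled on the switch in getEntries.
// It returns the word size for an entry's flags, and false when the
// architecture is not recognized, which is the case where the entry is
// dropped. withAarch64 toggles between pre-patch and post-patch behavior.
func archBits(flags int32, withAarch64 bool) (int, bool) {
	switch flags & flagArchMask {
	case flagArchX8664, flagArchPpc64le:
		return 64, true
	case flagArchAarch64:
		if withAarch64 {
			return 64, true
		}
		return 0, false // pre-patch: unknown architecture, entry skipped
	case flagArchX32, flagArchI386:
		return 32, true // assumption: both treated as 32-bit
	default:
		return 0, false
	}
}

func main() {
	// Hypothetical flags word for an aarch64 library: an assumed ELF type
	// bit (0x0001) combined with the aarch64 architecture bits.
	const aarch64Lib = 0x0001 | flagArchAarch64

	_, ok := archBits(aarch64Lib, false)
	fmt.Println("before patch, entry kept:", ok)

	bits, ok := archBits(aarch64Lib, true)
	fmt.Println("after patch, entry kept:", ok, "bits:", bits)
}
```

Under these assumptions, every entry in an aarch64 ld.so.cache takes the unrecognized branch before the patch, which matches the commit message's description of all libraries being excluded.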

@@ -18,6 +18,7 @@ Source2: nvidia-k8s-device-plugin-conf
Source3: nvidia-k8s-device-plugin-exec-start-conf
Source4: nvidia-k8s-device-plugin-mig-conf

Patch0001: 0001-fix-ldcache-parsing-for-aarch64.patch

BuildRequires: %{_cross_os}glibc-devel
Requires: %{name}(binaries)