fix: nvidia library resolution on merged-lib guests (e.g. Arch) #2014

Open
RomeoV wants to merge 1 commit into 89luca89:main from RomeoV:fix/nvidia-arch-lib64-symlink
RomeoV commented Mar 8, 2026

Summary

  • Fixes nvidia-smi (and other 64-bit nvidia tools) failing with incompatible ELF class inside Arch Linux distroboxes on Fedora/Bazzite hosts with --nvidia
  • The root cause is that Fedora puts 32-bit nvidia libs in /usr/lib/ and 64-bit in /usr/lib64/, but on Arch /usr/lib64 is a symlink to /usr/lib (merged-lib layout), so both host directories map to the same guest directory and the 32-bit versions, mounted first, win
  • Two changes: (1) don't treat /usr/lib64 as a separate directory when it's a symlink, (2) search host lib64 before lib so 64-bit libs are mounted first and the existing "file exists" check skips 32-bit duplicates
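
As a rough illustration of change (1), the check below only reports a separate lib64 directory when the path is a real directory rather than a symlink. This is a minimal sketch, not the actual distrobox code: the function name `guest_lib64_dir` and its optional root-prefix argument are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Hypothetical helper (not from the distrobox source): print "/usr/lib64/"
# only when <root>/usr/lib64 is a real directory. On merged-lib guests,
# where /usr/lib64 is a symlink to /usr/lib, it prints nothing.
guest_lib64_dir() {
    root="${1:-}"
    # [ -d ] follows symlinks, so the extra [ ! -L ] test is what
    # distinguishes a real directory from a symlink to one.
    if [ -d "${root}/usr/lib64" ] && [ ! -L "${root}/usr/lib64" ]; then
        printf '%s\n' "/usr/lib64/"
    fi
}

# Example: lib64_dir="$(guest_lib64_dir)"   # empty on an Arch guest
```

The `-L` test is needed because `-d` alone is true for a symlink that points at a directory, which is exactly the merged-lib case.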

Reproducer

# On Bazzite (or any Fedora immutable with nvidia)
distrobox create --name test --init --image archlinux:latest --nvidia
distrobox enter test -- nvidia-smi
# NVIDIA-SMI couldn't find libnvidia-ml.so library in your system.

Root cause visible with LD_DEBUG:

trying file=/usr/lib/libnvidia-ml.so.1
    (incompatible ELF class)

The file at /usr/lib/libnvidia-ml.so.1 is 32-bit (from host's /usr/lib/) while nvidia-smi is 64-bit.
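
To confirm which variant of a library ended up at a given path, the ELF class can be read directly from the file header: byte 4 of the ELF identification (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The helper below is a small sketch for that check; the name `elf_class` is hypothetical and it relies only on `od` and `tr`.

```shell
#!/bin/sh
# Hypothetical helper: print 32 or 64 depending on the ELF class of a file.
# Returns non-zero if the file is not an ELF object or the class is unknown.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic="$(od -An -tx1 -N4 "$1" | tr -d ' \n')"
    [ "$magic" = "7f454c46" ] || return 1
    # Byte at offset 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    case "$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) return 1 ;;
    esac
}

# Example, inside the guest: elf_class /usr/lib/libnvidia-ml.so.1
```

Run inside the affected guest, this should report the 32-bit class for the library named in the LD_DEBUG output above, consistent with the loader's complaint.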

Test plan

  • Verify nvidia-smi works in Arch distrobox on Fedora/Bazzite host with --nvidia
  • Verify nvidia integration still works on Fedora guest (real /usr/lib64/ directory)
  • Verify nvidia integration still works on Debian/Ubuntu guest (x86_64-linux-gnu layout)

🤖 Generated with Claude Code

On guests where /usr/lib64 is a symlink to /usr/lib (Arch Linux, etc.),
the nvidia init would mount 32-bit host libraries from /usr/lib/ into
/usr/lib/, then skip the 64-bit libraries from /usr/lib64/ because the
destination path already existed. This caused `nvidia-smi` and other
64-bit nvidia tools to fail with "incompatible ELF class".

Two changes fix this:

1. Only set lib64_dir="/usr/lib64/" when /usr/lib64 is a real directory,
   not a symlink. This avoids treating a merged-lib guest as split-lib.

2. Search host lib directories in lib64-first order so that 64-bit
   libraries are mounted before their 32-bit counterparts. The existing
   "file exists" check then correctly skips the 32-bit duplicates.
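
The effect of the search order in change 2 can be simulated without any bind mounts. The sketch below is an illustration under stated assumptions, not the distrobox implementation: `stage_libs` is a hypothetical function that copies files instead of mounting them, and the `libnvidia-*.so*` glob is a simplification of whatever the real init matches. It shows that with a first-come "file exists" check, the directory listed first decides which variant lands in the destination.

```shell
#!/bin/sh
# Hypothetical simulation: "mount" (here: copy) nvidia libraries from the
# given host directories into one destination, in the order given, skipping
# any library whose destination path already exists.
stage_libs() {
    dest="$1"
    shift
    for dir in "$@"; do
        for lib in "$dir"/libnvidia-*.so*; do
            # An unmatched glob stays literal; skip it.
            if [ ! -e "$lib" ]; then
                continue
            fi
            target="$dest/$(basename "$lib")"
            # Stand-in for the "file exists" check: first one in wins.
            if [ -e "$target" ]; then
                continue
            fi
            cp "$lib" "$target"
        done
    done
}

# Example: stage_libs /tmp/guest-lib /usr/lib64 /usr/lib
```

Passing the lib64 directory first means a 64-bit library claims each name, and a 32-bit file with the same name is skipped; libraries that exist only in the second directory are still staged.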

Reproducer: Bazzite (Fedora immutable) host + Arch Linux distrobox
with --nvidia flag. Bazzite ships 32-bit nvidia libs in /usr/lib/ for
Steam/Proton compatibility, while Arch expects 64-bit libs there.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>