
llama.cpp won't run when built for CUDA 13 #534

@wvdschel

zluda_trace logs (tarball/zip file)

No response

Description

I read at https://vosen.github.io/ZLUDA/blog/zluda-update-q3-2025/ that llama.cpp should now work with ZLUDA.

I built llama.cpp with the CUDA backend on Fedora 42 and tried to run it against ZLUDA. This revealed a number of issues:

  1. Missing libraries:
$ ldd ./bin/llama-server 
        linux-vdso.so.1 (0x00007f5f2ef42000)
        libmtmd.so => /home/wim/src/llama.cpp/build-cuda/bin/libmtmd.so (0x00007f5f2ee87000)
        libcurl.so.4 => /lib64/libcurl.so.4 (0x00007f5f2ed95000)
        libllama.so => /home/wim/src/llama.cpp/build-cuda/bin/libllama.so (0x00007f5f2ea00000)
        libggml.so => /home/wim/src/llama.cpp/build-cuda/bin/libggml.so (0x00007f5f2ed8a000)
        libggml-cpu.so => /home/wim/src/llama.cpp/build-cuda/bin/libggml-cpu.so (0x00007f5f2e882000)
        libggml-cuda.so => /home/wim/src/llama.cpp/build-cuda/bin/libggml-cuda.so (0x00007f5f2c000000)
        libcuda.so.1 => /home/wim/Downloads/zluda/libcuda.so.1 (0x00007f5f2b800000)
        libggml-rpc.so => /home/wim/src/llama.cpp/build-cuda/bin/libggml-rpc.so (0x00007f5f2ed72000)
        libggml-base.so => /home/wim/src/llama.cpp/build-cuda/bin/libggml-base.so (0x00007f5f2eccf000)
...
        libcudart.so.13 => not found
        libcublas.so.13 => not found
        libcublasLt.so.13 => not found
        libamdhip64.so.6 => /lib64/libamdhip64.so.6 (0x00007f5f29400000)
...

  2. When supplying the missing libraries from a normal CUDA 13 installation, the process launches but fails to initialize a CUDA device (a rough sketch of this setup follows below):
$ ./bin/llama-bench  -m ~/.cache/llama.cpp/google_gemma-3-27b-it-qat-q4_0-gguf_gemma-3-27b-it-q4_0.gguf -ngl 999
ggml_cuda_init: failed to initialize CUDA: CUDA driver version is insufficient for CUDA runtime version
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
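
For reference, a sketch of the setup behind the second failure. The paths are illustrative: ZLUDA provides libcuda.so.1, and a stock CUDA 13 toolkit provides the runtime libraries reported as "not found" above.

$ # illustrative paths: ZLUDA first on the search path, then the CUDA 13 toolkit's lib64
$ export LD_LIBRARY_PATH="$HOME/Downloads/zluda:/usr/local/cuda-13.0/lib64:$LD_LIBRARY_PATH"
$ ldd ./bin/llama-server | grep -E 'libcuda|libcudart|libcublas'    # everything resolves now
$ ./bin/llama-bench -m ~/.cache/llama.cpp/google_gemma-3-27b-it-qat-q4_0-gguf_gemma-3-27b-it-q4_0.gguf -ngl 999    # fails as shown above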

Steps to reproduce

My build commands for llama.cpp (run in a container with the CUDA 13 toolkit installed):

cmake -B build-cuda -DGGML_CUDA=ON -DGGML_RPC=ON -DCMAKE_CUDA_ARCHITECTURES="75;86;89"
cmake --build build-cuda/ --config Release -j10
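
For completeness, a sketch of the containerized build. The image tag and package names are guesses; any environment with the CUDA 13 toolkit, cmake, and the libcurl headers should behave the same:

# assumed container invocation; the CUDA 13 devel image tag is a guess
podman run --rm -v "$PWD:/src" -w /src docker.io/nvidia/cuda:13.0.0-devel-ubuntu24.04 bash -c '
    apt-get update && apt-get install -y cmake libcurl4-openssl-dev
    cmake -B build-cuda -DGGML_CUDA=ON -DGGML_RPC=ON -DCMAKE_CUDA_ARCHITECTURES="75;86;89"
    cmake --build build-cuda/ --config Release -j10'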

I'm using ZLUDA v5 from the GitHub release archive.

ZLUDA version

5

Operating System

Fedora 42

GPU

AMD Ryzen AI Max 395+ (Radeon RX 8060S)

Metadata

Labels

zluda_trace logs: zluda_trace log files for a particular application
