[bug] main-cuda image crashes with SIGILL on AMD Zen 3 (Ryzen 9 5950X) after model load + LD_LIBRARY_PATH shadowing host libcuda #3814

@runyournode

I wasn't able to run the main-cuda Docker image (whisper-server) on my system.
I used Claude Code to help me debug (I am no expert in C++ or compilation), but it seems the image is built with compiler settings that end up requiring certain CPU features at runtime.
(The main (CPU) Docker image works well on my system.)

Here is the (Claude-generated) report:


Environment

Image: ghcr.io/ggml-org/whisper.cpp:main-cuda (built 2026-05-15)
Host OS: Ubuntu, Linux 6.8.0-117-generic x86_64
CPU: AMD Ryzen 9 5950X (Zen 3); supports AVX, AVX2, FMA, BMI2; no AVX-512, no AMX
GPU: NVIDIA GeForce RTX 4090 (24 GB VRAM, compute capability 8.9)
NVIDIA driver: 580.159.04
CUDA (host): 13.0
Container runtime: Docker with NVIDIA Container Toolkit
Model: ggml-large-v3-turbo.bin

Bug 1 — LD_LIBRARY_PATH shadows the host-injected libcuda.so, causing CUDA_ERROR_SYSTEM_DRIVER_MISMATCH

The image sets LD_LIBRARY_PATH with /usr/local/cuda-13.0/compat as the first entry:

/usr/local/cuda-13.0/compat:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64

The cuda-compat-13-0 package installed in the image is version 580.65.06-0ubuntu1, so /usr/local/cuda-13.0/compat/libcuda.so.580.65.06 is loaded before the libcuda.so.580.159.04 injected by the NVIDIA Container Toolkit.

Result on startup:

ggml_cuda_init: failed to initialize CUDA: system has unsupported display driver / cuda driver combination
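
The shadowing described above can be illustrated with a small simulation. This is only a simplified model of the loader's search order (LD_LIBRARY_PATH entries first, then default directories), not the real dynamic linker, which resolves by soname. The function name `resolve_library` and the directory `/usr/lib/x86_64-linux-gnu` as the injection location are assumptions made for illustration; the report does not state where the toolkit places the host library.

```python
from typing import Dict, List, Optional, Sequence


def resolve_library(
    prefix: str,
    ld_library_path: str,
    default_dirs: Sequence[str],
    dir_contents: Dict[str, List[str]],
) -> Optional[str]:
    """Return the first file whose name starts with `prefix`.

    Simplified model: directories from LD_LIBRARY_PATH are searched in
    order, followed by the default directories. Empty path entries are
    ignored here.
    """
    search_order = [d for d in ld_library_path.split(":") if d]
    search_order += list(default_dirs)
    for directory in search_order:
        for name in sorted(dir_contents.get(directory, [])):
            if name.startswith(prefix):
                return f"{directory}/{name}"
    return None


# LD_LIBRARY_PATH as set by the image (taken from the report).
IMAGE_PATH = (
    "/usr/local/cuda-13.0/compat:/usr/local/nvidia/lib:"
    "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
)

# Assumed layout: the injection directory is hypothetical.
CONTENTS = {
    "/usr/local/cuda-13.0/compat": ["libcuda.so.580.65.06"],
    "/usr/lib/x86_64-linux-gnu": ["libcuda.so.580.159.04"],
}

picked = resolve_library(
    "libcuda.so", IMAGE_PATH, ["/usr/lib/x86_64-linux-gnu"], CONTENTS
)
print(picked)
```

Under these assumptions the compat copy wins because its directory comes first in the search order, which matches the mismatch error reported at startup.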

Workaround: override LD_LIBRARY_PATH in the container environment so that the compat directory is no longer searched:

environment:
  - LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64

With this fix, CUDA initialises correctly and the RTX 4090 is detected:

ggml_cuda_init: found 1 CUDA devices (Total VRAM: 24063 MiB):
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes, VRAM: 24063 MiB
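
For setups where the override value should be derived rather than hard-coded, a minimal sketch of the same idea follows. The helper name `drop_compat_dirs` is hypothetical, and it assumes that every entry ending in `/compat` should be removed, which holds for the path shown in this report but may not hold for other images.

```python
def drop_compat_dirs(ld_library_path: str) -> str:
    """Remove entries ending in '/compat' from an LD_LIBRARY_PATH value.

    Empty entries are dropped as well; the order of the remaining
    entries is preserved.
    """
    kept = [
        entry
        for entry in ld_library_path.split(":")
        if entry and not entry.rstrip("/").endswith("/compat")
    ]
    return ":".join(kept)


# Value set by the image, as quoted in the report.
image_value = (
    "/usr/local/cuda-13.0/compat:/usr/local/nvidia/lib:"
    "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
)
print(drop_compat_dirs(image_value))
```

The printed value is the one used in the `environment:` override above.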

Bug 2 — SIGILL (exit code 132) on AMD Zen 3 after model load

After applying the LD_LIBRARY_PATH fix, the process still crashes with Illegal instruction (core dumped) immediately after model loading completes:

whisper_model_load: n_langs       = 100
Illegal instruction (core dumped)

This also reproduces with CUDA_VISIBLE_DEVICES="" (pure CPU path), ruling out a GPU-side issue.
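
As a side note on the exit code: shells and container runtimes commonly report a process killed by signal N as exit status 128 + N, so 132 corresponds to signal 4, which is SIGILL on Linux. A minimal sketch of that decoding (the helper name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode an exit status using the common 128 + signal convention."""
    if code > 128:
        number = code - 128
        try:
            return f"killed by {signal.Signals(number).name}"
        except ValueError:
            # Not a signal number known on this platform.
            return f"killed by unknown signal {number}"
    return f"exited with status {code}"


print(describe_exit_code(132))
```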

Findings:

  • The whisper-server binary itself declares x86 ISA needed: x86-64-baseline (via readelf -n)
  • libggml-cpu.so.0 exports ggml_cpu_has_avx512, ggml_cpu_has_avx512_vbmi, ggml_cpu_has_avx512_vnni, ggml_cpu_has_avx512_bf16, and ggml_cpu_has_amx_int8, with amx.cpp compiled in
  • The Ryzen 9 5950X has no AVX-512 and no AMX
  • The crash occurs at the point where ggml initialises compute buffers / backend after model loading — consistent with a call path that emits AVX-512 or AMX instructions without a proper CPU feature guard
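
To check the CPU side of these findings on another machine, the feature flags from /proc/cpuinfo can be compared against the AVX-512 and AMX flags. The sketch below parses cpuinfo text passed in as a string; the function names and the `WIDE_ISA_FLAGS` selection are assumptions chosen for this example, not something taken from whisper.cpp.

```python
from typing import List, Sequence, Set

# Linux /proc/cpuinfo flag names for a few AVX-512 and AMX features.
WIDE_ISA_FLAGS = (
    "avx512f",
    "avx512_vnni",
    "avx512vbmi",
    "avx512_bf16",
    "amx_tile",
    "amx_int8",
)


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(
    cpuinfo_text: str, wanted: Sequence[str] = WIDE_ISA_FLAGS
) -> List[str]:
    """List the wanted flags that the CPU does not advertise."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]
```

On a Linux host this could be called as `missing_flags(open("/proc/cpuinfo").read())`. On a CPU like the one in this report, all six flags would be expected to come back as missing.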

The main (CPU-only) image built on the same date does not exhibit this crash and runs the same model to a healthy HTTP server on the same machine.

This suggests libggml-cpu.so.0 in the main-cuda build was compiled with -march= flags that include AVX-512 or AMX, or that the AMX initialisation path (amx.cpp) is entered unconditionally regardless of the runtime CPU feature check.


Expected behaviour

whisper-server should start and serve requests on any x86-64 CPU that supports the baseline ISA, falling back gracefully when AVX-512/AMX are unavailable.
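
The graceful fallback described here amounts to picking, at runtime, the most capable code path whose required CPU features are all present. The following is a conceptual sketch only: the variant names and their requirements are hypothetical and do not describe how ggml actually selects its CPU kernels.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

Variant = Tuple[str, FrozenSet[str]]

# Hypothetical variants, ordered from most to least demanding.
VARIANTS: Sequence[Variant] = (
    ("avx512", frozenset({"avx512f"})),
    ("avx2", frozenset({"avx2", "fma"})),
    ("baseline", frozenset()),
)


def pick_variant(
    flags: Iterable[str], variants: Sequence[Variant] = VARIANTS
) -> str:
    """Return the first variant whose required flags are all available."""
    available = set(flags)
    for name, required in variants:
        if required <= available:
            return name
    raise RuntimeError("no usable variant for this CPU")


# Flags listed for the Ryzen 9 5950X in the Environment section.
print(pick_variant({"avx", "avx2", "fma", "bmi2"}))
```

With a baseline entry that requires nothing, selection never fails, which is the behaviour the report asks for on CPUs without AVX-512 or AMX.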

Actual behaviour

main-cuda is unusable on this AMD Zen 3 system, and presumably on any other CPU without AVX-512/AMX.


Workaround

Use the ghcr.io/ggml-org/whisper.cpp:main (CPU-only) image with the full binary path as entrypoint:

image: ghcr.io/ggml-org/whisper.cpp:main
entrypoint: ["/app/build/bin/whisper-server", "--model", "...", "--host", "0.0.0.0", "--port", "9000"]
