[bug] main-cuda image crashes with SIGILL on AMD Zen 3 (Ryzen 9 5950X) after model load + LD_LIBRARY_PATH shadowing host libcuda

I wasn't able to run the `main-cuda` Docker image (whisper-server) on my system.
I used Claude Code to help me debug (I am no expert in C++ or compilation), but it seems that the image is built with compiler settings that end up requiring specific CPU features at runtime.
The `main` (CPU-only) Docker image works well on my system.

Here is the (Claude-generated) report:

---

## Environment

| | |
|---|---|
| **Image** | `ghcr.io/ggml-org/whisper.cpp:main-cuda` (built 2026-05-15) |
| **Host OS** | Ubuntu, Linux 6.8.0-117-generic x86_64 |
| **CPU** | AMD Ryzen 9 5950X (Zen 3) — AVX, AVX2, FMA, BMI2 — **no AVX-512, no AMX** |
| **GPU** | NVIDIA GeForce RTX 4090 (24 GB VRAM, compute capability 8.9) |
| **NVIDIA driver** | 580.159.04 |
| **CUDA (host)** | 13.0 |
| **Container runtime** | Docker with NVIDIA Container Toolkit |
| **Model** | `ggml-large-v3-turbo.bin` |

---

## Bug 1 — `LD_LIBRARY_PATH` shadows the host-injected `libcuda.so`, causing `CUDA_ERROR_SYSTEM_DRIVER_MISMATCH`

The image sets `LD_LIBRARY_PATH` with `/usr/local/cuda-13.0/compat` as the **first** entry:

```
/usr/local/cuda-13.0/compat:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
```

The `cuda-compat-13-0` package installed in the image is version `580.65.06-0ubuntu1`, so `/usr/local/cuda-13.0/compat/libcuda.so.580.65.06` is loaded **before** the `libcuda.so.580.159.04` injected by the NVIDIA Container Toolkit.

Result on startup:
```
ggml_cuda_init: failed to initialize CUDA: system has unsupported display driver / cuda driver combination
```
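
To confirm the shadowing from inside the container, the search order can be walked by hand. The following is a minimal sketch, assuming the loader resolves the `libcuda.so.1` soname and that `LD_LIBRARY_PATH` entries are tried left to right before the default system directories; `first_provider` is a hypothetical helper name, not part of any tool.

```python
import os
from typing import Optional


def first_provider(search_path: str, soname: str = "libcuda.so.1") -> Optional[str]:
    """Return the first directory in a colon-separated path that contains soname."""
    for directory in search_path.split(":"):
        # Empty entries are skipped here for simplicity.
        if directory and os.path.exists(os.path.join(directory, soname)):
            return directory
    return None


hit = first_provider(os.environ.get("LD_LIBRARY_PATH", ""))
print(hit or "libcuda.so.1 not found via LD_LIBRARY_PATH")
```

If this prints the `compat` directory, the bundled library wins over whatever the container runtime injected, which would match the error above.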

**Workaround:** override `LD_LIBRARY_PATH` in the container environment so that the compat directory is no longer searched:
```yaml
environment:
  - LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
```
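
The override above is simply the image default with the compat entry removed. As a small sketch of that transformation (the helper name `drop_compat_entries` is made up for illustration, and matching on a trailing `compat` path component is an assumption about how such directories are named):

```python
def drop_compat_entries(search_path: str) -> str:
    """Remove every entry whose last path component is 'compat'."""
    kept = [
        entry
        for entry in search_path.split(":")
        if entry and entry.rstrip("/").rsplit("/", 1)[-1] != "compat"
    ]
    return ":".join(kept)


# Default value observed in the image.
image_default = (
    "/usr/local/cuda-13.0/compat:/usr/local/nvidia/lib:"
    "/usr/local/nvidia/lib64:/usr/local/cuda/lib64"
)
print(drop_compat_entries(image_default))
# -> /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
```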

With this fix, CUDA initialises correctly and the RTX 4090 is detected:
```
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 24063 MiB):
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes, VRAM: 24063 MiB
```

---

## Bug 2 — SIGILL (exit code 132) on AMD Zen 3 after model load

After applying the LD_LIBRARY_PATH fix, the process still crashes with `Illegal instruction (core dumped)` immediately after model loading completes:

```
whisper_model_load: n_langs       = 100
Illegal instruction (core dumped)
```

This also reproduces with `CUDA_VISIBLE_DEVICES=""` (pure CPU path), ruling out a GPU-side issue.
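
For reference, exit code 132 follows the usual shell convention of 128 plus the signal number, and SIGILL is signal 4 on Linux. A short sketch of that decoding, with `signal_from_exit_code` as a hypothetical helper name:

```python
import signal
from typing import Optional


def signal_from_exit_code(code: int) -> Optional[str]:
    """Map a shell-style exit code (128 + N) to the name of signal N, if any."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # No signal with that number on this platform.
        return None


print(signal_from_exit_code(132))  # SIGILL on Linux
```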

**Findings:**
- The `whisper-server` binary itself declares `x86 ISA needed: x86-64-baseline` (via `readelf -n`)
- `libggml-cpu.so.0` exports `ggml_cpu_has_avx512`, `ggml_cpu_has_avx512_vbmi`, `ggml_cpu_has_avx512_vnni`, `ggml_cpu_has_avx512_bf16`, and `ggml_cpu_has_amx_int8`, with `amx.cpp` compiled in
- The Ryzen 9 5950X has **no AVX-512 and no AMX**
- The crash occurs at the point where ggml initialises compute buffers / backend after model loading — consistent with a call path that emits AVX-512 or AMX instructions without a proper CPU feature guard
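
The missing CPU features can be double-checked on the host from the `flags` line of `/proc/cpuinfo`. This is a rough sketch, assuming the x86 Linux convention that AVX-512 and AMX flags start with `avx512` and `amx`; the sample text is abbreviated and illustrative, not a real dump from this machine.

```python
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def wide_simd_flags(flags: Set[str]) -> Set[str]:
    """Keep only the flags that indicate AVX-512 or AMX support."""
    return {flag for flag in flags if flag.startswith(("avx512", "amx"))}


# Illustrative, abbreviated sample; pass the contents of /proc/cpuinfo on a real host.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 fma bmi2\n"
print(sorted(wide_simd_flags(cpu_flags(sample))))  # []
```

An empty result means the kernel reports neither AVX-512 nor AMX, so any such instruction reached at runtime would raise SIGILL.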

The `main` (CPU-only) image built on the same date does **not** exhibit this crash and runs the same model to a healthy HTTP server on the same machine.

This suggests `libggml-cpu.so.0` in the `main-cuda` build was compiled with `-march=` flags that include AVX-512 or AMX, or that the AMX initialisation path (`amx.cpp`) is entered unconditionally regardless of the runtime CPU feature check.

---

## Expected behaviour

`whisper-server` should start and serve requests on any x86-64 CPU that supports the baseline ISA, falling back gracefully when AVX-512/AMX are unavailable.
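
To illustrate the kind of fallback meant here, below is a hypothetical sketch of runtime variant selection. It is not ggml's actual dispatch code; the variant names and feature sets are invented for the example.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical variant table: name -> CPU features the variant requires.
VARIANTS: Dict[str, FrozenSet[str]] = {
    "baseline": frozenset(),
    "avx2": frozenset({"avx2", "fma"}),
    "avx512": frozenset({"avx2", "fma", "avx512f"}),
}


def pick_variant(available: Set[str]) -> str:
    """Pick the most demanding variant whose requirements the CPU fully meets."""
    usable = [name for name, needs in VARIANTS.items() if needs <= available]
    # "baseline" requires nothing, so usable is never empty.
    return max(usable, key=lambda name: len(VARIANTS[name]))


# Feature set reported for the Ryzen 9 5950X in the environment table.
print(pick_variant({"avx", "avx2", "fma", "bmi2"}))  # avx2
```

The point of the guard is that a variant is only chosen when every feature it needs is present, so a CPU without AVX-512 never reaches AVX-512 code.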

## Actual behaviour

`main-cuda` is unusable on this AMD Zen 3 system, and presumably on any other CPU without AVX-512/AMX.

---

## Workaround

Use the `ghcr.io/ggml-org/whisper.cpp:main` (CPU-only) image with the full binary path as entrypoint:
```yaml
image: ghcr.io/ggml-org/whisper.cpp:main
entrypoint: ["/app/build/bin/whisper-server", "--model", "...", "--host", "0.0.0.0", "--port", "9000"]
```

