llama.cpp server docker not spinning up #422

gianlucagilardi · 2025-12-06T15:18:25Z


Hello everyone,

I feel completely dumb, but apparently I cannot find a way to make this work.

I have a pretty simple config.yaml to test the system:

# llama-swap YAML configuration
healthCheckTimeout: 500
logLevel: debug
metricsMaxInMemory: 1000
startPort: 10001

# Models section (REQUIRED)
models:
  "docker-oss":
    proxy: "http://127.0.0.1:${PORT}"
    cmd: >
      docker run --name ${MODEL_ID} --init --rm \
        --device=/dev/dri --device=/dev/kfd --group-add video \
        -e AMD_VULKAN_ICD=RADV \
        -v /usr/share/vulkan:/usr/share/vulkan:ro \
        -p ${PORT}:8080 \
        -v /home/user/models:/models:ro \
        ghcr.io/ggml-org/llama.cpp:server-vulkan \
        --model '/models/oss-120b/gpt-oss-120b.gguf' \
        -ngl 999 -c 4096 -b 512 --no-mmap -fa 1 -n -1
    name: "docker-oss"
    ttl: 60
    cmdStop: docker stop ${MODEL_ID}

I spin up llama-swap with:

docker run -it --rm -p 9292:8080 -v /home/user/models:/models -v /home/user/llama-swap/config.yaml:/app/config.yaml ghcr.io/mostlygeek/llama-swap:vulkan

llama-swap starts correctly; I can access the web interface at :9292, but when I try to load the model (also via the llama-swap GUI) I get this error:

[DEBUG] Exclusive mode for group (default), stopping other process groups
[DEBUG] <docker-oss> SendLoadingState is nil or false, not streaming loading state
[DEBUG] <docker-oss> swapState() State transitioned from stopped to starting
[DEBUG] <docker-oss> Executing start command: docker run --name docker-oss --init --rm --device=/dev/dri --device=/dev/kfd --group-add video -e AMD_VULKAN_ICD=RADV -v /usr/share/vulkan:/usr/share/vulkan:ro -p 10001:8080 -v /home/user/models:/models:ro ghcr.io/ggml-org/llama.cpp:server-vulkan --model /models/oss-120b/gpt-oss-120b.gguf -ngl 999 -c 4096 -b 512 --no-mmap -fa 1 -n -1, env:
[DEBUG] <docker-oss> swapState() State transitioned from starting to stopped

If I run the same command by hand on the host:
docker run --name docker-oss --init --rm --device=/dev/dri --device=/dev/kfd --group-add video -e AMD_VULKAN_ICD=RADV -v /usr/share/vulkan:/usr/share/vulkan:ro -p 10001:8080 -v /home/user/models:/models:ro ghcr.io/ggml-org/llama.cpp:server-vulkan --model /models/oss-120b/gpt-oss-120b.gguf -ngl 999 -c 4096 -b 512 --no-mmap -fa 1 -n -1
the llama.cpp server runs and answers correctly.

I am sure I am missing something but - for the life of me - I cannot understand what. I suspect the ", env:" part is messing up but have no idea how to get rid of it.

Any idea?
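
In case it helps anyone narrow down the same symptom, here is a minimal sketch of a check that could be run in the same environment where the start command is executed. It rests on an assumption I have not confirmed: that llama-swap launches `cmd` as a plain child process inside its own container, so the `docker` executable would need to be available there. `diagnose` is a hypothetical helper name, not part of llama-swap or Docker. It only reports whether the first word of the command resolves on PATH and, if it does, the exit code and stderr of a short run.

```python
import shlex
import shutil
import subprocess


def diagnose(cmd: str, timeout: float = 10.0) -> dict:
    """Run cmd without a shell and report what happened.

    Hypothetical helper for debugging only; it is not how llama-swap
    itself is implemented.
    """
    argv = shlex.split(cmd)

    # First question: does the executable exist in this environment at all?
    if shutil.which(argv[0]) is None:
        return {
            "found": False,
            "returncode": None,
            "stderr": f"{argv[0]}: not found on PATH",
        }

    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # A long-running server is expected to still be alive here.
        return {
            "found": True,
            "returncode": None,
            "stderr": "still running after timeout",
        }

    return {
        "found": True,
        "returncode": proc.returncode,
        "stderr": proc.stderr.strip(),
    }


if __name__ == "__main__":
    # Cheap probe: only checks that the docker CLI is present and runnable.
    print(diagnose("docker --version"))
```

If `found` comes back `False` inside the llama-swap container but `True` on the host, that would be consistent with the command working by hand and failing immediately under llama-swap, though it would not prove that this is the cause.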
