Docker-in-Docker fails: llama-swap can't access Docker socket after UID/GID update #374

@alhadebe

🐛 Describe the bug

After the recent commit f91a8b2, llama-swap can no longer launch model containers when running Docker-in-Docker (that is, with the host's /var/run/docker.sock mounted into the llama-swap container).

Error:

docker: permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock:
Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.51/images/create?fromImage=ghcr.io%2Fggml-org%2Fllama.cpp&tag=server-cuda":
dial unix /var/run/docker.sock: connect: permission denied

Everything worked correctly before this commit introduced the non-root (UID 10001) runtime user.


Expected behaviour

llama-swap should still be able to start model containers via the mounted Docker socket when running inside Docker (DinD mode), just as it did prior to the UID/GID change.


🧰 Operating system and version

  • OS: Ubuntu 24.04 LTS
  • GPU: NVIDIA RTX 3090
  • Docker: 27.1.1
  • NVIDIA Container Toolkit: 1.16.2
  • llama-swap image: ghcr.io/mostlygeek/llama-swap:cuda (latest post-f91a8b2)

⚙️ My Configuration

docker-compose.yml

services:
  llama-swap:
    container_name: llama-swap
    image: ghcr.io/mostlygeek/llama-swap:cuda
    network_mode: host
    restart: always
    volumes:
      - ./dockerdata/models:/models
      - ./dockerdata/llama-swap/config:/config
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker
    pull_policy: always
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

config.yaml (excerpt)

healthCheckTimeout: 300
logLevel: debug
startPort: 65001

models:
  "Qwen3-30B-A3B-Instruct-2507":
    cmd: |
      docker run --gpus=all --pull always --name Qwen3-30B-A3B-Instruct-2507
      --init --rm -p ${PORT}:8080 -v /home/al/dockerdata/models:/models ghcr.io/ggml-org/llama.cpp:server-cuda
      --model /models/Qwen_Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf
      --ctx-size 70000 --gpu-layers 999 --jinja
      --cache-type-k q8_0 --cache-type-v q8_0
      --temp 0.7 --top-p 0.8
    ttl: 300
    cmdStop: docker stop Qwen3-30B-A3B-Instruct-2507

📜 Proxy Logs

[error] dial unix /var/run/docker.sock: connect: permission denied
[debug] failed to start model container "Qwen3-30B-A3B-Instruct-2507"

📜 Upstream Logs

docker: permission denied while trying to connect to the Docker daemon socket...

🧾 Diagnostics

Inside the container:

# docker exec -it llama-swap ls -l /var/run/docker.sock
srw-rw---- 1 root 988 0 Nov  2 00:37 /var/run/docker.sock

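The host's docker group has GID 988, which is the group that owns the socket in the listing above.
The internal non-root user (UID 10001) is not a member of that group, and the socket mode (srw-rw----) grants no access to other users, so it cannot open the socket.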
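
A possible workaround, offered as an untested sketch rather than a confirmed fix: Compose's `group_add` option can give the container's user the host's docker GID as a supplementary group. The value 988 is taken from the `ls -l` output above and will differ on other hosts. Whether the image needs anything else after f91a8b2 has not been verified.

```yaml
services:
  llama-swap:
    image: ghcr.io/mostlygeek/llama-swap:cuda
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /usr/bin/docker:/usr/bin/docker
    # Assumption: 988 is the GID that owns /var/run/docker.sock on this host.
    # Check on the host with: stat -c '%g' /var/run/docker.sock
    group_add:
      - "988"
```

The remaining keys of the service (network_mode, the model and config volumes, the GPU reservation) would stay as in the docker-compose.yml shown earlier.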

