Description
This is what I get when running with CUDA:
meriem@Home:~/llama-gpt$ ./run.sh --model 7b --with-cuda
[+] Building 4.2s (30/30) FINISHED docker:default
=> [llama-gpt-ui internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 859B 0.0s
=> [llama-gpt-ui internal] load .dockerignore 0.0s
=> => transferring context: 82B 0.0s
=> [llama-gpt-api-cuda-ggml internal] load build definition from ggml.Dockerfile 0.0s
=> => transferring dockerfile: 958B 0.0s
=> [llama-gpt-api-cuda-ggml internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [llama-gpt-ui internal] load metadata for ghcr.io/ufoscout/docker-compose-wait:latest 2.7s
=> [llama-gpt-ui internal] load metadata for docker.io/library/node:19-alpine 4.1s
=> [llama-gpt-api-cuda-ggml internal] load metadata for docker.io/nvidia/cuda:12.1.1-devel-ubuntu22.04 3.9s
=> [llama-gpt-api-cuda-ggml 1/5] FROM docker.io/nvidia/cuda:12.1.1-devel-ubuntu22.04@sha256:7012e535a47883527d402da998384c30b936140c05e2537158c 0.0s
=> [llama-gpt-api-cuda-ggml internal] load build context 0.0s
=> => transferring context: 98B 0.0s
=> CACHED [llama-gpt-api-cuda-ggml 2/5] RUN apt-get update && apt-get upgrade -y && apt-get install -y git build-essential python3 pyth 0.0s
=> CACHED [llama-gpt-api-cuda-ggml 3/5] COPY . . 0.0s
=> CACHED [llama-gpt-api-cuda-ggml 4/5] RUN python3 -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starl 0.0s
=> CACHED [llama-gpt-api-cuda-ggml 5/5] RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.78 0.0s
=> [llama-gpt-api-cuda-ggml] exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:f5e9ee7b443f2271474e2b079c92971acc47b73f79fccd55f0df148e952f2190 0.0s
=> => naming to docker.io/library/llama-gpt-llama-gpt-api-cuda-ggml 0.0s
=> [llama-gpt-ui base 1/3] FROM docker.io/library/node:19-alpine@sha256:8ec543d4795e2e85af924a24f8acb039792ae9fe8a42ad5b4bf4c277ab34b62e 0.0s
=> [llama-gpt-ui] FROM ghcr.io/ufoscout/docker-compose-wait:latest@sha256:ee1b58447dcf9ae2aaf84e5904ffc00ed5a983bf986535b19aeb6f2d4a7ceb8a 0.0s
=> [llama-gpt-ui internal] load build context 0.0s
=> => transferring context: 13.73kB 0.0s
=> CACHED [llama-gpt-ui base 2/3] WORKDIR /app 0.0s
=> CACHED [llama-gpt-ui base 3/3] COPY package*.json ./ 0.0s
=> CACHED [llama-gpt-ui dependencies 1/1] RUN npm ci 0.0s
=> CACHED [llama-gpt-ui production 3/9] COPY --from=dependencies /app/node_modules ./node_modules 0.0s
=> CACHED [llama-gpt-ui build 1/2] COPY . . 0.0s
=> CACHED [llama-gpt-ui build 2/2] RUN npm run build 0.0s
=> CACHED [llama-gpt-ui production 4/9] COPY --from=build /app/.next ./.next 0.0s
=> CACHED [llama-gpt-ui production 5/9] COPY --from=build /app/public ./public 0.0s
=> CACHED [llama-gpt-ui production 6/9] COPY --from=build /app/package*.json ./ 0.0s
=> CACHED [llama-gpt-ui production 7/9] COPY --from=build /app/next.config.js ./next.config.js 0.0s
=> CACHED [llama-gpt-ui production 8/9] COPY --from=build /app/next-i18next.config.js ./next-i18next.config.js 0.0s
=> CACHED [llama-gpt-ui production 9/9] COPY --from=ghcr.io/ufoscout/docker-compose-wait:latest /wait /wait 0.0s
=> [llama-gpt-ui] exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:46253421e80f0ef70c85731eeb06ccd7c313254349a5bfa4c90136fab43f078e 0.0s
=> => naming to docker.io/library/llama-gpt-llama-gpt-ui 0.0s
WARN[0004] Found orphan containers ([llama-gpt-llama-gpt-api-1]) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
[+] Running 2/2
✔ Container llama-gpt-llama-gpt-ui-1 Recreated 0.1s
✔ Container llama-gpt-llama-gpt-api-cuda-ggml-1 Created 0.0s
Attaching to llama-gpt-llama-gpt-api-cuda-ggml-1, llama-gpt-llama-gpt-ui-1
llama-gpt-llama-gpt-ui-1 | [INFO wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] docker-compose-wait 2.12.1
llama-gpt-llama-gpt-ui-1 | [INFO wait] ---------------------------
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] Starting with configuration:
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Hosts to be waiting for: [llama-gpt-api-cuda-ggml:8000]
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Paths to be waiting for: []
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Timeout before failure: 3600 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - TCP connection timeout before retry: 5 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time between retries: 1 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] Checking availability of host [llama-gpt-api-cuda-ggml:8000]
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
Thanks for helping.
The file libnvidia-ml.so.1 seems to be present on the host:
meriem@Home:~/llama-gpt$ find /usr -name libnvidia-ml.so.1
/usr/lib/i386-linux-gnu/libnvidia-ml.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
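In case it helps, here is a minimal sketch of the checks I can run to see whether the NVIDIA Container Toolkit itself can reach the driver library, independent of llama-gpt (the nvidia/cuda image tag is just the base image the build already pulls; exact output will differ per setup):

# Check that the host's dynamic loader knows about the NVIDIA management library
ldconfig -p | grep libnvidia-ml

# Ask the NVIDIA container runtime to enumerate the driver and GPUs
nvidia-container-cli info

# Try a bare CUDA container with GPU access; if this fails with the same
# libnvidia-ml.so.1 error, the problem is in the container toolkit / driver
# setup rather than in llama-gpt itself
docker run --rm --gpus all nvidia/cuda:12.1.1-base-ubuntu22.04 nvidia-smi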