Description
LocalAI version:
v1.30.0 (latest).
Environment, CPU architecture, OS, and Version:
Windows Server 2022, Intel Xeon E5-2670 v2, NVIDIA GeForce GTX 1070.
Describe the bug
LocalAI runs inference on the CPU instead of the GPU; CUDA utilization stays at 0% when calling the chat completion endpoint.
To Reproduce
Expected behavior
Logs
Additional context
Configuration is in `.env`:
Docker Compose:
```yaml
version: '3.6'

services:
  api:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    image: quay.io/go-skynet/local-ai:master-cublas-cuda12
    tty: true # enable colorized logs
    restart: always # should this be on-failure ?
    ports:
      - 8080:8080
    env_file:
      - .env
    volumes:
      - ./models:/models
    command: ["/usr/bin/local-ai"]
```
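
For context: with the cublas image, GPU offloading also has to be enabled per model in the model definition YAML, otherwise layers stay on the CPU. A minimal sketch of such a file (the model name and file below are placeholders, not taken from my setup):

```yaml
# models/gpt-3.5-turbo.yaml — hypothetical model definition; name and file are placeholders
name: gpt-3.5-turbo
parameters:
  model: llama-2-7b-chat.Q4_K_M.gguf  # placeholder GGUF file under ./models
f16: true        # use 16-bit floats on the GPU
gpu_layers: 35   # layers to offload to the GPU; 0 keeps everything on the CPU
```

In my case I am not sure whether the missing `gpu_layers` setting explains the 0% CUDA usage, or whether the container simply does not see the GPU.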