
GPU VRAM detection #7909

Description

@mudler

#7891 and #7907 bring some obvious upsides, such as smaller images and a simplified UX (one image to rule them all), but they also come with some drawbacks.

One of them is that we can no longer rely on the GPU vendor binaries (such as rocm-smi, vulkaninfo, etc.) being present in the container image. This leaves the user with three options I can think of at the moment:

  • Pre-execute a LocalAI script on start to install the required tools, or build a custom LocalAI container image with the required tools included
  • Mount the binaries into the image manually from the host
  • Run LocalAI on the host, outside the container image (with the tools installed)
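
To illustrate the core problem, here is a minimal Go sketch (hypothetical, not LocalAI's actual detection code) that probes the PATH for the vendor tools and degrades gracefully when they are missing, which is roughly what any in-container detection has to do under the new single-image setup:

```go
package main

import (
	"fmt"
	"os/exec"
)

// vendorTools maps GPU vendors to the CLI binaries typically used
// to query VRAM. The mapping is illustrative.
var vendorTools = map[string]string{
	"nvidia": "nvidia-smi",
	"amd":    "rocm-smi",
	"vulkan": "vulkaninfo",
}

// detectAvailableTools returns the subset of vendor tools found on
// the PATH; in the slim image this will typically be empty unless
// the user has installed or mounted them.
func detectAvailableTools() map[string]string {
	found := map[string]string{}
	for vendor, tool := range vendorTools {
		if path, err := exec.LookPath(tool); err == nil {
			found[vendor] = path
		}
	}
	return found
}

func main() {
	tools := detectAvailableTools()
	if len(tools) == 0 {
		fmt.Println("no GPU vendor tools found: VRAM monitoring disabled")
		return
	}
	for vendor, path := range tools {
		fmt.Printf("%s: using %s for VRAM queries\n", vendor, path)
	}
}
```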

This issue is mainly intended as a discussion point on how to tackle this. There are trade-offs here between container image size, UX, and VRAM monitoring features. We can of course still build separate container images for each GPU vendor, which would minimize the impact, but it would also be nice to keep the strain on the CI low.
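
One direction worth discussing: for AMD GPUs the amdgpu kernel driver already exposes VRAM counters through sysfs, so at least that vendor could be covered without shipping rocm-smi at all. A minimal sketch, assuming the host's /sys is visible inside the container:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readVRAM reads a sysfs counter (in bytes) exposed by the amdgpu
// driver, e.g. mem_info_vram_total or mem_info_vram_used.
func readVRAM(card, file string) (uint64, error) {
	raw, err := os.ReadFile(filepath.Join("/sys/class/drm", card, "device", file))
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)
}

func main() {
	// Connectors like card0-DP-1 also match the glob, but they do not
	// expose the counters, so the error path below skips them.
	cards, _ := filepath.Glob("/sys/class/drm/card[0-9]*")
	for _, c := range cards {
		card := filepath.Base(c)
		total, err := readVRAM(card, "mem_info_vram_total")
		if err != nil {
			continue // not an amdgpu device, or counter not exposed
		}
		used, _ := readVRAM(card, "mem_info_vram_used")
		fmt.Printf("%s: %d/%d MiB VRAM used\n", card, used>>20, total>>20)
	}
}
```

NVIDIA and Vulkan would still need their own paths (e.g. NVML for NVIDIA), so this only chips away at one corner of the problem.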
