
GPU VRAM detection #7909

Description

@mudler

#7891 and #7907 bring some obvious upsides, such as smaller images and a simplified UX (one image to rule them all), but they also come with some drawbacks.

One of them is that we can no longer rely on the GPU vendor binaries (such as rocm-smi, vulkaninfo, etc.) being present in the container image. This leaves the user with three options I can think of at the moment:

  • Pre-execute a LocalAI script on start to install the required tools, or build a custom LocalAI container image with the required tools included
  • Mount the binaries into the image manually from the host
  • Run LocalAI on the host, outside the container image (with the tools installed)
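
To illustrate the core problem, here is a minimal Go sketch (hypothetical, not LocalAI's actual detection code) that probes the PATH for the vendor tools and degrades gracefully when they are missing, which is roughly what any in-container detection has to do under the new single-image setup:

```go
package main

import (
	"fmt"
	"os/exec"
)

// vendorTools maps GPU vendors to the CLI binaries typically used
// to query VRAM. The mapping is illustrative.
var vendorTools = map[string]string{
	"nvidia": "nvidia-smi",
	"amd":    "rocm-smi",
	"vulkan": "vulkaninfo",
}

// detectAvailableTools returns the subset of vendor tools found on
// the PATH; in the slim image this will typically be empty unless
// the user has installed or mounted them.
func detectAvailableTools() map[string]string {
	found := map[string]string{}
	for vendor, tool := range vendorTools {
		if path, err := exec.LookPath(tool); err == nil {
			found[vendor] = path
		}
	}
	return found
}

func main() {
	tools := detectAvailableTools()
	if len(tools) == 0 {
		fmt.Println("no GPU vendor tools found: VRAM monitoring disabled")
		return
	}
	for vendor, path := range tools {
		fmt.Printf("%s: using %s for VRAM queries\n", vendor, path)
	}
}
```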

This issue is mainly intended as a discussion point on how to tackle this. There are trade-offs here between container image size, UX, and VRAM monitoring features. We can of course still build separate container images for each GPU vendor, which would minimize the impact, but it would also be nice to keep the strain on the CI low.
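
One direction worth discussing: for AMD GPUs the amdgpu kernel driver already exposes VRAM counters through sysfs, so at least that vendor could be covered without shipping rocm-smi at all. A minimal sketch, assuming the host's /sys is visible inside the container:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// readVRAM reads a sysfs counter (in bytes) exposed by the amdgpu
// driver, e.g. mem_info_vram_total or mem_info_vram_used.
func readVRAM(card, file string) (uint64, error) {
	raw, err := os.ReadFile(filepath.Join("/sys/class/drm", card, "device", file))
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)
}

func main() {
	// Connectors like card0-DP-1 also match the glob, but they do not
	// expose the counters, so the error path below skips them.
	cards, _ := filepath.Glob("/sys/class/drm/card[0-9]*")
	for _, c := range cards {
		card := filepath.Base(c)
		total, err := readVRAM(card, "mem_info_vram_total")
		if err != nil {
			continue // not an amdgpu device, or counter not exposed
		}
		used, _ := readVRAM(card, "mem_info_vram_used")
		fmt.Printf("%s: %d/%d MiB VRAM used\n", card, used>>20, total>>20)
	}
}
```

NVIDIA and Vulkan would still need their own paths (e.g. NVML for NVIDIA), so this only chips away at one corner of the problem.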
