llama.cpp supports partially offloading model layers to the CPU, so those layers sit in system RAM instead of VRAM. It would be great if LocalAI showed RAM usage alongside VRAM usage.
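
For reference, a minimal sketch (in Go, since LocalAI is written in Go) of one way the RAM side of this could be gathered: reading a process's resident set size (`VmRSS`) from `/proc/<pid>/status` on Linux. The function name and the idea of polling a backend process's pid are assumptions for illustration, not LocalAI's actual API; the VRAM figure would still come from whatever GPU tooling the existing VRAM display already uses.

```go
// Sketch only: report resident memory (RAM) for a process on Linux by
// parsing the VmRSS line of /proc/<pid>/status. Which pid to poll (e.g.
// the llama.cpp backend process) is an assumption, not LocalAI's API.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// residentMemoryKB returns the VmRSS value (in kB) for the given pid.
func residentMemoryKB(pid int) (int64, error) {
	f, err := os.Open(fmt.Sprintf("/proc/%d/status", pid))
	if err != nil {
		return 0, err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "VmRSS:") {
			// Line looks like: "VmRSS:   123456 kB"
			fields := strings.Fields(line)
			if len(fields) >= 2 {
				return strconv.ParseInt(fields[1], 10, 64)
			}
		}
	}
	return 0, fmt.Errorf("VmRSS not found for pid %d", pid)
}

func main() {
	// Example: report this process's own RAM usage.
	kb, err := residentMemoryKB(os.Getpid())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("RAM (RSS): %.1f MiB\n", float64(kb)/1024.0)
}
```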