Platform
Linux/Arch
Lemonade Version
10.8.0
GPU / APU Model
AMD RX 9070XT
Component
stable-diffusion.cpp
Bug Description
On an AMD RX 9070XT, I have only 16 GB of VRAM to load Stable Diffusion models. There are many different quants of Stable Diffusion models available, some of which fit and run well on the 9070XT.
But Lemonade ships only unquantized models and provides no way to install quantized ones. That makes the Stable Diffusion support in Lemonade basically useless to me!
z-image-turbo, for example, is 20.70 GB. That is overkill, considering I can easily run it in 16 GB of VRAM using these files:
https://huggingface.co/leejet/Z-Image-Turbo-GGUF
https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-GGUF
https://huggingface.co/diffusers/FLUX.1-vae/tree/main
This is a split setup: a separate diffusion-model GGUF, text encoder, and VAE. It works in sd.cpp (confirmed by running it in koboldcpp, which also has sd.cpp built in), so Lemonade just needs to add support for it.
There are also single-file quants of Z-Image-Turbo available (if I understand correctly): https://huggingface.co/unsloth/Z-Image-Turbo-GGUF
To resolve this, please provide quantized versions of all stable-diffusion.cpp models in the models menu (in addition to the unquantized ones), as well as the ability to download Stable Diffusion models from Hugging Face (see #2327).
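For reference, here is a rough sketch of how the three files from the split setup could map onto a single sd.cpp invocation. The flag names (`--diffusion-model`, `--llm`, `--vae`, `-p`), the `sd` binary name, and all file names are assumptions based on my understanding of the sd.cpp CLI; they may differ between versions, and this is not how Lemonade launches sd.cpp today.

```python
import shlex
from typing import List


def build_sd_command(diffusion_model: str, text_encoder: str, vae: str,
                     prompt: str, binary: str = "sd") -> List[str]:
    """Assemble an argv list for a split-model sd.cpp run.

    The flag names are assumptions about the sd.cpp CLI, not a confirmed
    description of any particular release.
    """
    return [
        binary,
        "--diffusion-model", diffusion_model,  # quantized diffusion weights (GGUF)
        "--llm", text_encoder,                 # text encoder (GGUF)
        "--vae", vae,                          # VAE weights
        "-p", prompt,
    ]


# Hypothetical local file names; the real names depend on the quant downloaded.
cmd = build_sd_command(
    diffusion_model="z_image_turbo-Q4_K.gguf",
    text_encoder="Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
    vae="ae.safetensors",
    prompt="a lemon on a table",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The point is only that a model entry would need three file paths instead of one; how Lemonade stores or passes them is up to the maintainers.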
Steps to Reproduce
- Open the desktop app
- Expand the StableDiffusion.cpp section in the models sidebar
- Stare in horror at the file sizes (only unquantized models are listed)
Expected vs Actual Behavior
Expected: Quantized models that fit in VRAM and run fast on local hardware
Actual: Only unquantized models, which are far too big to fit on most local hardware, and no option to download quantized models for use with stable-diffusion.cpp
Log Output
Additional Context
No response