
Conversation

@stduhpf (Contributor) commented Jan 9, 2026

The main goal of this PR is to improve the user experience in multi-GPU setups by allowing the user to choose which model component gets sent to which device.

CLI changes:

  • Add the --main-backend-device [device_name] argument to set the default backend device.
  • Remove the --clip-on-cpu, --vae-on-cpu and --control-net-cpu arguments.
  • Replace them respectively with the new --clip-backend-device [device_name], --vae-backend-device [device_name] and --control-net-backend-device [device_name] arguments.
  • Add the --diffusion-backend-device argument (controls the device used for the diffusion/flow models) and the --tae-backend-device argument.
  • Add the --list-devices argument to print the list of available ggml devices and exit.
  • Add the --rpc argument to connect to a compatible GGML RPC server.

C API changes (stable-diffusion.h):

  • Change the contents of the sd_ctx_params_t struct.
  • Add void list_backends_to_buffer(char* buffer, size_t buffer_size) to write the details of the available devices to a null-terminated char array. Devices are separated by newline characters (\n), and the name and description of each device are separated by a \t character.
  • Add size_t backend_list_size() to get the size of the buffer needed by list_backends_to_buffer (see the usage sketch after this list).
  • Add void add_rpc_device(const char* address) to connect to a ggml RPC backend (from llama.cpp).
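
For reference, here is a rough sketch of how a third-party caller might consume the device list and (optionally) register an RPC server. This is untested illustration code, not part of the PR; the address passed to add_rpc_device is just a placeholder:

```cpp
#include <cstdio>
#include <cstring>
#include <vector>

#include "stable-diffusion.h"

// Print the available ggml devices, one "name<TAB>description" pair per line.
static void print_available_devices() {
    size_t size = backend_list_size();  // size of the buffer we need to allocate
    if (size == 0) {
        return;
    }
    std::vector<char> list(size);
    list_backends_to_buffer(list.data(), list.size());

    // Devices are separated by '\n'; name and description by '\t'.
    for (char* line = std::strtok(list.data(), "\n"); line != nullptr; line = std::strtok(nullptr, "\n")) {
        char* desc = std::strchr(line, '\t');
        if (desc != nullptr) {
            *desc++ = '\0';
        }
        std::printf("%-16s %s\n", line, desc != nullptr ? desc : "");
    }
}

int main() {
    // Optionally register a remote ggml RPC server before creating the sd context.
    // The "host:port" format is an assumption based on llama.cpp's rpc-server.
    // add_rpc_device("192.168.1.42:50052");

    print_available_devices();
    return 0;
}
```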

The default device selection should now consistently prioritize discrete GPUs over iGPUs.

For example, if you want to run the text encoders on the CPU, you now need to use --clip-backend-device CPU instead of --clip-on-cpu.

TODOs:

  • Different devices for different text encoders? (for models like SDXL / SD3.x / Flux.1)
  • Device for upscaler, photomaker and Vision models

Important: to use RPC, you need to add -DGGML_RPC=ON to the build. Additionally, it requires either building sd.cpp with the -DSD_USE_SYSTEM_GGML flag, or building the RPC server with -DCMAKE_C_FLAGS="-DGGML_MAX_NAME=128" -DCMAKE_CXX_FLAGS="-DGGML_MAX_NAME=128" (the default is 64), so that sd.cpp and the RPC server agree on GGML_MAX_NAME.

Fixes #1116

@wbruna (Contributor) commented Jan 9, 2026

Maybe the backend #if tests on model.cpp, upscaler.cpp, etc. should be changed to runtime tests, too?

Also: how hard would it be to support more than one backend with the same sd.cpp binaries - Vulkan and CUDA, for instance?

@stduhpf (Contributor, Author) commented Jan 9, 2026

> Maybe the backend #if tests on model.cpp, upscaler.cpp, etc. should be changed to runtime tests, too?

Good point.

> Also: how hard would it be to support more than one backend with the same sd.cpp binaries - Vulkan and CUDA, for instance?

I think removing those #if tests and figuring out a way to build GGML with multiple backends should be enough?

Edit: Actually I'm not sure if the #if tests in model.cpp are necessary at all. I could still build with Vulkan enabled when removing those.

@wbruna (Contributor) commented Jan 9, 2026

> Edit: Actually I'm not sure if the #if tests in model.cpp are necessary at all. I could still build with Vulkan enabled when removing those.

I believe it's leftover code. The SD_USE_FLASH_ATTENTION one in ggml_extend.h seems obsolete, too.

common.hpp (the one at the top), qwen_image.hpp and z_image.hpp are trickier: they test for Vulkan because of precision issues. z_image.hpp also has a #if GGML_USE_HIP check for the same reason (this one is my fault 🙂).

@stduhpf (Contributor, Author) commented Jan 9, 2026

I'm pretty sure ggml has runtime checks for the backend type. It would probably be better to use that instead.
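
Something along these lines could probably replace the compile-time checks (untested sketch; the exact backend names depend on the ggml version, so treat "Vulkan" here as an example):

```cpp
#include <cstring>

#include "ggml-backend.h"

// Untested sketch: check the backend kind at runtime instead of with #ifdef.
// ggml_backend_name() returns names like "CPU", "CUDA0" or "Vulkan0", so a
// prefix comparison should be enough to gate the precision workarounds.
static bool backend_name_starts_with(ggml_backend_t backend, const char* prefix) {
    const char* name = ggml_backend_name(backend);
    return name != nullptr && std::strncmp(name, prefix, std::strlen(prefix)) == 0;
}

// e.g. instead of `#if defined(GGML_USE_VULKAN)`:
//     if (backend_name_starts_with(backend, "Vulkan")) { /* fp32 fallback path */ }
```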

@CarlGao4 (Contributor) commented

So sd.cpp actually supports multi-backend builds? Like SYCL+CUDA at the same time?

@stduhpf (Contributor, Author) commented Jan 10, 2026

@CarlGao4 I'm not sure. I never successfully managed to build sd.cpp with multiple backends, but ggml should be able to handle that. I got it to build with both Vulkan and RPC, but it failed to send data to the RPC server, so I don't know if it would work with other backends (I had to add a way to connect to the RPC server via the CLI).

Edit: actually, RPC works if GGML_MAX_NAME is set to the same value for both sd-cli and rpc-server.

@leejet (Owner) commented Jan 11, 2026

For third-party callers, support for multiple different backends can be achieved simply by switching to the DLL/SO built for the desired backend.

@wbruna (Contributor) commented Jan 11, 2026

--list-devices is working on Linux, both with Vulkan and ROCm. But I'm getting a bit of garbage at the end of the Vulkan output:

[screenshots: --list-devices output with Vulkan and ROCm]

@stduhpf (Contributor, Author) commented Jan 12, 2026

@wbruna I only managed to reproduce the garbage at the end once in my many tests. I'm not sure what's going on there.

@stduhpf (Contributor, Author) commented Jan 12, 2026

I've just realized I accidentally included my RPC-related changes in the last commit. Since they're somewhat related, should I leave them in, or should I keep them for a follow-up PR?
