Description
I've been looking through the issue trackers in the repos and checking the documentation to better understand the basics of dual GPU support. Being new to the AI space and the tooling commonly used in it, I'm still unclear on exactly how I can best leverage a dual GPU setup.
I'll explain a couple of the points that I, as someone new to this space and LM Studio, was unclear on:
- Will a single prompt/request to the model be sped up by the presence of 2+ GPUs, or do the extra GPUs only help with handling more than one prompt/request concurrently? I'm not looking to go parallel; I'm looking to go faster.
- I'm assuming that a dual GPU setup treats the VRAM as 'pooled', so I would be able to load larger models, if I understand correctly.
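To make the second point concrete, this is the back-of-the-envelope math I have in mind. It is only a rough sketch: it assumes VRAM really does pool across cards (which is exactly what I'm asking), it counts model weights only, and the 4 GiB of headroom for KV cache and runtime overhead is a guess on my part. The function names are made up for illustration and are not part of LM Studio.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib_per_gpu: float, gpu_count: int,
         headroom_gib: float = 4.0) -> bool:
    """True if the weights plus a guessed headroom fit in the combined VRAM.

    Assumes VRAM is simply additive across GPUs, which may not hold exactly.
    """
    pooled_gib = vram_gib_per_gpu * gpu_count
    return weights_gib(params_billion, bits_per_weight) + headroom_gib <= pooled_gib


if __name__ == "__main__":
    # A 70B-parameter model at 4 bits per weight, on 32 GB cards
    # (treating 32 GB as roughly 32 GiB for simplicity).
    print(round(weights_gib(70, 4), 1))  # weights-only size in GiB
    print(fits(70, 4, 32, 1))            # one card
    print(fits(70, 4, 32, 2))            # two cards, if VRAM pools
```

If that assumption is right, a model of this size would not fit on one 5090 but would fit comfortably across two, which is the kind of behavior I'd like to see confirmed or corrected.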
For context, my use case is running LM Studio as the backend for the Continue extension in VS Code, so I'm looking to improve the developer experience with faster responses and support for larger models. I don't really need to handle multiple requests concurrently in this case, so I'm wondering whether it's even worth chasing a dual-GPU setup.
In my enthusiasm at starting down this AI path, I was just about to pony up the cash for a second 5090 to experiment with. Then I realized (I've been out of the hardware game for a while) that there's no such thing as SLI anymore, and that Windows won't really use both GPUs as if they were one to speed things up. A second GPU is really only useful in applications that are inherently aware of multi-GPU setups and natively know how to split their workloads across them or process in parallel: rendering engines, AI tooling, simulations, etc.
What I would like to have seen is a little more detail about the actual behavior I could expect, not just a statement that multi-GPU is supported. What can someone expect when using a multi-GPU setup with LM Studio in particular?
Sorry if this isn't the right place to ask these questions, but I figured it might be worth documenting this or pointing folks to further resources that could answer these types of questions.