
Dynamic loading #11

Open · romankoblov opened this issue Jan 8, 2019 · 5 comments

romankoblov commented Jan 8, 2019

Hello! Do you have plans to add support for dynamically loading the CUDA library at runtime? Something like rust-dlopen, for example.

With the current approach, this crate will crash if the CUDA libraries are not found, instead of raising an error to the user (which would make it possible to fall back to CPU/OpenCL code).

Also, with dynamic loading it would be possible to find the CUDA libraries at runtime, so it would no longer be necessary to do
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib
on macOS (and probably LD_LIBRARY_PATH on Linux).
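A minimal sketch of the idea, assuming the libloading crate (rust-dlopen would look similar); the library path, symbol choice, and error handling here are illustrative, not cuda-sys API:

```rust
use libloading::{Library, Symbol};

// Try to load the CUDA driver at runtime. A missing library becomes an
// ordinary error instead of crashing the process at startup.
fn try_cuda_init() -> Result<(), Box<dyn std::error::Error>> {
    unsafe {
        // "libcuda.so" is illustrative; on macOS it would be libcuda.dylib.
        let lib = Library::new("libcuda.so")?;
        let cu_init: Symbol<unsafe extern "C" fn(u32) -> i32> =
            lib.get(b"cuInit\0")?;
        cu_init(0); // cuInit(Flags) is the real driver entry point
    }
    Ok(())
}

fn main() {
    match try_cuda_init() {
        Ok(()) => println!("CUDA backend available"),
        // No crash: the application can fall back to a CPU/OpenCL path.
        Err(e) => println!("CUDA not found, falling back: {}", e),
    }
}
```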

LutzCle (Contributor) commented Jan 9, 2019

Dynamic loading by itself won't help, because some CUDA versions break binary compatibility (see termoshtt/accel#58 and the follow-ups in #4 and #12); dynamic loading, after all, requires binary compatibility.

Do you have any pointers to other projects that implement dynamic loading for CUDA?

romankoblov (Author) commented

There is ongoing work in TensorFlow, for example.
ArrayFire takes a somewhat different approach: it dynamically loads backend libraries (which are linked against CUDA), but for Rust applications this would reduce safety. It is also better to do this once than in every app that needs optional CUDA support.

In terms of version compatibility, I don't see much difference here: you can still check the version of the library after dynamic loading and provide the correct API for it.
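As a hedged sketch of that check, again assuming libloading: cuDriverGetVersion is a real driver entry point that reports the version as 1000 * major + 10 * minor, so the caller can pick bindings that match.

```rust
use libloading::{Library, Symbol};

// Query the driver version after dlopen-ing it, so the appropriate API
// surface can be selected at runtime. Error handling is simplified.
fn driver_version(lib: &Library) -> Result<i32, Box<dyn std::error::Error>> {
    let mut version: i32 = 0;
    unsafe {
        let get_version: Symbol<unsafe extern "C" fn(*mut i32) -> i32> =
            lib.get(b"cuDriverGetVersion\0")?;
        get_version(&mut version);
    }
    Ok(version) // e.g. 10010 for CUDA 10.1
}
```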

LutzCle (Contributor) commented Jan 9, 2019

Sorry, I misunderstood your first post. This is orthogonal to the version incompatibility issue.

Looking through tensorflow/tensorflow@f092c9d, what they're doing is creating a shim for each and every CUDA function. This makes it possible to switch between CUDA and the shim at runtime.

I'm not so sure that cuda-sys is the right place for this kind of trick, as it introduces new trade-offs, and the purpose of this crate is to provide bare-bones CUDA bindings. Off the top of my head:

  • Additional runtime overhead, because we would need a branch to test the is-CUDA-present? condition on each function call (see the sketch after this list).
  • Code complexity, because the shims need to be written and maintained.
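To make the per-call branch concrete, here is a minimal sketch of such a shim; the names, error code handling, and installation function are illustrative, not cuda-sys or TensorFlow code:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

type CuInitFn = unsafe extern "C" fn(u32) -> i32;

// 0 means "driver not loaded"; otherwise holds the real cuInit pointer.
static REAL_CU_INIT: AtomicUsize = AtomicUsize::new(0);

const CUDA_ERROR_NO_DEVICE: i32 = 100; // real CUresult value

// Called once at startup if the driver was found (illustrative).
pub fn install_cu_init(f: CuInitFn) {
    REAL_CU_INIT.store(f as usize, Ordering::Release);
}

// The shim every caller goes through: note the branch on each call.
pub unsafe extern "C" fn cu_init_shim(flags: u32) -> i32 {
    match REAL_CU_INIT.load(Ordering::Acquire) {
        0 => CUDA_ERROR_NO_DEVICE, // CUDA absent: fail gracefully
        ptr => {
            let real: CuInitFn = std::mem::transmute(ptr);
            real(flags)
        }
    }
}
```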

Perhaps someone with more experience than me could pitch in?

AndrewGaspar (Contributor) commented
I agree that this seems out of scope for cuda-sys.

What you could do, if you'd like to take the shim approach, is create a separate shim library that, for all intents and purposes, looks like the CUDA library, but performs the necessary indirection described here. Once #4 is merged, cuda-sys should be able to pick it up automatically and use it, as long as you set CUDA_LIBRARY_PATH correctly.
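A hedged sketch of that idea: the separate crate would be built as a cdylib whose output name matches the real driver library and which exports the same C symbols. Everything below is an assumption about how such a shim could look, not existing cuda-sys machinery:

```rust
// In the shim crate's Cargo.toml (illustrative):
//   [lib]
//   name = "cuda"              // produces libcuda.so, mimicking the driver
//   crate-type = ["cdylib"]
//
// The resulting library can then be located via CUDA_LIBRARY_PATH.

const CUDA_ERROR_NO_DEVICE: i32 = 100; // real CUresult value

#[allow(non_snake_case)]
#[no_mangle]
pub unsafe extern "C" fn cuInit(_flags: u32) -> i32 {
    // A real shim would dlopen the actual driver here and forward the
    // call, returning an error code only when the driver is missing.
    CUDA_ERROR_NO_DEVICE
}
```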

romankoblov (Author) commented

It can also make incompatibility issues easier to handle, since you can choose the function signature at runtime.
There are examples of sys crates with dynamic loading, clang-sys among them.

As you can see in those crates, static pointers to the library functions are set once, so there is no need for a branch on each function call.
The code will be more complex than "just use bindgen", but judging from the clang-sys example, there is not much difference between raw bindings and runtime loading.
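A minimal sketch of that pattern, assuming libloading and std's OnceLock; the library is loaded at most once, and later calls go through a plain function pointer with no per-call presence check:

```rust
use libloading::Library;
use std::sync::OnceLock;

struct Driver {
    _lib: Library, // kept alive so the loaded symbols stay valid
    cu_init: unsafe extern "C" fn(u32) -> i32,
}

static DRIVER: OnceLock<Option<Driver>> = OnceLock::new();

// Load libcuda once; afterwards cu_init is an ordinary function pointer.
fn driver() -> Option<&'static Driver> {
    DRIVER
        .get_or_init(|| unsafe {
            let lib = Library::new("libcuda.so").ok()?;
            let cu_init = *lib
                .get::<unsafe extern "C" fn(u32) -> i32>(b"cuInit\0")
                .ok()?;
            Some(Driver { _lib: lib, cu_init })
        })
        .as_ref()
}
```

Note that keeping the Library value alive alongside the pointers matters: dropping it would unload the shared object and leave the function pointers dangling.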
