
Re: Auto-generate CUDA bindings using Bindgen #4

Closed
wants to merge 10 commits

Conversation

termoshtt
Member

@termoshtt termoshtt commented Nov 26, 2018

Imported from termoshtt/accel#58 by @LutzCle

@termoshtt termoshtt changed the title Re: Auto-generate CUDA bindings using Bindgen #58 Re: Auto-generate CUDA bindings using Bindgen Nov 26, 2018
@termoshtt
Member Author

CI shows it works with CUDA >= 9.0 on Ubuntu 14.04/16.04/18.04 and CentOS 6/7. The errors in the other cases vary.

@termoshtt
Member Author

@bheisler

Combines driver, runtime and cublas in a single crate. cublas should definitely be a separate crate, and arguably driver & runtime as well.
rust-cuda/wg#1 (comment)

This comment is closely related to this PR. Could you review it?

@termoshtt termoshtt added this to the 0.3.0 milestone Nov 28, 2018
}
panic!("CUDA cannot find");
}
Contributor

On Windows, I normally have to set my CUDA_LIBRARY_PATH to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\lib\x64. It would be nice if it could find that for me. I can send a pull request later adding this feature though.

A better error message would be:

cuda-sys cannot find the CUDA libraries. Please ensure that they are installed, or provide the path using the CUDA_LIBRARY_PATH environment variable. Searched:

  • /usr/local/cuda
  • /opt/cuda
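A build-script sketch of the suggested behavior (the candidate paths and the CUDA_LIBRARY_PATH variable come from the comments above; the helper name is hypothetical):

```rust
use std::env;
use std::path::PathBuf;

/// Candidate CUDA roots, with CUDA_LIBRARY_PATH taking priority.
/// (Hypothetical helper; the paths are the ones discussed above.)
fn cuda_candidates() -> Vec<PathBuf> {
    let mut roots = Vec::new();
    if let Ok(p) = env::var("CUDA_LIBRARY_PATH") {
        roots.push(PathBuf::from(p));
    }
    roots.push(PathBuf::from("/usr/local/cuda"));
    roots.push(PathBuf::from("/opt/cuda"));
    // Windows default install location mentioned in the comment:
    roots.push(PathBuf::from(
        r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\lib\x64",
    ));
    roots
}

fn main() {
    match cuda_candidates().into_iter().find(|p| p.exists()) {
        Some(root) => println!("cargo:rustc-link-search=native={}", root.display()),
        // A real build.rs would `panic!` here with the message suggested above.
        None => eprintln!(
            "cuda-sys cannot find the CUDA libraries. Please ensure that they are \
             installed, or provide the path using the CUDA_LIBRARY_PATH environment \
             variable. Searched:\n  - /usr/local/cuda\n  - /opt/cuda"
        ),
    }
}
```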

Contributor

Oh, hey, I addressed your issue with Windows in #6

.generate()
.expect("Unable to generate CUComplex bindings")
.write_to_file(out_path.join("cucomplex_bindings.rs"))
.expect("Unable to write CUComplex bindings");
Contributor

It seems like this generates bindings for CUComplex and CUBLAS, but only links to CUBLAS. Was that intended? It seems like it should either link to CUComplex as well, or it should not generate bindings for it.

I'd prefer to have separate cudart-sys, cublas-sys and cucomplex-sys crates, and have cuda-sys only include the driver API, if that's possible.

@kngwyu kngwyu Nov 29, 2018

It looks like only

pub type cuFloatComplex = float2;
pub type cuDoubleComplex = double2;
pub type cuComplex = cuFloatComplex;

are generated from CUComplex.h

Contributor

@bheisler Thanks for the review!

I can add one more problem to the list. cublas doesn't compile using CUDA 8. It's missing the __half and __half2 types. Unfortunately, I couldn't get Bindgen to detect those types in cuda_fp16.h. As a work-around, I've added them manually in LutzCle/accel@74a82a4.
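A sketch of that kind of work-around (hand-written stand-ins for the FP16 types; the field layout is an assumption based on cuda_fp16.h, where __half wraps one 16-bit value and __half2 wraps two — the actual commit may differ):

```rust
// Manual definitions for the FP16 types that bindgen misses on CUDA 8.
// Layout assumed: __half is 2 bytes, __half2 is 4 bytes, matching cuda_fp16.h.
#[allow(non_camel_case_types)]
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct __half {
    pub x: u16,
}

#[allow(non_camel_case_types)]
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct __half2 {
    pub x: __half,
    pub y: __half,
}

fn main() {
    // Sanity check that the layouts have the expected sizes.
    println!(
        "__half: {} bytes, __half2: {} bytes",
        std::mem::size_of::<__half>(),
        std::mem::size_of::<__half2>()
    );
}
```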

Member Author

I've created an issue to track the *-sys crate separation: #9

@bheisler
Contributor

bheisler commented Nov 28, 2018

This might not be related to this pull request, but I'd like to have a system of Cargo features for CUDA versions like clang-sys has. That way, downstream crates like accel or RustaCUDA could provide their own CUDA version features that depend on the ones in cuda-sys.
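A Cargo.toml sketch of that idea, in the clang-sys style where each version feature implies the ones below it (the feature names here are illustrative, not an agreed scheme):

```toml
# Cargo.toml for cuda-sys (sketch). Enabling a newer version feature
# transitively enables all older ones, so downstream crates can write
# `cuda-sys = { version = "...", features = ["cuda_10_0"] }`.
[features]
cuda_8_0 = []
cuda_9_0 = ["cuda_8_0"]
cuda_9_2 = ["cuda_9_0"]
cuda_10_0 = ["cuda_9_2"]
```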

@DiamondLovesYou

I've created a fork of this PR which adds the ability to do two things:

  • Build the crate without having CUDA installed.
  • Detect the presence of and dynamically load the CUDA driver, runtime, and blas libraries.

The source is here. I've tested this on Windows, which is the only "machine" I have with an Nvidia GPU, sadly, so I am not sure whether it works correctly on Linux/Mac.

@kngwyu

kngwyu commented Nov 30, 2018

What we're dealing with here is the version compatibility problem, as @LutzCle mentions in termoshtt/accel#58.
We should do just one thing per PR. Whether to use libloading is another problem, and should be discussed in another thread.
To resolve version compatibility, using feature flags like cuda10 is also a possible solution, as @bheisler says.
Ideally, I think we should take the feature-flag approach. It has two advantages:

  • Users and tools (like RLS and Racer) can easily find the functions
  • It makes it possible to compile without CUDA installed

But I don't know how many functions/structs are incompatible, so I also don't know how much effort is required.
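A minimal sketch of the second advantage: gating version-specific items behind a feature flag so the crate compiles even without CUDA installed. The `cuda10` flag name comes from the comment above; the items themselves are illustrative, not the real API:

```rust
// Version-specific bindings would be gated like this; when the feature is
// off, the item simply does not exist, so nothing links against CUDA 10.
#[cfg(feature = "cuda10")]
pub fn cuda10_only_function() {
    // Bindings that exist only in CUDA >= 10 would live here.
}

/// Report which API level this build was compiled against.
pub fn api_level() -> &'static str {
    if cfg!(feature = "cuda10") { "cuda10" } else { "cuda9" }
}

fn main() {
    // With no features enabled, this falls back to the older API level.
    println!("compiled for {}", api_level());
}
```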

@peterhj

peterhj commented Dec 4, 2018

@termoshtt @bheisler @kngwyu I just made a PR (#5) that adds CUDA version features in the same style as clang-sys. I added just the feature flags themselves; I don't do anything with them yet (that can be saved for another set of commits/PRs). Let me know what you think.

@termoshtt
Member Author

Sorry for the late reply 😢

I've created a fork of this PR which adds the ability to do two things:

Thanks @DiamondLovesYou. But that should be a separate PR, after this PR is rejected.

Generating bindings

We have two choices for generating CUDA bindings:

  • statically in git repo
  • dynamically at compile time

This PR drops the static bindings, which were generated by bindgen plus my manual fixes based on CUDA 8.0. Even if we choose static binding generation, this PR provides a script to generate bindings and is worth merging with some fixes.

Build the crate without having CUDA installed.

I think this is not a critical problem, since in both cases we link the final executable without the CUDA shared libraries. The CUDA/C headers and the CUDA runtime are basically distributed in the same way, and we can set up a CUDA environment easily using the nvidia/cuda container.

Multiple CUDA version support

Multi-version support can be done with both static/dynamic binding generation by different procedures.

For static binding generation, we generate multiple binding files ahead of time and switch which one is used via feature flags, as @peterhj did in #5. In this case we need to manage the auto-generated Rust binding code in the repository.
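A sketch of how the static variant could select among pre-generated binding files at compile time (the feature name and module contents are illustrative; in practice each module body would be a checked-in bindgen output):

```rust
// Each branch stands in for a pre-generated bindings file checked into the
// repo (e.g. produced by bindgen for one CUDA version); feature flags pick
// which one is compiled.
#[cfg(feature = "cuda10")]
mod bindings {
    pub const CUDA_VERSION: u32 = 10000;
    // ... CUDA 10 bindgen output would go here ...
}

#[cfg(not(feature = "cuda10"))]
mod bindings {
    pub const CUDA_VERSION: u32 = 9000; // default: CUDA 9 bindings
    // ... CUDA 9 bindgen output would go here ...
}

pub use bindings::CUDA_VERSION;

fn main() {
    println!("using bindings for CUDA_VERSION = {}", CUDA_VERSION);
}
```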

For dynamic binding generation, we can ensure via CI that cuda-sys generates valid CUDA bindings for various CUDA versions, as this PR does (for >= 9.0). Then we would not keep generated binding code in the repository.

Users and tools(like rls and racer) can easily find the functions

This is a definite disadvantage of dynamic binding generation.

@termoshtt
Member Author

Thanks for the reviews. I will reject this PR and split it into several parts:

  • Auto-generation of CUDA bindings as another crate (for static binding generation)
  • Update to CUDA10-based binding

and then we will consider splitting the *-sys crates (#9).

@termoshtt termoshtt closed this Jan 9, 2019
@LutzCle LutzCle mentioned this pull request Jan 9, 2019