
Let FSL build system handle CUDA configuration#3

Open
pauldmccarthy wants to merge 4 commits into SPMIC-UoN:master from pauldmccarthy:mnt/build

Conversation

@pauldmccarthy

Hi @mcraig-ibme, I hope you're well! This PR contains some adjustments to the Makefile so that the FSL build system handles most of the CUDA compilation setup. This ensures that CUDIMOT binaries are compiled in the same manner as the other FSL CUDA tools, with:

  • Flags for compiling PTX code, which ensures forward compatibility with future GPU architectures
  • Static linking of the CUDA toolkit, so end users don't need to install the CUDA toolkit themselves, set LD_LIBRARY_PATH, or module load anything.

We have been compiling eddy, xfibres_gpu, probtrackx2_gpu and mmorf with this system since late 2022 (since the release of FSL 6.0.6), and have essentially eliminated all complaints about CUDA compatibility.

With the new system, the only requirement is that nvcc is on the $PATH. A couple of other environment variables can be set to control compilation:

  • CUDA_STATIC: By default, executables will be compiled so as to dynamically link against an installed CUDA toolkit, but you can request static linking by setting a CUDA_STATIC=1 environment variable. I will probably make this the default behaviour at some point, as we've found it nearly always preferable to dynamic linking, even during local development (the only downside is larger executable size).
  • GENCODEFLAGS: By default, a "fat binary" will be compiled, with support for a wide range of GPU architectures, and PTX for forward compatibility. This can cause compilation to be very slow, which can be a pain when working locally, so it is possible to override this and compile for a single architecture by setting (e.g.) GENCODEFLAGS="-gencode arch=compute_80,code=sm_80"
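As a concrete sketch of the above (the model name and architecture are just examples, and this obviously won't run without an FSL checkout and nvcc on the PATH):

```shell
# Statically link the CUDA toolkit into the executables
export CUDA_STATIC=1
# Compile for a single architecture only (much faster than a fat binary)
export GENCODEFLAGS="-gencode arch=compute_80,code=sm_80"
# Build one CUDIMOT model
modelname=NODDI_Watson make
```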

I've made a couple of other cosmetic changes to the Makefile:

  • I've removed the FSL_GE_606 logic from the Makefile, as it does not appear to be required now that there is a separate Makefile.605.

  • I've also tweaked the compilation process a bit, removing the intermediate link_cudimot_gpu.o file - this seems to be unnecessary, and the compilation commands were actually over-linking, by including both link_cudimot_gpu.o and ${CUDIMOT_CUDA_OBJS}.

Thanks!

@captainnova

Is there any news on this?

I would like to compile a local version to debug #4, but keep getting stuck on a c++17 issue.

I have

CUDA=/usr/local/cuda-12.1

# A freshly downloaded 6.0.7.18
FSLDIR=/mnt/isilon_lts/m101013/test/fsl/fsl

FSLDEVDIR=$FSLDIR
FSLCONFDIR=${FSLDIR}/config
gcc/g++ version 8.5.0

The base SPMIC Makefile exits with 8 errors like:
/mnt/isilon_lts/m101013/test/fsl/fsl/include/newimage/newimagefns.h(951): error: expected an identifier
auto [x, y, z] = xyz;
...which Google Gemini says I need c++17 and g++ > 7 for. As far as I can tell FSLCONFDIR sets --std=c++17 for g++ but not nvcc.

I tried the Makefile in this PR and got a bit farther, but not very:

15933:adir02:…forked:1$ export CUDA_STATIC=1
15934:adir02:…forked:1$ echo $GENCODEFLAGS

15935:adir02:…forked:1$ export GENCODEFLAGS="-gencode arch=compute_80,code=sm_80"
15936:adir02:…forked:1$ modelname=NODDI_Watson make
mkdir -p /mnt/isilon_lts/m101013/test/fsl/fsl/bin
mkdir -p objs
/usr/local/cuda/bin/nvcc -gencode arch=compute_80,code=sm_80 -ccbin g++ -isystem /usr/local/cuda-10.1/include -isystem /usr/local/cuda-10.1/targets/x86_64-linux/include --compiler-options "-isystem /mnt/isilon_lts/m101013/test/fsl/fsl/include -isystem /mnt/isilon_lts/m101013/test/fsl/fsl/include -I. -Imymodels/NODDI_Watson" --compiler-options "-fexpensive-optimizations -Wall -pedantic -Wno-long-long" --compiler-options "  " -DARMA_ALLOW_FAKE_GCC -std=c++17 --compiler-options "-fPIC -pthread" -maxrregcount 128 --compiler-options "" -c -o objs/modelparameters.o mymodels/NODDI_Watson/modelparameters.cc
nvcc fatal   : Value 'c++17' is not defined for option 'std'
make: *** [Makefile:87: objs/modelparameters.o] Error 1

Now nvcc is getting -std=c++17, but it is including from /usr/local/cuda-10.1 instead of 12.1. I think that problem is on my end, but if either of you could advise on fixing it, that would be great. My understanding is that CUDA < 12 is incompatible with c++17, which is why I was trying to limit the build to CUDA 12 with GENCODEFLAGS. I confess I don't know the mapping between CUDA versions and the sm_xx/compute_xx values, though.
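For what it's worth, GENCODEFLAGS selects the target GPU architecture, not which CUDA toolkit is used; the toolkit is determined by whichever nvcc is found first. A quick way to inspect both (assuming a toolkit recent enough to support --list-gpu-arch, which I believe CUDA 11+ is):

```shell
# Which nvcc will the build pick up, and which toolkit does it belong to?
which nvcc
nvcc --version        # prints the toolkit release, e.g. "release 12.1"
# List the compute capabilities this toolkit can target
nvcc --list-gpu-arch
```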

Thanks

@captainnova

My build is including from /usr/local/cuda-10.1 because /usr/local/cuda is a symlink to it. Apparently $CUDA does not completely control the CUDA version. Our sysadmin is amenable to repointing it to /usr/local/cuda-12.1, but we have to wait a bit to see if any other users on the system object.

@captainnova

Never mind, I fixed it - I had to

  1. Remove /usr/local/cuda/bin from my $PATH
  2. Put ${CUDA}/bin at the start of $PATH
  3. export LD_LIBRARY_PATH=${CUDA}/lib64

and I was able to compile. I happened to be using the Makefile from this PR. Sorry for the noise while I figured things out.
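In shell terms, the fix above amounts to something like the following (the toolkit path is from my system; adjust as needed):

```shell
# Point the environment at a specific CUDA toolkit, ahead of any
# stale /usr/local/cuda symlink on the PATH
CUDA=/usr/local/cuda-12.1
export PATH="${CUDA}/bin:${PATH}"
export LD_LIBRARY_PATH="${CUDA}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```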

@pauldmccarthy
Author

Hi @captainnova, glad to hear you got it working in the end. The FSL build system just requires whichever nvcc you wish to use to be at the front of your $PATH (which would be ${CUDA}/bin on your system).

In FSL 6.0.7.18 (the most recent release), I also made static linking the default, so if you are using the latest version of FSL you shouldn't need to set LD_LIBRARY_PATH, as the CUDA libraries will be statically linked into the executable.

The CUDA_STATIC environment variable is therefore now ignored, but if for whatever reason you would prefer dynamic linking, you can set CUDA_DYNAMIC=1.

For local development, you may want to use the GENCODEFLAGS variable to select a specific compute capability which is compatible with your hardware, as this will dramatically speed up compilation times.
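For example (compute capability 8.0 is just an illustration; pick one that matches your GPU):

```shell
# Put the desired nvcc first on the PATH - the FSL build system uses it
export PATH="/usr/local/cuda-12.1/bin:${PATH}"
# Optionally revert to dynamic linking (static is the default in 6.0.7.18+)
# export CUDA_DYNAMIC=1
# Build for a single compute capability to keep local builds fast
export GENCODEFLAGS="-gencode arch=compute_80,code=sm_80"
```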

