Skip to content

Conversation

@onurulgen
Copy link
Member

  • Reinstated GPU into the project by fixing or implementing missing functions.
  • Rearchitected the project to combine GPU and CPU implementations by using virtual functions.
  • Increased the performance up to 11 times by using GPU.
  • Implemented unit and regression tests for GPU and CPU by using Catch2 framework.
  • Developed GitHub Actions for tests, coverage, static code analysis, and producing CPU and GPU executables for Windows, Linux and macOS.

- Ditch old texture objects and use up-to-date ones
- Make texture objects managed
- Ditch CUDA symbols and pass them as kernel function parameters
- Extend reg_updateControlPointPosition_gpu() to handle optimise* parameters
 - Rename CudaContextSingleton as CudaContext, and move it into NiftyReg namespace
 - Rename NiftyReg_CudaBlock100 as BlockSize, and move it into NiftyReg namespace
 - Move BlockSize implementation into the header
 - Change the type of BlockSize members as unsigned
 - Move BlockSize instance into CudaContext
 - Use unsigned instead of size_t in CUDA kernels
 - Initialise the CUDA or OpenCL device in Platform's constructor
 - Rename `unsigned int`s as `unsigned`
onurulgen added 17 commits May 13, 2025 14:51
…emove deprecated reg_tools_getMinValue/MaxValue functions
Comment on lines +5 to +61
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0

- name: Install Cppcheck
run: |
sudo apt-get update && sudo apt-get install -y libpcre3-dev
git clone -b 2.13.x https://github.com/danmar/cppcheck.git
cd cppcheck
# Disable color output of cppcheck
sed -i 's/ *bool *gDisableColors *= *false;/bool gDisableColors = true;/' lib/color.cpp
sudo make -j4 MATCHCOMPILER=yes FILESDIR=/usr/share/cppcheck HAVE_RULES=yes CXXFLAGS="-O2 -DNDEBUG -Wall -Wno-sign-compare -Wno-unused-function" install

- name: Install Python dependencies
run: pip3 install --upgrade setuptools urllib3 chardet pyOpenSSL pygithub

- name: Install CUDA Toolkit
uses: Jimver/cuda-toolkit@master
with:
cuda: '11.8.0'
method: network
use-github-cache: false
use-local-cache: false

- name: Configure NiftyReg
run: |
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=gcc \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_BUILD_TYPE=Debug \
-DBUILD_ALL_DEP=ON \
-DCHECK_GPU=OFF \
-DUSE_CUDA=ON \
-DUSE_OPENCL=ON \
-DUSE_SSE=ON \
-DUSE_OPENMP=ON \
-DBUILD_TESTING=OFF \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
..

- name: Code Analysis
env:
COMMENT_TITLE: Code Analysis Results
GITHUB_TOKEN: ${{ github.token }}
PR_NUMBER: ${{ github.event.pull_request.number }}
REPO: ${{ github.repository }}
REPORT_PR_CHANGES_ONLY: false
run: |
analysis_file="analysis.txt"
cppcheck_params="--enable=warning --check-level=exhaustive --inline-suppr --suppress=internalError --suppress=internalAstError --suppress=*:*third-party/Eigen/*"
cppcheck -j4 $cppcheck_params --project=$(pwd)/build/compile_commands.json --output-file=$analysis_file
# Since cppcheck does not support OpenCL and CUDA, we need to check these files separately
find $(pwd)/reg-lib/cl/. -name "*.cl" -print0 | while IFS= read -r -d '' file; do cppcheck "$file" $cppcheck_params --language=c++ 2>> $analysis_file; done
find $(pwd)/reg-lib/cuda/. -name "*.cu" -print0 | while IFS= read -r -d '' file; do cppcheck "$file" $cppcheck_params --language=c++ 2>> $analysis_file; done
python3 .github/code_analysis.py -cc $analysis_file -o ${{ github.event_name == 'push' }} -fk false No newline at end of file

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 9 months ago

To fix the issue, we will add a permissions block at the root level of the workflow. This block will restrict permissions to contents: read, which is sufficient for most workflows that do not require write access. If the workflow requires additional permissions in the future, they can be added explicitly and scoped to specific jobs.

The permissions block will be added immediately after the name field (line 1) to apply to all jobs in the workflow.


Suggested changeset 1
.github/workflows/analysis.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/analysis.yml b/.github/workflows/analysis.yml
--- a/.github/workflows/analysis.yml
+++ b/.github/workflows/analysis.yml
@@ -1,2 +1,4 @@
 name: Code Analysis
+permissions:
+  contents: read
 on: [push, pull_request]
EOF
@@ -1,2 +1,4 @@
name: Code Analysis
permissions:
contents: read
on: [push, pull_request]
Copilot is powered by AI and may make mistakes. Always verify output.
Comment on lines +5 to +52
runs-on: ubuntu-22.04-gpu
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0

- name: Install dependencies
run: sudo apt-get update && sudo apt-get install -y cmake git lcov

- name: Install Catch2
run: |
git clone https://github.com/catchorg/Catch2.git
cd Catch2
cmake -Bbuild -H. -DBUILD_TESTING=OFF
sudo cmake --build build/ --target install --config Debug

- name: Configure NiftyReg
run: |
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=gcc \
-DCMAKE_CXX_COMPILER=g++ \
-DCMAKE_BUILD_TYPE=Debug \
-DBUILD_ALL_DEP=ON \
-DUSE_CUDA=ON \
-DUSE_OPENCL=OFF \
-DUSE_SSE=OFF \
-DUSE_OPENMP=OFF \
-DBUILD_TESTING=ON \
-DWITH_COVERAGE=ON \
..

- name: Build NiftyReg
run: cmake --build build --config Debug

- name: Run tests
working-directory: build
run: ctest -V

- name: Coverage
working-directory: build
run: make coverage

- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
with:
directory: build
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }} No newline at end of file

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 9 months ago

To fix the issue, we will add a permissions block at the root level of the workflow file. This block will specify the minimum permissions required for the workflow to function correctly. Based on the current workflow steps, the workflow only needs to read repository contents and upload coverage reports to Codecov. Therefore, we will set contents: read and id-token: write (required for the Codecov action). This ensures that no unnecessary write permissions are granted.


Suggested changeset 1
.github/workflows/coverage.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/coverage.yml b/.github/workflows/coverage.yml
--- a/.github/workflows/coverage.yml
+++ b/.github/workflows/coverage.yml
@@ -2,2 +2,5 @@
 on: [push, pull_request]
+permissions:
+  contents: read
+  id-token: write
 jobs:
EOF
@@ -2,2 +2,5 @@
on: [push, pull_request]
permissions:
contents: read
id-token: write
jobs:
Copilot is powered by AI and may make mistakes. Always verify output.
Comment on lines +5 to +62
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, macos-latest, windows-latest]
include:
- sudo: "sudo"
c-compiler: "gcc"
cxx-compiler: "g++"
cmake-extra: ""
build-all-dep: "ON"
build-type: "Debug"
use-sse: "ON"
- os: macos-latest
build-all-dep: "OFF"
use-sse: "OFF"
- os: windows-latest
sudo: ""
c-compiler: "clang-cl"
cxx-compiler: "clang-cl"
cmake-extra: "-G Ninja"

steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0

- name: Install Catch2
shell: bash
run: |
git clone https://github.com/catchorg/Catch2.git
cd Catch2
cmake -Bbuild -H. -DBUILD_TESTING=OFF
${{ matrix.sudo }} cmake --build build/ --target install --config ${{ matrix.build-type }}

- name: Configure NiftyReg
shell: bash
run: |
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=${{ matrix.c-compiler }} \
-DCMAKE_CXX_COMPILER=${{ matrix.cxx-compiler }} \
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }} \
-DBUILD_ALL_DEP=${{ matrix.build-all-dep }} \
-DUSE_CUDA=OFF \
-DUSE_OPENCL=OFF \
-DUSE_SSE=${{ matrix.use-sse }} \
-DUSE_OPENMP=ON \
-DBUILD_TESTING=ON \
${{ matrix.cmake-extra }} \
..

- name: Build NiftyReg
shell: bash
run: cmake --build build --config ${{ matrix.build-type }}

- name: Run tests
shell: bash
working-directory: build
run: ctest -V No newline at end of file

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium test

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 9 months ago

To fix the issue, we will add a permissions block at the root of the workflow file. This block will specify the minimal permissions required for the workflow to function. Since the workflow only checks out the repository and runs tests, it likely only needs contents: read permissions. This ensures that the workflow cannot perform any unintended write operations.

The permissions block will be added at the top level of the workflow, just below the name field, so it applies to all jobs in the workflow.


Suggested changeset 1
.github/workflows/tests.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -1,2 +1,4 @@
 name: Tests
+permissions:
+  contents: read
 on: [push, pull_request]
EOF
@@ -1,2 +1,4 @@
name: Tests
permissions:
contents: read
on: [push, pull_request]
Copilot is powered by AI and may make mistakes. Always verify output.
@onurulgen onurulgen force-pushed the rearchitect-and-reinstate-gpu branch from 7c3d2a2 to dbce3ca Compare May 23, 2025 15:16
@kaczmarj
Copy link

kaczmarj commented Oct 30, 2025

hi @onurulgen - thank you for your tremendous effort in getting CUDA support back into niftyreg for a variety of realignment methods.

is this branch OK to try out? i am currently using version 1.3.9 from 2012 for CUDA-enabled reg_f3d.

@onurulgen
Copy link
Member Author

hi @onurulgen - thank you for your tremendous effort in getting CUDA support back into niftyreg for a variety of realignment methods.

is this branch OK to try out? i am currently using version 1.3.9 from 2012 for CUDA-enabled reg_f3d.

Yes, it's stable, you may go ahead and test it. I'd like to hear how it goes as well. Could you please give some feedback if you can?

Switches deprecated Windows runner from 2019 to 2022. Updates CMake flags for Windows and macOS builds to fix compilation errors.
Comment on lines +9 to +130
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-22.04, macos-13, macos-14, windows-2022]
platform: [cpu, cuda]
include:
- sudo: "sudo"
build-all-dep: "ON"
build-type: "Release"
cmake-extra: ""
use-sse: "ON"
- platform: cpu
platform-name: ""
use-cuda: "OFF"
- platform: cuda
platform-name: "-CUDA"
use-cuda: "ON"
- os: ubuntu-22.04
os-name: "Ubuntu"
c-compiler: "gcc-12"
cxx-compiler: "g++-12"
- os: macos-13
os-name: "macOS"
platform-name: "-Intel"
c-compiler: "gcc-14"
cxx-compiler: "g++-14"
build-all-dep: "OFF"
- os: macos-14
os-name: "macOS"
c-compiler: "gcc-14"
cxx-compiler: "g++-14"
cmake-extra: "-DCMAKE_C_FLAGS=-D__FLT_EVAL_METHOD__=0"
build-all-dep: "OFF"
use-sse: "OFF"
- os: windows-2022
os-name: "Windows"
sudo: ""
c-compiler: "clang-cl"
cxx-compiler: "clang-cl"
cmake-extra: "-DCMAKE_CUDA_FLAGS=\"-allow-unsupported-compiler -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH\" -G Ninja"
exclude:
- os: macos-13
platform: cuda
- os: macos-14
platform: cuda

steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0

- name: Setup MSVC environment
uses: ilammy/msvc-dev-cmd@v1
if: matrix.os-name == 'Windows'

- name: Install CUDA Toolkit
uses: Jimver/cuda-toolkit@master
if: matrix.platform == 'cuda'
with:
cuda: '11.8.0'
method: network
use-github-cache: false
use-local-cache: false

- name: Configure NiftyReg
shell: bash
run: |
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=${{ matrix.c-compiler }} \
-DCMAKE_CXX_COMPILER=${{ matrix.cxx-compiler }} \
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }} \
-DBUILD_ALL_DEP=${{ matrix.build-all-dep }} \
-DCHECK_GPU=OFF \
-DUSE_CUDA=${{ matrix.use-cuda }} \
-DUSE_OPENCL=OFF \
-DUSE_SSE=${{ matrix.use-sse }} \
-DUSE_OPENMP=ON \
-DBUILD_TESTING=OFF \
${{ matrix.cmake-extra }} \
..

- name: Build NiftyReg
shell: bash
run: cmake --build build --config ${{ matrix.build-type }}

- name: Prepare the variables
id: vars
shell: bash
run: echo "output-folder=NiftyReg-${{ matrix.os-name }}${{ matrix.platform-name }}-${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT

- name: Prepare the package
if: matrix.os-name == 'Windows'
shell: powershell
working-directory: build/reg-apps
run: |
New-Item -ItemType Directory -Force -Path ${{ steps.vars.outputs.output-folder }}
Move-Item -Path *.exe -Destination ${{ steps.vars.outputs.output-folder }}
Copy-Item -Path "C:/Program Files/LLVM/bin/libomp.dll" -Destination ${{ steps.vars.outputs.output-folder }}/libomp140.x86_64.dll
Compress-Archive -Path ${{ steps.vars.outputs.output-folder }} -DestinationPath ../NiftyReg.zip

- name: Prepare the package
if: matrix.os-name == 'Ubuntu'
working-directory: build/reg-apps
run: |
mkdir -p ${{ steps.vars.outputs.output-folder }}
find . -maxdepth 1 -type f -executable -exec mv {} ${{ steps.vars.outputs.output-folder }} \;
zip -r ../NiftyReg.zip ${{ steps.vars.outputs.output-folder }}

- name: Prepare the package
if: matrix.os-name == 'macOS'
working-directory: build/reg-apps
run: |
mkdir -p ${{ steps.vars.outputs.output-folder }}
find . -maxdepth 1 -type f -perm +111 -exec mv {} ${{ steps.vars.outputs.output-folder }} \;
zip -r ../NiftyReg.zip ${{ steps.vars.outputs.output-folder }}

- name: Upload the package
uses: svenstaro/upload-release-action@v2
with:
repo_token: ${{ github.token }}
file: build/NiftyReg.zip
asset_name: ${{ steps.vars.outputs.output-folder }}.zip No newline at end of file

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 3 months ago

To fix the problem, add a permissions section to the workflow. This can be done either globally at the root level, or per job (here, just the build job). For maximal clarity and future-proofing, it's best to add at the root, unless different jobs require different permissions. The workflow uploads release assets but does not modify repository contents; thus, the minimal required permissions should be contents: read, which allows reading release info but not writing to the repository. The upload-release-action uses repo_token—this is only for uploading assets to a release, which needs contents: read for release info, and requires contents: write to upload assets. To function correctly, contents: write may be needed; but to adhere to least privilege, we can set contents: write. So, add a block:

permissions:
  contents: write

directly under the name: Release line (before on:).

Suggested changeset 1
.github/workflows/release.yml

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -1,4 +1,6 @@
 name: Release
+permissions:
+  contents: write
 
 on:
   release:
EOF
@@ -1,4 +1,6 @@
name: Release
permissions:
contents: write

on:
release:
Copilot is powered by AI and may make mistakes. Always verify output.
@onurulgen onurulgen merged commit ee6a92d into master Nov 21, 2025
14 of 15 checks passed
@onurulgen onurulgen deleted the rearchitect-and-reinstate-gpu branch November 21, 2025 09:59
@kaczmarj
Copy link

kaczmarj commented Nov 22, 2025

@onurulgen - niftyreg v2.0.0 is working great so far. thanks again for all of your work.

i compiled it in a docker container. i copied the dockerfile below if anyone is interested. i got errors with CUDA 12.x so i stuck with 11.8

FROM nvcr.io/nvidia/cuda:11.8.0-devel-ubuntu22.04 AS builder

ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
        build-essential \
        cmake \
        git \
        ninja-build \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /work
RUN git clone https://github.com/KCL-BMEIS/niftyreg . \
    && git checkout v2.0.0

RUN cmake -S . -B build -G "Ninja" \
        -DCMAKE_BUILD_TYPE=Release \
        -DBUILD_ALL_DEP=ON \
        -DUSE_CUDA=ON \
        -DCHECK_GPU=OFF \
        -DUSE_OPENMP=ON \
        -DUSE_SSE=ON \
        -DCMAKE_INSTALL_PREFIX=/opt/niftyreg \
        -DCMAKE_INSTALL_RPATH="\$ORIGIN/../lib" \
        # Support many different NVIDIA GPUs.
        -DCMAKE_CUDA_ARCHITECTURES="75-real;80-real;86-real;89-real;90" \
    && cmake --build build --parallel \
    && cmake --install build

FROM nvcr.io/nvidia/cuda:11.8.0-runtime-ubuntu22.04
COPY --from=builder /opt/niftyreg /opt/niftyreg
ENV PATH="/opt/niftyreg/bin:${PATH}"
WORKDIR /work

i built with this (i'm on an arm-based macbook)

docker buildx build --platform=linux/amd64 --load -t kaczmarj/niftyreg .

then I got the binaries out with this:

docker run --rm \
  -v "$(pwd)":/out \
  kaczmarj/niftyreg \
  bash -c "cd /opt && tar -czf /out/niftyreg.tar.gz niftyreg"

@onurulgen
Copy link
Member Author

Thank you! You may use precompiled binaries; and if you'd like to compile yourself, you may follow release.yml file. The code is compatible up to CUDA 12.4.x. It might've failed if you tried later versions.

@kaczmarj
Copy link

fantastic, i will use the precompiled binaries in that case. thanks kindly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rearchitect F3D to reinstate GPU

5 participants