
Conversation

@Imraj-Singh (Contributor) commented Feb 15, 2025

Changes in this pull request

The functionality:

  • Functions for transferring data between torch and SIRF, and vice versa.
  • torch.autograd.Function for the objective function, the acquisition model forward, and the acquisition model backward. These are not really meant to be used directly by a user.
  • torch.nn.Module for a module that interacts with the torch.autograd.Function. In the forward method we could change the number of dimensions in the array, swap in/out components of the forward operator, etc.

The behaviour of the torch.nn.Module can vary depending on the user's requirements, so perhaps we should just give a use case, e.g. using it for LPD, or PET reconstruction with torch optimisers (see the sketch below)?
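
A minimal sketch of the second use case, assuming the wrapper exposes something like sirf.torch.ObjectiveFunctionModule (all names and shapes here are illustrative, not the final API):

import torch
import sirf.torch  # assumed install location of this PR's wrapper

# obj_fun: a sirf.STIR objective function already set up with acquisition data
# image_template: a sirf.STIR.ImageData used for shapes and for filling from torch
torch_obj_fun = sirf.torch.ObjectiveFunctionModule(obj_fun, image_template)

x = torch.nn.Parameter(torch.ones(1, *image_template.as_array().shape))
optimiser = torch.optim.Adam([x], lr=1e-1)

for _ in range(100):
    optimiser.zero_grad()
    # sign convention depends on the wrapper; here we assume it returns the
    # objective value itself, so we negate it to maximise via a minimiser
    loss = -torch_obj_fun(x)
    loss.backward()
    optimiser.step()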

Testing performed

Gradchecks for all the new functions. Note that objective function evaluations are quite unstable.

Related issues

Checklist before requesting a review

  • I have performed a self-review of my code
  • I have added docstrings/doxygen in line with the guidance in the developer guide
  • I have implemented unit tests that cover any new or modified functionality
  • The code builds and runs on my machine
  • CHANGES.md has been updated with any functionality change

Contribution Notes

Please read and adhere to the contribution guidelines.

Please tick the following:

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in SIRF (the Work) under the terms and conditions of the Apache-2.0 License.

@Imraj-Singh marked this pull request as draft on Feb 15, 2025, 13:03
@KrisThielemans self-assigned this on Feb 17, 2025
@KrisThielemans linked an issue on Feb 17, 2025 that may be closed by this pull request
@KrisThielemans (Member) left a comment

Some detailed comments, but overall, I think we still need to decide on how to wrap AcquisitionModel, AcquisitionSensitivityModel, Resampler etc, i.e. anything that has forward and backward members. This should be written only once. You had https://github.com/SyneRBI/SIRF-Exercises/blob/71df538fe892ce621440544f14fa992c230fd120/notebooks/Deep_Learning_PET/sirf_torch.py#L38-L39, which seems completely generic, aside from naming of variables. So, something like this

class OperatorModule(torch.nn.Module):
    def __init__(self, sirf_operator, sirf_src, sirf_dest):
        """Constructs wrapper. WARNING: operations will overwrite the content of `sirf_src` and `sirf_dest`."""
        ...

with usage

acq_model = pet.AcquisitionModelParallelProj()
# etc etc
torch_acq_model = sirf.torch.OperatorModule(acq_model, image, acq_data)

Ideally, we also provide a way to call save_for_backward.
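
As a rough illustration of the generic glue this would need underneath (the SIRF-side forward/backward method names follow AcquisitionModel; everything else, including the class name and the buffer reuse, is illustrative only):

import torch

class _OperatorFunction(torch.autograd.Function):
    # Generic autograd glue for any SIRF object exposing forward/backward
    # (shape/batch handling omitted; non-tensor state is stashed on ctx,
    # tensor state would go through ctx.save_for_backward instead).

    @staticmethod
    def forward(ctx, x, sirf_operator, sirf_src, sirf_dest):
        ctx.sirf_operator = sirf_operator
        ctx.sirf_dest = sirf_dest
        # WARNING: overwrites sirf_src/sirf_dest, as per the constructor docstring
        sirf_src.fill(x.detach().cpu().contiguous().numpy())
        y = sirf_operator.forward(sirf_src)
        return torch.as_tensor(y.as_array(), device=x.device)

    @staticmethod
    def backward(ctx, grad_output):
        ctx.sirf_dest.fill(grad_output.detach().cpu().contiguous().numpy())
        grad = ctx.sirf_operator.backward(ctx.sirf_dest)
        grad_x = torch.as_tensor(grad.as_array(), device=grad_output.device)
        # one gradient per forward input; the non-tensor inputs get None
        return grad_x, None, None, None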

@Imraj-Singh (Contributor, Author) commented

Added a README with some notes for a design discussion here.

@KrisThielemans (Member) left a comment

I'd prefer not to have too much code in the design discussion (duplication as well as distracting from actual design). You could include pointers to implementation in the README, but it's still small enough that this probably isn't worth the effort.

Just keep a skeleton in the README, the shorter the better...

How do you want people to react to your questions in the README? Via a review? (not conventional, but could work)

@KrisThielemans (Member) commented

By the way, we should avoid running CI on this until it makes sense. Easiest is to put [ci skip] in the message of your last commit that you will push.

@Imraj-Singh (Contributor, Author) left a comment

I think the test_cases need to be reviewed again. Perhaps grad_check is unnecessary, since SIRF usually does tests for adjointness etc. Could we suffice with simpler tests checking that things are copied to and from torch/SIRF correctly? A sketch of what I mean is below.
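
Something along these lines, where sirf_to_torch/torch_to_sirf are placeholders for whatever the transfer functions end up being called, and sirf_image is a pytest fixture providing a small sirf.STIR.ImageData:

import numpy as np

# Placeholder names; the converters in this PR may be named differently.
from sirf.torch import sirf_to_torch, torch_to_sirf

def test_round_trip_copy(sirf_image):
    # SIRF -> torch -> SIRF should preserve the values exactly.
    torch_image = sirf_to_torch(sirf_image)
    sirf_copy = torch_to_sirf(torch_image, sirf_image.clone())
    np.testing.assert_allclose(sirf_copy.as_array(), sirf_image.as_array())

def test_sirf_to_torch_values(sirf_image):
    # The torch tensor should hold the same values as the SIRF array.
    torch_image = sirf_to_torch(sirf_image)
    np.testing.assert_allclose(torch_image.cpu().numpy(), sirf_image.as_array())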

@KrisThielemans (Member) left a comment

Various comments in the review. A general comment: I'm not convinced about hiding the "times -1" for the objective function. For one thing, it will fail with CIL functions such as KL. There are a few possible strategies here, which we should discuss.

@Imraj-Singh (Contributor, Author) commented

@paskino so this is an attempt at pytest

@Imraj-Singh (Contributor, Author) commented

Now works with MR. I am getting some very sinister intermittent segfaults when setting up the sirf.STIR objective function, @KrisThielemans @paskino. I do think I saw this in the PETRIC challenge. I'm really not sure how to diagnose it; it may be a SIRF/PyTorch compatibility issue, but I really don't know.

@KrisThielemans (Member) left a comment

We seem to have a disagreement on how this should be documented. Ideally @mehrhardt @gschramm provide an alternative opinion to mine...

By the way, I think using funky characters in the doc is a bad idea; it is better to use LaTeX-style doc (although it is less readable, of course). This is because you don't know what language settings the user has (unless there's a way to enforce Unicode in Markdown).

@gschramm commented Mar 4, 2025

For me, this description of autograd in PyTorch makes sense:
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#optional-reading-vector-calculus-using-autograd

It explains why, in the backward pass of a vector-valued function y = f(x), we have to implement the Jacobian transpose applied to a vector (the vector supplied to the layer during back-propagation).

If we write that out in components, we get the "chain rule".

[screenshot: the chain rule written out in components]
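
In symbols (a rendering of the above, with v the vector supplied during back-propagation and J the Jacobian of f at x):

\[
y = f(x), \qquad
\text{backward}(v) = J^{\mathsf T} v, \qquad
\bigl(J^{\mathsf T} v\bigr)_i = \sum_j \frac{\partial y_j}{\partial x_i}\, v_j ,
\]

so that with \(v_j = \partial L / \partial y_j\) for a scalar loss \(L\), this is exactly the chain rule for \(\partial L / \partial x_i\).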

@KrisThielemans (Member) commented

Thanks for the link @gschramm. We could maybe include it in the README. I think the main question for me is how much we should elaborate in our own doc, especially if it gets into detail on what is common usage vs what it actually does. I've put some detailed comments, and @Imraj-Singh will address those he agrees with :-). I'd personally prefer not too much (potentially confusing) detail in the README, but to point to the torch doc and our use cases (which could then have a bit more doc) or exercises.



run_gradcheck(torch_obj_fun, torch_image_data, (modality, data_type), "Objective",
              nondet_tol=1e-1, fast_mode=True, eps=1e-3, atol=1e-1, rtol=1e-1)
@Imraj-Singh (Contributor, Author) commented Mar 4, 2025

@gschramm I had to set the tolerance for non-determinism extremely high for the objective-function gradchecks... I think this may be trouble with computing the numerical Jacobian of a PNLL-type function, causing issues with numerical precision? I was wondering if you see this with your gradchecks in parallelproj. As a comparison, I wrapped only the acquisition model and used torch's PNLL for the objective, and I still needed rather high tolerances for non-determinism.
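
For reference, that comparison looks roughly like this (torch_acq_model, measured and image_shape are placeholders for the wrapped acquisition model, the measured-data tensor and the image dimensions):

import torch
from torch.autograd import gradcheck

pnll = torch.nn.PoissonNLLLoss(log_input=False, reduction='sum')

def objective(x):
    # expected counts must stay positive for the Poisson NLL
    return pnll(torch_acq_model(x).clamp(min=1e-6), measured)

x0 = torch.rand(*image_shape, dtype=torch.float64, requires_grad=True)
# gradcheck wants double precision, but the SIRF operators typically work in
# float32, which limits the achievable precision and forces loose tolerances.
gradcheck(objective, (x0,), eps=1e-3, atol=1e-1, rtol=1e-1,
          nondet_tol=1e-1, fast_mode=True)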

## Forward and backward clarification

The use of the terms forward and backward have different meaning given the context:
* `torch.autograd.Function`: the `forward` method (forward pass) is the evaluation of the function, and `backward` method (backward pass) computes the (scaled directional) derivative, i.e. the Vector-Jacobian-adjoint-Product (VJP), from output to input.
A Contributor commented

Why do you refer to the derivatives as "scaled"?

@Imraj-Singh (Contributor, Author) commented Mar 5, 2025

The vector isn't necessarily a unit vector. That said, I think I will just say "gradient", as that is correct and less confusing.

A Member commented

I don't agree that "gradient" is correct. The backward method multiplies its argument with the gradient of the function, which is the VJP of course. I'd personally remove "from output to input", as this terminology only makes sense if you think about this being inserted as a layer in a feed-forward network.

Suggested change
* `torch.autograd.Function`: the `forward` method (forward pass) is the evaluation of the function, and `backward` method (backward pass) computes the (scaled directional) derivative, i.e. the Vector-Jacobian-adjoint-Product (VJP), from output to input.
* `torch.autograd.Function`: the `forward` method (forward pass) is the evaluation of the function, and `backward` method (backward pass) multiplies its argument with the gradient of the function, i.e. computes the Vector-Jacobian-adjoint-Product (VJP).


The wrapper is currently only for **reverse-mode** autodiff - there are JVP methods (for forward-mode autodiff) of `torch.autograd.Function` that are not used at present.

For `SIRFOperator` we can wrap the adjoint operator with `AdjointOperator`, there quite confusingly the forward is the application of the adjoint, and the backward *is* the linear (Jacobian) component/approximation of the operator.
A Contributor commented

"backward is the linear (Jacobian) component/approximation of the operator"

I don't know what this means

@Imraj-Singh (Contributor, Author) commented

If we have some operator A(x) and its adjoint B(x), the forward pass would be B(x), and the backward pass would be J_A(x) * grad_out. So for the operator Ax+b, the adjoint is A^T x and the backward pass is A grad_out? (grad_out is PyTorch terminology for the product of upstream gradients, i.e. the upstream chain rule.)
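
For a linear operator this works out as (assuming A linear, so the offset b drops out of the Jacobian):

\[
f(y) = A^{\mathsf T} y, \qquad
J_f = A^{\mathsf T}, \qquad
\text{backward}(v) = J_f^{\mathsf T} v = A\, v ,
\]

i.e. wrapping the adjoint as the forward map makes the backward pass apply the original operator.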

Perhaps I should change it to:

For SIRFOperator, wrapping its adjoint with AdjointOperator results in: forward executing the adjoint operation, and backward computing the Jacobian-based gradient of the original SIRFOperator.

A Member commented

I'm really not sure if this is helpful to mention at this point. It could be done in a use-case. But maybe we could say

Suggested change
For `SIRFOperator` we can wrap the adjoint operator with `AdjointOperator`, there quite confusingly the forward is the application of the adjoint, and the backward *is* the linear (Jacobian) component/approximation of the operator.
Note that in certain cases it can be necessary to use the adjoint of an operator as a forward step, e.g. when adding a gradient-descent step as a layer. This can be achieved by wrapping `sirf.AdjointOperator(sirf_operator)`.

@KrisThielemans (Member) left a comment

@Imraj-Singh not sure what you did, probably a rebase gone somewhat wrong. There seem to be various problems now: some lines occur twice, and all the stuff with ContiguousError is removed.

Comment on lines +6 to +7

# import sirf.SIRF content into the `sirf` namespace for convenience
from sirf.SIRF import *
A Member commented

fairly sure we had to remove this

ABC = abc.ABCMeta('ABC', (), {})


def norm(x):
A Member commented

all of these lines are gone

@Imraj-Singh (Contributor, Author) commented Nov 18, 2025

Hi @KrisThielemans, I thought this would be a relatively quick job, but I've had a lot of errors building. I am building on top of a PyTorch image, with the Dockerfile:

FROM pytorch/pytorch:2.6.0-cuda12.6-cudnn9-devel
RUN apt-get update

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    wget \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

I attempt to install via SIRF-SuperBuild using:

cmake -S ./SIRF-SuperBuild -B ./build \
      -DSIRF_SOURCE_DIR=/home/user/sirf_env/SIRF \
      -DDISABLE_GIT_CHECKOUT_SIRF=ON \
      -DBUILD_SIRF=ON \
      -DBUILD_STIR=ON \
      -DBUILD_Gadgetron=OFF \
      -DBUILD_NIFTYREG=ON \
      -DBUILD_CIL=OFF \
      -DBUILD_ASTRA=OFF \
      -DBUILD_pet_rd_tools=OFF \
      -DBUILD_siemens_to_ismrmrd=OFF \
      -DUSE_SYSTEM_pugixml=ON \
      -DSIRF_EXTRA_CMAKE_ARGS="-DUSE_Gadgetron=OFF;-DUSE_NIFTYREG=ON;-DBUILD_SIRF_Registration=ON"

It built after making some changes outlined here.

I am getting intermittent errors when using the wrapped objective function:

Gradient Descent with Objective Function
double free or corruption (out)
Aborted (core dumped)
PET Varnet
Segmentation fault (core dumped)
malloc(): memory corruption (fast)
Aborted (core dumped)

On the occasions it does not crash, if I tweak some of the use-case parameters I get:

[result image]

and:

[result image]

All 2D PET grad tests passed; the 3D PET grad tests take a very long time; and I haven't got Gadgetron to work, so I haven't tested MR.

The intermittent segfault is rather worrying... Any ideas? I think the PyTorch wrapping is almost good to go, but after battling the build today I will try to have another look tomorrow.

@KrisThielemans (Member) commented

Sigh, build problems...

You could have a look at SyneRBI/SIRF-SuperBuild#946 for how @casperdcl managed to get things to build.

The memory problems are obviously very worrying. Ideally you'd build SIRF and STIR with RelWithDebInfo and run it with valgrind. It'll throw up a lot of (false positive?) warnings in Python etc., but hopefully it will also tell us where it's happening. Sorry.

@casperdcl (Member) commented

"double free" sounds related to wrapping pointers without handling ownership properly :/

@casperdcl (Member) commented Nov 20, 2025



Development

Successfully merging this pull request may close these issues.

SIRF torch should be part of SIRF

7 participants