Slowness seen using PiecewiseAffineTransform compared to scikit-image version #698

Open
@JHancox

Describe the bug
`cucim.skimage.transform.PiecewiseAffineTransform` appears to be several times slower than the scikit-image equivalent.

Steps/Code to reproduce bug
When running the code below, I observe an 8x slowdown for the `estimate` call and a 2x slowdown for the `warp` call, using the PyTorch 24.01 container with cucim 23.12.

Expected behavior
The GPU code should execute at least as fast as the CPU version.

Environment details:
Docker on Ubuntu 22.04
PyTorch 24.01 container with scikit-image and cucim 23.12 pip installed

Additional context

```python
from timeit import default_timer as timer

import cupy as cp
import numpy as np
from scipy.interpolate import LinearNDInterpolator
from skimage.transform import PiecewiseAffineTransform, warp
from cucim.skimage.transform import PiecewiseAffineTransform as cu_PAT
from cucim.skimage.transform import warp as cu_warp

# Assumption: `imgrid` was not defined in the original snippet.
# A synthetic 256x256 grid image stands in for it here.
imgrid = np.zeros((256, 256), dtype=np.float32)
imgrid[::16, :] = 1.0
imgrid[:, ::16] = 1.0

# create some offsets and coordinates
vectors = np.array([[3.0, 1.0], [-5.0, -1.3], [-3.5, 8.3], [0, 0], [0, 0], [0, 0], [0, 0]])
coords = np.array([[20, 20], [180, 50], [20, 180], [0, 0], [0, 255], [255, 0], [255, 255]])

# Create grid (note: step_size is used as the number of points per axis)
step_size = 20
x = np.linspace(0, 255, num=step_size)
y = np.linspace(0, 255, num=step_size)
X, Y = np.meshgrid(x, y)

# interpolate the sparse offsets onto the grid
interpx = LinearNDInterpolator(list(coords), vectors[:, 0])
Zxi = interpx(Y, X)

interpy = LinearNDInterpolator(list(coords), vectors[:, 1])
Zyi = interpy(Y, X)

# create an array of coords
src = np.column_stack((X.reshape(-1), Y.reshape(-1)))

# add the interpolated offsets
dst_rows = X + Zxi
dst_cols = Y + Zyi

dst = np.column_stack([dst_cols.reshape(-1), dst_rows.reshape(-1)])

# compute transforms on the CPU with scikit-image
tform = PiecewiseAffineTransform()

start = timer()
tform.estimate(src, dst)
print("cpu estimate took {}s".format(timer() - start))

start = timer()
out = warp(imgrid, tform, output_shape=(255, 255))
print("cpu warp took {}s".format(timer() - start))

# repeat using cupy/cucim.skimage
cu_tform = cu_PAT()
start = timer()
cu_tform.estimate(cp.array(src), cp.array(dst))
print("gpu estimate took {}s".format(timer() - start))

start = timer()
out = cu_warp(cp.array(imgrid), cu_tform, output_shape=(255, 255))
print("gpu warp took {}s".format(timer() - start))
```
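The timings in the snippet are single-shot wall-clock measurements. CuPy launches GPU work asynchronously, and the first call to a GPU routine can include one-time costs such as kernel compilation, so it may be worth repeating the measurement with a warm-up run and an explicit synchronization before attributing the whole gap to the transform itself. Below is a minimal sketch of such a helper; `time_call` and its parameters are hypothetical names for illustration and are not part of cuCIM or scikit-image.

```python
from timeit import default_timer as timer


def time_call(fn, *args, sync=None, warmup=1, repeats=5, **kwargs):
    """Return the best wall-clock time (seconds) of fn(*args, **kwargs).

    sync: optional zero-argument callable invoked after every call, intended
          to block until any asynchronous work (e.g. GPU kernels) finishes.
    warmup: number of untimed calls made first to absorb one-time costs.
    repeats: number of timed calls; the minimum is reported.
    """
    # Untimed warm-up runs
    for _ in range(warmup):
        fn(*args, **kwargs)
        if sync is not None:
            sync()

    # Timed runs: stop the clock only after the optional sync returns
    times = []
    for _ in range(repeats):
        start = timer()
        fn(*args, **kwargs)
        if sync is not None:
            sync()
        times.append(timer() - start)
    return min(times)


# CPU-only demonstration so the sketch runs without a GPU
best = time_call(sorted, list(range(1000, 0, -1)))
print("best of 5 runs: {:.6f}s".format(best))
```

For the GPU path, a call along the lines of `time_call(cu_tform.estimate, cp.array(src), cp.array(dst), sync=cp.cuda.Stream.null.synchronize)` would apply the same procedure. If the slowdown persists after warm-up and synchronization, that would suggest the cost lies in the transform rather than in the measurement.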


Labels: bug
