PERF: avoid unnecessary array copy for single-pass convolution #169

neutrinoceros · 2025-07-06T10:44:49Z

Close #85
Incidentally, close #114
I checked that binary size doesn't noticeably increase with this refactor, lifting my comment from #85

neutrinoceros · 2025-07-06T11:01:28Z

This is stable. Before I undraft, squash and merge, I want to address two points:

- document bound memory usage as a feature (update: this is now #175, DOC: document memory usage as a feature (peak and output))
- check whether this impacts performance for a single pass. A comment I left on #85 (PERF: avoiding an array copy for single-pass convolution) suggests that it should moderately reduce run time, but it's worth checking.

neutrinoceros · 2025-07-06T13:07:40Z

I'm actually seeing a 5% performance regression with this patch. I'll re-issue the various independent parts of the PR piecemeal to reduce my cognitive overload deciphering this.

neutrinoceros · 2025-07-06T14:18:41Z

I think I've been looking at this from the wrong angle; I was assuming that my convolve_once was somehow less efficient than convolve_iteratively, while in fact, I'm seeing 5% overhead with both implementations, so maybe the dispatching itself is generating overhead (pretty surprising), or there's something else I'm missing.
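For readers following along, here is a minimal sketch of the kind of dispatch being discussed. It is not the project's actual code: the names convolve_once and convolve_iteratively come from this thread, but the signatures, the 1D zero-padded kernel pass, the plain Python lists, and the helper _convolve_pass are all assumptions made for illustration. It only shows why a single pass can skip the defensive copy that a multi-pass, buffer-swapping loop needs.

```python
def _convolve_pass(src, kernel, dst):
    """Hypothetical helper: one zero-padded, centered 1D convolution pass.

    Reads from src and writes into dst; src is never modified.
    """
    half = len(kernel) // 2
    n = len(src)
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i - k + half  # flipped index, i.e. a true convolution
            if 0 <= j < n:
                acc += weight * src[j]
        dst[i] = acc


def convolve_once(data, kernel):
    """Single pass: read the caller's data directly, so no copy is needed."""
    out = [0.0] * len(data)
    _convolve_pass(data, kernel, out)
    return out


def convolve_iteratively(data, kernel, iterations):
    """Multiple passes with two swapped buffers.

    The first buffer is overwritten from the second pass onward, so it must
    be a copy of the caller's data rather than the data itself.
    """
    src = list(data)  # defensive copy (the cost the single-pass path avoids)
    dst = [0.0] * len(data)
    for _ in range(iterations):
        _convolve_pass(src, kernel, dst)
        src, dst = dst, src
    return src


def convolve(data, kernel, iterations=1):
    """Dispatch to the copy-free path when only one pass is requested."""
    if iterations < 1:
        raise ValueError("iterations must be >= 1")
    if iterations == 1:
        return convolve_once(data, kernel)
    return convolve_iteratively(data, kernel, iterations)
```

In this sketch both paths return the same values for a single iteration; the only difference is one fewer allocation and copy on the single-pass path, which says nothing about where the measured overhead in the real implementation comes from.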

neutrinoceros · 2025-07-06T14:31:09Z

(dispatching at the Python level instead doesn't improve performance)

neutrinoceros · 2025-07-06T14:39:10Z

(forcing inlining on convolve_once and convolve_iteratively doesn't help either)

neutrinoceros · 2025-07-06T15:15:06Z

The good news is that #172 more than compensates for the loss in performance from this patch, but I still would like to figure out why this is slower than main.

neutrinoceros · 2025-07-21T16:04:39Z

rebased, but I'm still seeing about 2% (unexplained) overhead.

neutrinoceros added this to the Next release milestone Jul 6, 2025

neutrinoceros added the refactor and performance: runtime labels Jul 6, 2025

neutrinoceros force-pushed the perf/special-case-single-it branch 3 times, most recently from 431356a to 86938ae July 6, 2025 10:57

neutrinoceros mentioned this pull request Jul 6, 2025

RFC: split UVMode::parse into its own function #170

Merged

neutrinoceros force-pushed the perf/special-case-single-it branch from bede295 to d261e9a July 6, 2025 13:26

neutrinoceros mentioned this pull request Jul 6, 2025

RFC: simplify duplicated, internal logic for pixel selection #171

Merged

neutrinoceros force-pushed the perf/special-case-single-it branch 2 times, most recently from 7f9ce90 to d518f4f July 6, 2025 14:08

neutrinoceros removed the refactor label Jul 6, 2025

neutrinoceros force-pushed the perf/special-case-single-it branch from d518f4f to 02a7c83 July 6, 2025 15:12

neutrinoceros force-pushed the perf/special-case-single-it branch 2 times, most recently from 55c57d1 to 2df507c July 7, 2025 15:25

neutrinoceros removed this from the Next release milestone Jul 7, 2025

neutrinoceros force-pushed the perf/special-case-single-it branch 2 times, most recently from 7b1779c to e471400 July 8, 2025 10:25

neutrinoceros force-pushed the perf/special-case-single-it branch from e471400 to 63b4b16 July 21, 2025 16:03

neutrinoceros added the performance: memory usage label and removed the performance: runtime label Nov 16, 2025

neutrinoceros added 2 commits November 18, 2025 09:17

RFC: implement convolve_once

953be57

ENH: plug in convolve_once

dfb4412

DOC: changelog

fb67ea9

neutrinoceros force-pushed the perf/special-case-single-it branch from 63b4b16 to fb67ea9 November 18, 2025 08:17
