inspect() performance: batch tolist() and reuse fg_mask #220

@perctrix

Problem

inspect() in mipcandy/data/inspection.py has two avoidable performance bottlenecks:

1. Per-element tolist() in class_locations (line 294)

class_locations[class_id] = [tuple(coord.tolist()[1:]) for coord in indices]

This comprehension iterates over up to 10,000 single-row tensors and calls tolist() on each one, so every row pays the overhead of a Python-to-C++ call. It should be a single batched operation:

class_locations[class_id] = tuple(tuple(loc) for loc in indices[:, 1:].tolist())
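To illustrate that the two forms yield the same coordinates, here is a minimal sketch on a toy tensor. The tensor shape (a leading channel axis followed by spatial axes), the class ID 1, and the variable names are assumptions for illustration and are not taken from inspection.py. The sketch keeps a list as the outer container to match the original line; the replacement proposed above wraps the result in a tuple instead.

```python
import torch

# Hypothetical stand-in for a label tensor: shape (C, H, W) with one channel.
label = torch.zeros((1, 4, 5), dtype=torch.long)
label[0, 1, 2] = 1
label[0, 3, 4] = 1

# nonzero() returns an (N, 3) tensor; column 0 is the channel index.
indices = (label == 1).nonzero()

# Per-row conversion: one tolist() call for every coordinate row.
per_row = [tuple(coord.tolist()[1:]) for coord in indices]

# Batched conversion: drop the channel column, then call tolist() once.
batched = [tuple(loc) for loc in indices[:, 1:].tolist()]

assert per_row == batched
print(batched)  # [(1, 2), (3, 4)]
```

The batched form does the slicing on the tensor and crosses into Python once, so the remaining loop only touches plain Python lists.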

2. Redundant label != background computation (lines 275 and 301)

The boolean mask is computed twice on the full label tensor:

indices = (label != background).nonzero()   # line 275
# ...
fg_mask = label != background               # line 301 (same operation)

The mask should be computed once and reused, with indices derived from fg_mask instead of repeating the comparison.
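A minimal sketch of the reuse, assuming a background value of 0 and a toy label tensor. The fg_count line is a hypothetical stand-in for whatever line 301 does with the mask, which this issue does not show.

```python
import torch

background = 0  # assumed background label value
label = torch.tensor([[[0, 2, 0],
                       [1, 0, 0]]])  # toy tensor, shape (1, 2, 3)

# Compute the boolean mask once...
fg_mask = label != background

# ...then reuse it for the foreground coordinates and for any later statistic.
indices = fg_mask.nonzero()
fg_count = int(fg_mask.sum())  # hypothetical later use of the same mask

# Same result as recomputing the comparison from scratch.
assert torch.equal(indices, (label != background).nonzero())
```

The saving is one full pass over the label tensor per sample, which is consistent with the small effect seen in the benchmark below.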

Benchmark

Tested on PH2 (200 2D samples) and BRaTS (368 3D volumes, ~240x240x155):

PH2

Variant        Time     Speedup
baseline       31.7s    1.00x
batch tolist   16.6s    1.91x
+ reuse mask   16.7s    1.90x

BRaTS

Variant        Time     Speedup
baseline       360s     1.00x
batch tolist   341s     1.06x
+ reuse mask   335s     1.08x

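On the 2D dataset (PH2), batching tolist() alone nearly halves execution time (31.7s to 16.6s), while reusing the mask makes no measurable difference. On the 3D dataset (BRaTS), I/O dominates, but both optimizations still give a small measurable improvement (360s to 335s combined).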

Notes

  • Streaming statistics (replacing torch.cat + np.percentile with online mean/std and reservoir sampling) were also benchmarked. They showed no improvement (and were slightly slower on BRaTS) and introduced approximation error in the percentile values, so they are not worth pursuing.
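For readers unfamiliar with the rejected approach, the sketch below shows what online mean/std and reservoir sampling can look like in plain Python. It is not the code that was benchmarked: Welford's algorithm and Algorithm R are assumed here as common choices, and all names are hypothetical. It also shows where the percentile approximation comes from, since the median is estimated from a 100-element sample instead of all values.

```python
import math
import random


class RunningStats:
    """Welford's online algorithm for mean and population standard deviation."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / self.n) if self.n else 0.0


def reservoir_sample(stream, k: int, rng: random.Random) -> list:
    """Algorithm R: keep a uniform random sample of k items from a stream."""
    sample = []
    for i, x in enumerate(stream):
        if i < k:
            sample.append(x)
        else:
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = x
    return sample


values = [float(v) for v in range(1, 1001)]  # toy intensity stream

stats = RunningStats()
for v in values:
    stats.update(v)

sample = reservoir_sample(values, k=100, rng=random.Random(0))

# Percentiles come from the sample only, so they are approximate.
approx_median = sorted(sample)[len(sample) // 2]
exact_median = 500.5
print(stats.mean, stats.std, approx_median, exact_median)
```

The mean and standard deviation are exact up to floating-point error, but any percentile read from the reservoir depends on which elements happened to be sampled.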

Labels: enhancement (New feature or request)