perf(detection): count mask pixels with count_nonzero by RubenHaisma · Pull Request #2361 · roboflow/supervision

RubenHaisma · 2026-06-27T13:24:23Z

Description

Mask pixel-area counting used np.sum: np.array([np.sum(m) for m in masks]) in Detections.area, and np.sum(mask, axis=(1, 2)) in the metrics helper get_mask_size_category.

For boolean masks, np.count_nonzero (with no axis) dispatches to NumPy's SIMD popcount over the raw byte buffer, whereas every axis-reduction form, including np.sum(..., axis=...) and even np.count_nonzero(..., axis=...), falls back to a slower generic reduction. Counting per mask with np.count_nonzero is therefore several times faster than the "obvious" vectorized sum while producing bit-identical integer counts. By contrast, the plain masks.sum(axis=(1, 2)) vectorization yields only about a 1.1x speedup over the original loop: the gain comes from the counting primitive, not from removing the Python loop.
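The comparison above can be reproduced with a small timing sketch. This is not the benchmark used for the PR: the stack size, density, seed, and the three function names are assumptions chosen for illustration, and absolute timings will vary by machine and NumPy build.

```python
import timeit

import numpy as np

# Hypothetical input: 20 boolean masks of 640x640 at roughly 30% density.
rng = np.random.default_rng(0)
masks = rng.random((20, 640, 640), dtype=np.float32) < 0.3


def area_sum_loop(masks):
    # Per-mask np.sum in a Python loop (the original Detections.area form).
    return np.array([np.sum(m) for m in masks])


def area_sum_axis(masks):
    # The "obvious" vectorized axis reduction.
    return masks.sum(axis=(1, 2))


def area_count_nonzero(masks):
    # Per-mask np.count_nonzero with no axis argument.
    return np.fromiter(
        (np.count_nonzero(m) for m in masks), dtype=np.int64, count=len(masks)
    )


timings = {}
for fn in (area_sum_loop, area_sum_axis, area_count_nonzero):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(masks), number=5, repeat=3)) / 5
    timings[fn.__name__] = best
    print(f"{fn.__name__:>20}: {best * 1e3:.2f} ms")
```

All three functions return the same counts, so any difference in the printed timings reflects only the counting primitive and the loop structure.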

Both sites now use:

np.fromiter((np.count_nonzero(m) for m in masks), dtype=np.int64, count=len(masks))

dtype=np.int64 preserves the documented Detections.area mask-branch dtype on every platform (a bare np.array([...]) of Python ints is int32 on Windows with NumPy releases before 2.0).
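As a minimal illustration of the expression above, the following sketch wraps it in a helper and checks the result on a tiny hand-built stack. The helper name `mask_areas` and the 3×4×4 example stack are hypothetical and do not come from the library.

```python
import numpy as np


def mask_areas(masks: np.ndarray) -> np.ndarray:
    """Count True pixels per mask (hypothetical helper, for illustration only)."""
    return np.fromiter(
        (np.count_nonzero(m) for m in masks),
        dtype=np.int64,    # fixed dtype, independent of the platform default
        count=len(masks),  # lets NumPy allocate the output array once
    )


masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, :2, :2] = True  # 4 True pixels
masks[1] = True          # 16 True pixels
# masks[2] stays all False

areas = mask_areas(masks)
print(areas.tolist(), areas.dtype)  # [4, 16, 0] int64
```

Passing `count` is optional for correctness, but it tells `np.fromiter` the output length up front, so the result is not grown incrementally.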

Performance

Roughly 5x faster on 640×640 masks, independent of mask density:

| Path | N (masks) | Before | After | Speedup |
|---|---|---|---|---|
| `Detections.area` | 100 | 7.8 ms | 1.5 ms | 5.3× |
| `Detections.area` | 300 | 24.5 ms | 4.6 ms | 5.3× |
| `get_mask_size_category` | 300 | ~24 ms | ~4.4 ms | 5.5× |

get_mask_size_category feeds the size-bucketed F1Score / Precision / Recall / MeanAveragePrecision / MeanAverageRecall metrics, where it runs repeatedly (SMALL/MEDIUM/LARGE × predictions+targets) across a dataset.

Correctness

Counting True pixels equals summing a bool array exactly (integers, no float error). Verified bit-identical over 400 randomized trials plus empty / all-true / all-false / 1×1 edge cases, with int64 dtype preserved. Existing Detections.area and metrics suites pass unchanged.
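A parity check of this kind could look like the sketch below. It is not the PR's test code: the seed, shape ranges, and the helper names `count_areas` and `check_parity` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # seed chosen arbitrarily for this sketch


def count_areas(masks):
    # The count_nonzero-based form described above.
    return np.fromiter(
        (np.count_nonzero(m) for m in masks), dtype=np.int64, count=len(masks)
    )


def check_parity(masks):
    expected = np.sum(masks, axis=(1, 2))  # np.sum reference
    actual = count_areas(masks)
    assert actual.dtype == np.int64
    assert np.array_equal(actual, expected)


# Randomized trials over varying stack sizes, mask shapes, and densities.
for _ in range(400):
    n, h, w = rng.integers(1, 8), rng.integers(1, 64), rng.integers(1, 64)
    density = rng.random()
    check_parity(rng.random((n, h, w)) < density)

# Edge cases: all-false, all-true, and a single 1x1 mask.
check_parity(np.zeros((2, 5, 5), dtype=bool))
check_parity(np.ones((2, 5, 5), dtype=bool))
check_parity(np.ones((1, 1, 1), dtype=bool))
```

Because both sides are integer counts of the same boolean data, the comparison uses exact equality rather than a tolerance.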

Tests

Adds parity tests for Detections.area (dense mask) and get_mask_size_category, each against an np.sum reference, plus threshold-boundary coverage. ruff check/format clean.


codecov · 2026-06-27T13:25:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82%. Comparing base (09b2199) to head (ec2e872).

❌ Your project check has failed because the head coverage (82%) is below the target coverage (95%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@           Coverage Diff           @@
##           develop   #2361   +/-   ##
=======================================
  Coverage       82%     82%           
=======================================
  Files           68      68           
  Lines         9560    9560           
=======================================
  Hits          7881    7881           
  Misses        1679    1679


RubenHaisma requested a review from SkalskiP as a code owner June 27, 2026 13:24

RubenHaisma wants to merge 1 commit into roboflow:develop from RubenHaisma:perf/mask-area-count-nonzero
