Add tile-cut stitching follow-up to calculate_tiling_qc by timtreis · Pull Request #1170 · scverse/squidpy

timtreis · 2026-05-08T18:05:13Z

Summary

Builds on #1157. Adds a follow-up pass that recovers cells that segmentation tiling split into 2-4 pieces. It detects facing cut edges across tile boundaries and assigns each candidate pair a transparent geometric score. The worst case, a cell split at a 4-tile corner, is handled.

Two public functions:

- squidpy.experimental.tl.stitch_tile_cuts(sdata, labels_key, ...) -- reads is_outlier=True cells from the QC table, extracts cut-edge candidates via bbox-edge alignment, scores each pair with a transparent geometric composite (see below), and assembles confident pairs into 2-4-piece groups via union-find with corner-junction validation. Writes 4 .obs columns to the existing QC table -- stitch_group_id, is_stitched, n_pieces, stitch_confidence -- plus a .uns["tiling_stitch"] audit trail (params, score formula, run summary). The labels element is never mutated.
- squidpy.experimental.im.make_stitched_labels(sdata, labels_key, ..., merge_strategy="sum", inplace=True) -- opt-in materialisation of a stitched labels element via a lazy dask LUT, plus a collapsed AnnData with one row per unique stitch_group_id (unstitched cells pass through unchanged, stitched groups collapse). Numeric .obs columns and .X aggregate via merge_strategy (sum/min/max/mean/median/first or callable); group-invariant + non-numeric columns always take "first". Preserves .uns, .var, and any user-added obs columns.
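
The grouping step can be illustrated with a small union-find sketch. This is not the PR's implementation: the function name `group_pairs`, the `(label_a, label_b, confidence)` tuple layout, and the rule of discarding groups outside 2-4 pieces are assumptions made for illustration, and the corner-junction validation is not reproduced.

```python
from collections import defaultdict


def group_pairs(pairs, min_confidence=0.7, max_pieces=4):
    """Group label ids connected by confident pairs (illustrative only).

    pairs: iterable of (label_a, label_b, confidence) tuples.
    Returns a sorted list of sorted label-id lists, one per accepted group.
    """
    parent = {}

    def find(x):
        # Path-halving find; unseen ids start as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as root so results are deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    for a, b, conf in pairs:
        if conf >= min_confidence:
            union(a, b)

    members = defaultdict(list)
    for label in list(parent):
        members[find(label)].append(label)

    # Assumption: groups outside 2..max_pieces are rejected outright.
    return sorted(sorted(g) for g in members.values() if 2 <= len(g) <= max_pieces)


pairs = [(11, 12, 0.91), (12, 13, 0.82), (20, 21, 0.55), (30, 31, 0.74)]
groups = group_pairs(pairs)
print(groups)
```

Pairs below the threshold never enter the structure, so labels 20 and 21 stay unstitched, while 11, 12 and 13 merge into one three-piece group through their shared member 12.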

Re-running calculate_tiling_qc now warns and drops stale stitch columns when the QC table is overwritten.

How `stitch_confidence` is computed

For each candidate pair the four geometric / shape-quality features below are averaged into a single score in [0, 1]. No coefficients are fitted or shipped -- the formula is the entire model and is recorded in .uns["tiling_stitch"]["score_formula"].

| feature | what it captures | range |
| --- | --- | --- |
| `iou` | 1-D intersection-over-union of the two cut-edge extents along the boundary | [0, 1] |
| `endpoint_match` | how closely the chord endpoints coincide -- true cuts share endpoints, unrelated cells don't | [0, 1] |
| `merge_compactness` | `4πA / P^2` (A = area, P = perimeter) of the union mask after morphologically closing the seam gap. Real cells are reasonably compact; false merges produce irregular, elongated perimeters | [0, 1] |
| `merge_solidity` | union mask area / convex hull area. Real cells are convex-ish; false merges have concave joins | [0, 1] |

stitch_confidence = (iou + endpoint_match + merge_compactness + merge_solidity) / 4
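
As a worked example of the formula, the sketch below computes a 1-D IoU for two cut-edge extents and averages it with three other feature values. The helper names and the treatment of extents as continuous `(lo, hi)` intervals are assumptions; how the PR measures extents in pixels may differ, and the three non-IoU values are made up for the example.

```python
def interval_iou(a, b):
    """1-D intersection-over-union of two intervals given as (lo, hi)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def stitch_confidence(iou, endpoint_match, merge_compactness, merge_solidity):
    """Unweighted mean of the four features, each expected in [0, 1]."""
    return (iou + endpoint_match + merge_compactness + merge_solidity) / 4


# Two cut edges along the same seam, offset by two pixels.
iou = interval_iou((100, 120), (102, 122))  # 18 / 22
score = stitch_confidence(iou, endpoint_match=0.9, merge_compactness=0.8, merge_solidity=0.95)
print(round(iou, 3), round(score, 3))
```

With these inputs the pair scores roughly 0.87 and would pass the default min_confidence of 0.7.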

A gap_score is also computed but only used as a hard filter (already inside max_gap by construction); it does not enter the score. The two merge_* features are computed by materialising a tight crop around the union of the candidate pieces, closing the gap with a disk(3) structuring element, and running regionprops on the largest connected component.
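
To get a feel for how the two merge_* features separate plausible cells from false merges, the following sketch evaluates them on idealised continuous shapes. It is an illustration only: the PR measures area, perimeter and convex hull on pixel masks via regionprops, so real values will deviate from these analytic ones.

```python
import math


def compactness(area, perimeter):
    """Isoperimetric quotient 4*pi*A / P**2; equals 1.0 for a perfect circle."""
    return 4 * math.pi * area / perimeter**2


r = 10.0
circle = compactness(math.pi * r**2, 2 * math.pi * r)
square = compactness(r**2, 4 * r)

# Two tangent circles traced as one outline: double the area, double the perimeter.
dumbbell = compactness(2 * math.pi * r**2, 4 * math.pi * r)

# Solidity of the same shape: its convex hull is a stadium
# (two half-discs of radius r plus a 2r x 2r square between the centres).
dumbbell_solidity = (2 * math.pi * r**2) / (math.pi * r**2 + 4 * r**2)

print(circle, square, dumbbell, dumbbell_solidity)
```

A circle scores 1.0 and a square about 0.79 on compactness, while two unrelated round cells that merely touch drop to 0.5, with a solidity near 0.88.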

min_confidence is a threshold on this mean; 0.7 is the default starting point. Tune for your data -- the score is heuristic, dataset-independent, and not a calibrated probability. Review false positives / negatives via make_stitched_labels and the visual test fixture, then adjust.

Adds two public functions building on the tile-boundary QC outliers:

- squidpy.experimental.tl.stitch_tile_cuts: pairs facing cut edges across tile boundaries (bbox-edge alignment + IoU + endpoint match), scores each pair via a frozen L2 logistic regression on geometric + shape-quality features (merge_solidity, merge_compactness), and assembles confident pairs into 2-4-piece groups via union-find with corner-junction validation. Writes 4 .obs columns to the existing QC table (stitch_group_id, is_stitched, n_pieces, stitch_confidence) plus a .uns['tiling_stitch'] audit trail. Labels element is never modified.
- squidpy.experimental.im.make_stitched_labels: opt-in materialisation of a stitched labels element via a lazy dask LUT, plus a collapsed AnnData with one row per unique stitch_group_id. Numeric .obs columns and .X aggregate via merge_strategy (sum/min/max/mean/median/first or callable, default sum); group-invariant columns and non-numerics use first. Preserves the QC table's .uns and any user-added .obs columns.

Also:

- calculate_tiling_qc now warns and drops stale stitch columns when the QC table is overwritten on re-run.
- Frozen logistic-regression coefficients trained on 2197 synthetic pairs across 50 scenarios; 5-fold CV Brier 0.025; cross-scenario precision 0.93+ at threshold 0.9 on held-out dense data.

Tests: 24 new unit tests covering cut-edge contracts, .obs/.uns/X preservation, merge strategies (str + callable), corner-junction validation, group-invariant column handling, idempotency, error paths, and end-to-end QC->stitch->remap flow on the existing fixture.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…n, multi-scale

- Extract resolve_labels_array as a shared helper in _tiling_qc.py; both stitch_tile_cuts and make_stitched_labels import it instead of carrying near-duplicate inline copies.
- Add return type annotations on stitch_tile_cuts (-> AnnData | None) and make_stitched_labels (-> dict | None).
- Drop the plans/ reference in stitch_tile_cuts comment; replace with a self-contained explanation of the three .obs states.
- Hoist group_sizes definition out of its conditional so the later reference is unconditionally defined (was relying on short-circuit).
- Convert n_pieces_distribution dict keys to str so .uns round-trips cleanly through zarr.
- Vectorise _aggregate_X for built-in strategies (sum/min/max/mean/median/first) using axis=0 numpy reductions; callable strategies still go through the per-column pd.Series fallback.
- Validate label_id / stitch_group_id fit in the labels' integer dtype in _build_lookup; raise ValueError instead of silent truncation.
- Document group-invariant column handling in make_stitched_labels' public docstring.
- Warn when QC-flagged outlier label_ids are missing from the labels element (previously silent skip).
- Add inplace=False to make_stitched_labels: returns {"labels": ..., "table": ...} without mutating sdata.
- TODO note in _compute_outlier_bboxes for the pre-mask-with-isin optimisation when outliers are sparse.

Tests: add multi-scale unit + end-to-end coverage for resolve_labels_array and the QC -> stitch -> make_stitched_labels chain via Labels2DModel.parse with scale_factors=[2]. Add inplace=False tests for make_stitched_labels. 76 tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
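
The lookup-table remap and the dtype check mentioned in this commit can be sketched as follows. The function `build_lookup` and its signature are hypothetical stand-ins, not the PR's `_build_lookup`, and the eager NumPy indexing replaces the lazy dask application the PR describes.

```python
import numpy as np


def build_lookup(label_to_group, max_label, dtype):
    """Identity lookup table with stitched labels redirected to group ids.

    Hypothetical helper: raises instead of silently wrapping ids that
    do not fit in the labels' integer dtype.
    """
    limit = np.iinfo(dtype).max
    highest = max([max_label, *label_to_group.values()])
    if highest > limit:
        raise ValueError(f"id {highest} does not fit in {np.dtype(dtype).name}")
    lut = np.arange(max_label + 1, dtype=dtype)
    for label, group in label_to_group.items():
        lut[label] = group
    return lut


labels = np.array([[0, 1, 1], [0, 2, 2], [3, 3, 0]], dtype=np.uint8)

# Labels 1 and 2 are two pieces of one cell; both map to group id 4.
lut = build_lookup({1: 4, 2: 4}, max_label=int(labels.max()), dtype=labels.dtype)
stitched = lut[labels]  # fancy indexing applies the table per pixel
print(stitched)
```

Because the remap is a pure per-pixel table lookup, it can be applied chunk by chunk, which is what makes a lazy dask variant straightforward.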

Side-by-side panels of a hardcoded 100x100 px crop centred on the first horizontal tile seam (y=200) of the existing tile-boundary fixture. Cells share a stable random-colour palette across panels, so split cells appear as two different colours in "Before" and unify into one colour in "After". Dashed white line marks the seam.

The baseline lives at tests/_images/StitchVisual_seam_before_after.png and is downloaded from CI artifacts (per project convention). The test fails locally without it; passes once the baseline is in place.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Previous version baked in logistic-regression weights fit on synthetic disks. Wrong contract: those weights claim a calibration they can't honour on real data, and we don't want squidpy to ship a model that silently encodes a synthetic distribution.

Replace with an explicit formula:

stitch_confidence = mean(iou, endpoint_match, merge_compactness, merge_solidity)

All four features are dataset-independent geometry / shape signals in [0, 1]. No fitting, no shipped weights. Default min_confidence drops from 0.9 to 0.7 to match the new score's distribution; users tune for their data.

.uns["tiling_stitch"] now records score_features + score_formula instead of model_version / model_coefficients / model_intercept.

Drop plans/prototype_tiling_stitch.py (untracked scratch script that trained the now-removed coefficients).

Tests updated; 76 pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ater)

Locally rendered placeholder for TestStitchVisual::test_plot_seam_before_after. The repo convention is platform-correct baselines downloaded from CI visual_test_results artifacts; this branch can't get one until either #1157 merges to main or test.yaml grows a workflow_dispatch trigger. Once CI runs against this branch, overwrite this PNG with the artifact version.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…onents

By default make_stitched_labels remaps both pieces of a stitched cell to the same ID, leaving the cut stripe between them at 0 (background) -- so the result is a single label across multiple disconnected components. Some downstream tools (naive contour walks, polygon exporters) expect one-label-one-component and miscount.

Add join_labels=False (default) for the existing behaviour, join_labels=True to fill the gap. When True, single-pass regionprops finds each stitched group's bbox; binary_closing(disk(close_radius)) on the group mask; newly-closed pixels are written back only when they were 0 (background) so other cells are never overwritten. Forces materialisation of the labels array; cost is bounded by stitched-group bbox count.

Tests: connected-component count is >1 for some group when join=False and exactly 1 for every group when join=True; non-stitched cells' pixels are byte-identical before/after joining.

Visual: TestStitchVisual::test_plot_seam_join_labels -- side-by-side zoom showing the seam stripe filled when join_labels=True.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
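
The gap-filling rule (close the group mask, then write back only onto background) can be sketched with plain NumPy. This is a simplified stand-in rather than the PR's code: it uses a square window instead of disk(close_radius), works on the whole array instead of per-group bounding boxes, and the names `_window_reduce` and `join_group` are hypothetical.

```python
import numpy as np


def _window_reduce(mask, radius, dilate):
    """Square-window dilation (any) or erosion (all) using shifted views."""
    # Pad with False for dilation and True for erosion so borders are neutral.
    padded = np.pad(mask, radius, constant_values=not dilate)
    out = np.zeros_like(mask) if dilate else np.ones_like(mask)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            window = padded[dy:dy + h, dx:dx + w]
            out = (out | window) if dilate else (out & window)
    return out


def join_group(labels, group_id, radius=1):
    """Close gaps inside one group, writing only onto background pixels."""
    mask = labels == group_id
    closed = _window_reduce(_window_reduce(mask, radius, True), radius, False)
    out = labels.copy()
    out[closed & (labels == 0)] = group_id  # never overwrite other cells
    return out


# Group 4 is split by a one-pixel background stripe (row 2); cell 7 is a neighbour.
labels = np.array(
    [
        [4, 4, 4, 0, 7],
        [4, 4, 4, 0, 7],
        [0, 0, 0, 0, 7],
        [4, 4, 4, 0, 7],
        [4, 4, 4, 0, 7],
    ],
    dtype=np.uint8,
)
joined = join_group(labels, group_id=4)
print(joined)
```

After joining, the stripe in row 2 carries label 4, the background column next to cell 7 is still 0, and every pixel of cell 7 is unchanged.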

timtreis and others added 6 commits May 8, 2026 19:20

timtreis requested a review from selmanozleyen May 12, 2026 10:07

Merge branch 'feature/tiling-qc-v2' into feature/tiling-stitch

749e3c7

timtreis wants to merge 7 commits into feature/tiling-qc-v2 from feature/tiling-stitch
