fix: raise ValueError when combat batch has fewer than 2 cells #3994

Merged
flying-sheep merged 3 commits into scverse:main from LiudengZhang:fix/combat-single-cell-batch
Mar 14, 2026

@LiudengZhang
Contributor

What

Raise a ValueError when sc.pp.combat receives a batch with fewer than 2 cells.

Why

ComBat estimates within-batch variance, which requires at least 2 observations.
When a batch contains only 1 cell, the within-batch variance estimate is
degenerate, and the resulting division by zero silently fills the entire
corrected matrix with NaN. Downstream functions like highly_variable_genes
then fail with cryptic errors.
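
To illustrate the failure mode described above, here is a minimal standard-library sketch. It is not scanpy's code: `sample_variance_or_nan` is a hypothetical helper that returns NaN for an undefined variance, which is an assumption about how array libraries typically behave in this case.

```python
import math
import statistics


def sample_variance_or_nan(values):
    """Unbiased sample variance, or NaN when it is undefined.

    Hypothetical helper for illustration only. statistics.variance
    raises StatisticsError for fewer than two data points; here that
    case is mapped to NaN instead.
    """
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        return math.nan


# A batch with two cells has a well-defined variance for this gene.
two_cells = sample_variance_or_nan([1.0, 3.0])

# A batch with a single cell does not.
one_cell = sample_variance_or_nan([2.0])

# Scaling by a NaN variance turns every value it touches into NaN.
scaled = [(x - 2.0) / math.sqrt(one_cell) for x in [1.0, 2.0, 3.0]]
```

Once one batch contributes a NaN variance, every later arithmetic step that uses it yields NaN as well, which matches the all-NaN matrix reported in the linked issue.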

How

Added validation after batch grouping that checks all batch sizes ≥ 2. Raises a
clear error listing the offending batch names, so users know which batches to
filter before running combat.
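
The check described above might look roughly like the following sketch. The function name `check_batch_sizes`, the `min_cells` parameter, and the error message wording are all assumptions for illustration; the merged code in `_combat.py` reuses an existing groupby object and may differ in detail.

```python
from collections import Counter


def check_batch_sizes(batch_labels, min_cells=2):
    """Raise ValueError if any batch has fewer than `min_cells` cells.

    Hypothetical stand-in for the validation added in this PR.
    `batch_labels` holds one batch label per cell.
    """
    counts = Counter(batch_labels)
    # Collect the offending batch names so the user knows what to filter.
    too_small = sorted(str(b) for b, n in counts.items() if n < min_cells)
    if too_small:
        raise ValueError(
            f"Each batch needs at least {min_cells} cells; "
            f"too few cells in: {', '.join(too_small)}"
        )


# Batch "b" has a single cell, so validation fails before any computation.
try:
    check_batch_sizes(["a", "a", "b"])
except ValueError as err:
    message = str(err)
else:
    message = None
```

A caller who hits this error can drop the listed batches (or merge them into larger ones) and then rerun combat.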

Testing

Added test_combat_single_cell_batch — creates a dataset with a single-cell
batch and confirms the ValueError is raised. All 5 existing combat tests
continue to pass.

Closes #1175

Previously, `sc.pp.combat` would silently produce NaN values when a
batch contained only 1 cell, because within-batch variance cannot be
estimated from a single observation. This adds input validation to
raise a clear error before computation begins.

Closes scverse#1175
@codecov

codecov Bot commented Mar 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.97%. Comparing base (ee7cd17) to head (99a6e23).
⚠️ Report is 3 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3994      +/-   ##
==========================================
+ Coverage   77.96%   77.97%   +0.01%     
==========================================
  Files         118      118              
  Lines       12517    12523       +6     
==========================================
+ Hits         9759     9765       +6     
  Misses       2758     2758              
Flag Coverage Δ
hatch-test.low-vers 77.25% <100.00%> (+0.01%) ⬆️
hatch-test.pre 76.91% <100.00%> (+0.01%) ⬆️

Files with missing lines Coverage Δ
src/scanpy/preprocessing/_combat.py 88.72% <100.00%> (+0.34%) ⬆️

... and 1 file with indirect coverage changes

@LiudengZhang
Contributor Author

Hi, just checking if this PR is ready for review. CI is green and tests pass. Happy to address any feedback!

@flying-sheep flying-sheep added this to the 1.12.1 milestone Mar 13, 2026
@flying-sheep
Member

OK, I simplified this a bit by reusing the groupby object, looks great!

@LiudengZhang
Contributor Author

Thanks for the simplification — reusing the groupby dict is much cleaner!

@flying-sheep flying-sheep enabled auto-merge (squash) March 14, 2026 11:33
@flying-sheep flying-sheep merged commit 02fb2cc into scverse:main Mar 14, 2026
19 of 21 checks passed
meeseeksmachine pushed a commit to meeseeksmachine/scanpy that referenced this pull request Mar 14, 2026
flying-sheep pushed a commit that referenced this pull request Mar 14, 2026
… batch has fewer than 2 cells) (#4001)

Co-authored-by: LiudengZhang <99156394+LiudengZhang@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Combat will return all NaNs in adata.X if one of the batches contains only 1 cell
