Commit 45325b1

Intron7, flying-sheep, Copilot, ilaykav, and pre-commit-ci[bot] authored
feat: add Harmony to scanpy (#3953)
Co-authored-by: Philipp A. <flying-sheep@web.de>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Ilay Kavitzky <ilay.kavitzky@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Cuiwei Gao <48gaocuiwei@gmail.com>
Co-authored-by: Jhonatan Felix <108437587+JhonatanFelix@users.noreply.github.com>
1 parent 20c1425 commit 45325b1

16 files changed

Lines changed: 1265 additions & 137 deletions

docs/api/preprocessing.md

Lines changed: 8 additions & 2 deletions
@@ -47,18 +47,24 @@ For visual quality control, see {func}`~scanpy.pl.highest_expr_genes` and
    pp.recipe_seurat
 ```

-## Batch effect correction
+(pp-data-integration)=

-Also see {ref}`data-integration`. Note that a simple batch correction method is available via {func}`pp.regress_out`. Checkout {mod}`scanpy.external` for more.
+## Data integration
+
+Batch effect correction and other data integration.
+Note that a simple batch correction method is available via {func}`pp.regress_out`.

 ```{eval-rst}
 .. autosummary::
    :nosignatures:
    :toctree: generated/

    pp.combat
+   pp.harmony_integrate
 ```

+Also see {ref}`data integration tools <data-integration>` and external {ref}`external data integration <external-data-integration>`.
+
 ## Doublet detection

 ```{eval-rst}

docs/conf.py

Lines changed: 1 addition & 0 deletions
@@ -176,6 +176,7 @@
 "pp.downsample_counts": (["np", "sp[csr]"], []),
 "pp.filter_cells": (["np", "sp", "da"], []),
 "pp.filter_genes": (["np", "sp", "da"], []),
+"pp.harmony_integrate": (["np"], []),
 "pp.highly_variable_genes": (["np", "sp", "da"], ["da[sp[csc]]"]),
 "pp.log1p": (["np", "sp", "da"], []),
 "pp.neighbors": (["np", "sp"], []),

docs/external/preprocessing.md

Lines changed: 5 additions & 3 deletions
@@ -5,6 +5,11 @@
 .. currentmodule:: scanpy.external
 ```

+Previously found here, but now part of scanpy’s main API:
+- {func}`scanpy.pp.harmony_integrate`
+- {func}`scanpy.pp.scrublet`
+- {func}`scanpy.pp.scrublet_simulate_doublets`
+
 (external-data-integration)=

 ## Data integration
@@ -14,10 +19,8 @@
    :toctree: ../generated/

    pp.bbknn
-   pp.harmony_integrate
    pp.mnn_correct
    pp.scanorama_integrate
-
 ```

 ## Sample demultiplexing
@@ -38,5 +41,4 @@ Note that the fundamental limitations of imputation are still under [debate](htt
    :toctree: ../generated/

    pp.magic
-
 ```

docs/references.bib

Lines changed: 10 additions & 0 deletions
@@ -657,6 +657,16 @@ @article{Ntranos2019
 pages = {163--166},
 }

+@article{Patikas2026,
+author = {Patikas, Nikolaos and Yao, Hongcheng and Madhu, Roopa and Raychaudhuri, Soumya and Hemberg, Martin and Korsunsky, Ilya},
+title = {Integration of large, complex single-cell datasets with Harmony2},
+url = {https://doi.org/10.1101/2026.03.16.711825},
+doi = {10.1101/2026.03.16.711825},
+journal = {bioRxiv},
+publisher = {Cold Spring Harbor Laboratory},
+year = {2026},
+}
+
 @article{Paul2015,
 author = {Paul, Franziska and Arkin, Ya’ara and Giladi, Amir and Jaitin, Diego Adhemar and Kenigsberg, Ephraim and Keren-Shaul, Hadas and Winter, Deborah and Lara-Astiaso, David and Gury, Meital and Weiner, Assaf and David, Eyal and Cohen, Nadav and Lauridsen, Felicia Kathrine Bratt and Haas, Simon and Schlitzer, Andreas and Mildner, Alexander and Ginhoux, Florent and Jung, Steffen and Trumpp, Andreas and Porse, Bo Torben and Tanay, Amos and Amit, Ido},
 title = {Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors},

docs/release-notes/1.10.0.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Some highlights:
 * {func}`scanpy.datasets.blobs` now accepts a `random_state` argument {pr}`2683` {smaller}`E Roellin`
 * {func}`scanpy.pp.pca` and {func}`scanpy.pp.regress_out` now accept a layer argument {pr}`2588` {smaller}`S Dicks`
 * {func}`scanpy.pp.subsample` with `copy=True` can now be called in backed mode {pr}`2624` {smaller}`E Roellin`
-* {func}`scanpy.external.pp.harmony_integrate` now runs with 64 bit floats improving reproducibility {pr}`2655` {smaller}`S Dicks`
+* {func}`scanpy.pp.harmony_integrate` now runs with 64 bit floats improving reproducibility {pr}`2655` {smaller}`S Dicks`
 * {func}`scanpy.tl.rank_genes_groups` no longer warns that it's default was changed from t-test_overestim_var to t-test {pr}`2798` {smaller}`L Heumos`
 * `scanpy.pp.calculate_qc_metrics` now allows `qc_vars` to be passed as a string {pr}`2859` {smaller}`N Teyssier`
 * {func}`scanpy.tl.leiden` and {func}`scanpy.tl.louvain` now store clustering parameters in the key provided by the `key_added` parameter instead of always writing to (or overwriting) a default key {pr}`2864` {smaller}`J Fan`

docs/release-notes/1.11.0.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Release candidates:

 #### Documentation

-- {guilabel}`rc1` Improve {func}`~scanpy.external.pp.harmony_integrate` docs {smaller}`D Kühl` ({pr}`3362`)
+- {guilabel}`rc1` Improve {func}`~scanpy.pp.harmony_integrate` docs {smaller}`D Kühl` ({pr}`3362`)
 - {guilabel}`rc1` Raise {exc}`FutureWarning` when calling deprecated {mod}`scanpy.pp` functions {smaller}`P Angerer` ({pr}`3380`)
 - {guilabel}`rc1` {smaller}`P Angerer` ({pr}`3407`)

docs/release-notes/3953.feat.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Add {func}`scanpy.pp.harmony_integrate` with Harmony1 and Harmony2 support for batch correction {smaller}`S Dicks, P Angerer`

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -88,7 +88,6 @@ bbknn = [ "bbknn" ]
 dask = [ "anndata[dask]", "dask[array]>=2024.5.1" ]
 # PCA acceleration
 dask-ml = [ "dask-ml", "scanpy[dask]" ]
-harmony = [ "harmonypy" ]
 leiden = [ "igraph>=0.10.8", "leidenalg>=0.10.1" ]
 louvain = [ "igraph", "louvain>=0.8.2", "setuptools" ]
 magic = [ "magic-impute>=2.0.4" ]
@@ -137,6 +136,7 @@ doc = [
 ]
 test-min = [
 "dependency-groups", # for CI scripts doctests
+"pooch",
 "pytest",
 "pytest-cov", # only for use from VS Code
 "pytest-mock",

src/scanpy/external/pp/__init__.py

Lines changed: 21 additions & 8 deletions
@@ -3,24 +3,37 @@
 from __future__ import annotations

 from ..._compat import deprecated
-from ...preprocessing import _scrublet
 from ._bbknn import bbknn
-from ._harmony_integrate import harmony_integrate
 from ._hashsolo import hashsolo
 from ._magic import magic
 from ._mnn_correct import mnn_correct
 from ._scanorama_integrate import scanorama_integrate

-scrublet = deprecated("Import from sc.pp instead")(_scrublet.scrublet)
-scrublet_simulate_doublets = deprecated("Import from sc.pp instead")(
-    _scrublet.scrublet_simulate_doublets
-)
-
 __all__ = [
     "bbknn",
-    "harmony_integrate",
     "hashsolo",
     "magic",
     "mnn_correct",
     "scanorama_integrate",
 ]
+
+
+@deprecated("Import from sc.pp instead")
+def harmony_integrate(*args, **kwargs):
+    from ...preprocessing import harmony_integrate
+
+    return harmony_integrate(*args, **kwargs)
+
+
+@deprecated("Import from sc.pp instead")
+def scrublet(*args, **kwargs):
+    from ...preprocessing import scrublet
+
+    return scrublet(*args, **kwargs)
+
+
+@deprecated("Import from sc.pp instead")
+def scrublet_simulate_doublets(*args, **kwargs):
+    from ...preprocessing import scrublet_simulate_doublets
+
+    return scrublet_simulate_doublets(*args, **kwargs)
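
The three shims above share one pattern: a thin wrapper that warns on use and imports the relocated function only when it is called. The following is a minimal, self-contained sketch of that pattern, not scanpy's code. The `deprecated` decorator here is a simplified, hypothetical stand-in for `scanpy._compat.deprecated`, the choice of `FutureWarning` is an assumption, and `math.sqrt` stands in for the function that moved.

```python
import warnings
from functools import wraps


def deprecated(message):
    """Hypothetical stand-in for a `deprecated` decorator: warn on every call."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller of the shim.
            # The FutureWarning category is an assumption for this sketch.
            warnings.warn(message, category=FutureWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("Import from math instead")
def legacy_sqrt(*args, **kwargs):
    # Deferred import: the real implementation is looked up only when the
    # shim is called, not when the module defining the shim is imported.
    from math import sqrt

    return sqrt(*args, **kwargs)
```

Compared with wrapping an eagerly imported function, as the removed `scrublet = deprecated(...)(_scrublet.scrublet)` lines did, deferring the import keeps the old module from importing the new location at import time.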

src/scanpy/external/pp/_harmony_integrate.py

Lines changed: 0 additions & 98 deletions
This file was deleted.
