feat: add harreman for metabolic exchange inference in spatial transcriptomics#3806
feat: add harreman for metabolic exchange inference in spatial transcriptomics#3806oieretxezarreta wants to merge 6 commits into
Conversation
for more information, see https://pre-commit.ci
ori-kron-wis
left a comment
There was a problem hiding this comment.
Main focus of review:
- Logic issues in several functions
- Old legacy syntax in the code
- Uploading data to scverse S3 and use pooch only
- Dead code: many functions, redundant imports
- Not all functions are covered in tests
- Seems we can do better in object-oriented programming. its all a bunch of functions now with no relation to each other. Classes can help a lot to organize it.
There was a problem hiding this comment.
Why do you have a local copy of Hotspot? isnt just using the latest from pip enough? if not, why not update and release a new version of hotspot in pip, so that Harreman and others will use? (major code reuse)
There was a problem hiding this comment.
I dont think we need all hotspot to be added to scvi-tools (it has its own pakcage for that) - just the parts that Harreman needs
| compute_neighbors_from_distances( | ||
| adata, | ||
| distances_obsp_key, | ||
| n_neighbors, |
There was a problem hiding this comment.
compute_neighbors_from_distances called with wrong positional args — knn.py:94
Called as: compute_neighbors_from_distances(adata, distances_obsp_key, n_neighbors, sample_key, verbose)
Defined as: def compute_neighbors_from_distances(adata, distances_obsp_key="distances", sample_key=None, verbose=False)
The function accepts 4 args max but receives 5 → TypeError at runtime. n_neighbors is silently passed as sample_key.
| if adata.uns.get("deconv_data", False): | ||
| if verbose: | ||
| print("Adding intra-spot connections...") | ||
| spot_diameter = adata.uns["spot_diameter"] |
There was a problem hiding this comment.
seems that very strict. so you are saying the adata of a user MUST have an und column that is called spot_diameter?or is that inferred automaticaly during the pre process step? (setup_anndata )
| cell_type_key: str, | ||
| database_varm_key: str, | ||
| sample_key: str | None, | ||
| spot_diameter: int, |
There was a problem hiding this comment.
Because it is mandatory (?) switch the place of it and sample_key which can get a None value by default.
| if len(inds) <= 1: | ||
| continue | ||
| for i, j in itertools.permutations(inds, 2): | ||
| distances[i, j] = spot_diameter / 2 |
There was a problem hiding this comment.
here for example, spot_diameter is not a parameter to the function, not part of any class that hold it and not inferred otherwise. how can this function run?
| if hasattr(out_adata.X, "A1") | ||
| else out_adata.X.sum(axis=1) > 0 | ||
| ) | ||
| out_adata._inplace_subset_obs(nonzero_mask) |
There was a problem hiding this comment.
can break without warning between AnnData versions. Use out_adata = out_adata[nonzero_mask].copy() instead.
| database_varm_key: str | None = None, | ||
| model: str | None = None, | ||
| genes: list | None = None, | ||
| use_metabolic_genes: bool | None = False, |
There was a problem hiding this comment.
why there always bool | None ? The type should be bool, not bool | None, when the default is False.
| @@ -0,0 +1,522 @@ | |||
| import time | |||
| import os | |||
There was a problem hiding this comment.
unused in file. beware of unused imports (this is soemthing pre-commits should fix for. you)
There was a problem hiding this comment.
Test coverage critically sparse
Only 7 tests cover
compute_knn_graph variants and basic autocorrelation. None of these are tested:
- compute_cell_communication / compute_ct_cell_communication
- create_modules, calculate_module_scores, calculate_super_module_scores
- extract_interaction_db / database loading
- datasets.load_* functions
- preprocessing.setup_deconv_adata
if they are not needed remove those functions, otherwise you will need to add tets for them
| "RESOLVI", | ||
| "SCVIVA", | ||
| "CYTOVI", | ||
| "harreman", |
There was a problem hiding this comment.
Please change to at least capital H at the beginning of Harreman
| notebooks/spatial/cell2location_lymph_node_spatial_tutorial | ||
| ``` | ||
|
|
||
| ```{customcard} |
There was a problem hiding this comment.
I saw 4 tutorials in harreman docs, is that the only one you consider to add?
Description
This PR adds Harreman (
scvi.external.harreman), a toolkit for inferringmetabolic exchanges in tissues using spatial transcriptomics data.
Changes
scvi.external.harremanwith submodules:tl(tools): KNN graph, cell communication, gene pairshs(hotspot): local autocorrelation, local correlation, gene modulespp(preprocessing): AnnData setup, interaction database loadingds(datasets): Visium and Slide-seq example datasetspl(plots): visualization utilitiestests/external/harreman/(7/7 passing)docs/tutorials/index_spatial.mdRelated