Effect of low counts on cellcharter niches #115

jmintch · 2026-03-25T14:26:47Z


Dear cellcharter team,
first of all, thanks a lot for creating this tool. So far I have had a lot of fun exploring it!

Disclaimer: I'm still quite new to dataset training and Python.

I was just starting to analyze my first spatial transcriptomics dataset, and when I ran the package with the automated prediction of k, I obtained 3 niches. One niche seemed to correlate with tissue areas that have lower UMI counts, but which I know to be biologically unrelated/distinct. Manually increasing k separated those areas into different niches (even though the stability was much lower), but I am still unsure whether one of the remaining niches is just a "low count" niche, even though its cells might biologically belong to several other niches.

I was wondering whether the identification of spatial niches can be influenced by local tissue quality or low UMI counts, and whether there is a way to control for this in the analysis. I know that in scRNA-seq analyses there are ways to regress out the number of UMIs per cell, but I am not sure if there's a way to translate this to spatial transcriptomics data and specifically to the cellcharter pipeline.

I'd be very curious to hear your opinion on it @marcovarrone!

Answered by LiudengZhang


LiudengZhang · 2026-03-25T18:59:19Z


Low-UMI regions (tissue edges, poor permeabilization) can end up creating artificial niches because the aggregated neighborhood features are uniformly low, and the GMM picks that up as its own cluster.

If you're using scVI in the pipeline, the latent space already accounts for library size to some extent, but it's not always perfect for very low-count cells. Some things that might help: plotting total_counts spatially alongside niche labels to check if they co-localize (which would suggest it's technical rather than biological). Filtering low-quality cells more aggressively with sc.pp.filter_cells() could also help, or passing continuous_covariate_keys=['total_counts'] to scvi.model.SCVI.setup_anndata() to give the model more signal to factor out depth.
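The co-localization check suggested above can also be put into numbers. The following is a minimal sketch, assuming the niche labels live in an obs column named cluster_cellcharter and the per-cell depth in total_counts. Both column names, the helper name summarize_depth_per_niche, and the 10% cutoff are illustrative assumptions, not part of CellCharter.

```python
import pandas as pd


def summarize_depth_per_niche(
    obs: pd.DataFrame,
    niche_key: str = "cluster_cellcharter",  # assumed name of the niche label column
    counts_key: str = "total_counts",        # assumed name of the per-cell UMI total
    low_quantile: float = 0.10,              # illustrative cutoff for "low depth"
) -> pd.DataFrame:
    """Summarize sequencing depth per niche from an AnnData-style obs table."""
    # Cells at or below this global quantile are flagged as low depth.
    threshold = obs[counts_key].quantile(low_quantile)
    is_low = obs[counts_key] <= threshold

    summary = obs.groupby(niche_key, observed=True).agg(
        n_cells=(counts_key, "size"),
        median_counts=(counts_key, "median"),
    )
    # Fraction of each niche made up of low-depth cells.
    summary["frac_low_depth"] = is_low.groupby(obs[niche_key], observed=True).mean()
    return summary.sort_values("median_counts")


# Hypothetical usage on an AnnData object:
# print(summarize_depth_per_niche(adata.obs))
```

A niche whose median depth is far below the others and whose frac_low_depth is close to 1 is a candidate technical niche. If total_counts is not yet in adata.obs, sc.pp.calculate_qc_metrics(adata, inplace=True) adds it, and sq.pl.spatial_scatter(adata, color=["total_counts", "cluster_cellcharter"], shape=None) is one way to view both side by side, assuming squidpy is installed.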

Worth noting that sc.pp.regress_out probably wouldn't help here since CellCharter clusters on the scVI latent space, not on adata.X directly.
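The filtering and covariate suggestions could look roughly like the sketch below. It assumes that raw counts are stored in adata.layers["counts"]; the layer name, the min_counts value, and both helper names are illustrative assumptions and should be adapted to the dataset. The filter is written as a mask on the raw-count layer, which is similar in spirit to sc.pp.filter_cells(adata, min_counts=...) but does not depend on what adata.X currently holds.

```python
import numpy as np


def per_cell_total_counts(counts):
    """Sum raw counts per cell; accepts dense arrays and scipy sparse matrices."""
    return np.asarray(counts.sum(axis=1)).ravel()


def fit_scvi_with_depth_covariate(
    adata,
    counts_layer="counts",  # assumption: raw counts are kept in this layer
    batch_key=None,         # e.g. a sample column, if the data has one
    min_counts=20,          # illustrative threshold, tune per dataset
    max_epochs=None,        # None lets scvi-tools choose its default
):
    """Filter low-depth cells, then train scVI with total_counts as a covariate."""
    # Imported lazily because scvi-tools is a heavy dependency.
    import scvi

    # Drop cells whose raw UMI total is below the threshold.
    totals = per_cell_total_counts(adata.layers[counts_layer])
    adata = adata[totals >= min_counts].copy()
    adata.obs["total_counts"] = per_cell_total_counts(adata.layers[counts_layer])

    # Register depth as a continuous covariate, as suggested in the answer above.
    scvi.model.SCVI.setup_anndata(
        adata,
        layer=counts_layer,
        batch_key=batch_key,
        continuous_covariate_keys=["total_counts"],
    )
    model = scvi.model.SCVI(adata)
    model.train(max_epochs=max_epochs)

    # Store the latent space under an obsm key of our choosing.
    adata.obsm["X_scVI"] = model.get_latent_representation()
    return adata, model
```

The resulting adata.obsm["X_scVI"] would then replace the previous embedding as the input to the rest of the CellCharter workflow.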


marcovarrone (Maintainer) · 2026-03-25T21:17:39Z

I could not have answered better, @LiudengZhang! Thank you!
@jmintch are you using scVI for the embedding? Because, as Liudeng said, it should take the total counts into account.
You could try regressing out the total counts, running PCA on the result, and using those features instead of the scVI ones, but I have to warn you that I have always seen PCA underperform compared to scVI embeddings.
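To make the regress-out-then-PCA alternative concrete, here is a NumPy-only sketch of the underlying idea. It is not CellCharter or scanpy code: it fits a per-gene linear model on total counts, keeps the residuals, scales them, and projects onto principal components. The function name and the toy input layout (cells in rows, genes in columns, log-normalized values) are assumptions for illustration.

```python
import numpy as np


def regress_out_and_pca(log_expr, total_counts, n_components=2):
    """Remove the linear effect of total counts per gene, then compute PCA scores."""
    X = np.asarray(log_expr, dtype=float)        # shape: (n_cells, n_genes)
    depth = np.asarray(total_counts, dtype=float)

    # Design matrix with an intercept and the depth covariate.
    design = np.column_stack([np.ones_like(depth), depth])

    # Least-squares fit for all genes at once; residuals carry no linear depth signal.
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    residuals = X - design @ coef

    # Scale genes to unit variance, leaving (near-)constant genes untouched.
    std = residuals.std(axis=0, ddof=1)
    std[std < 1e-12] = 1.0
    scaled = residuals / std

    # PCA via SVD; the intercept already centered every gene.
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

In a scanpy workflow this roughly corresponds to sc.pp.regress_out(adata, ["total_counts"]) followed by sc.pp.scale(adata) and sc.pp.pca(adata), with adata.obsm["X_pca"] then used in place of the scVI representation. The caveat above about PCA underperforming scVI still applies.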


jmintch (Author) · 2026-03-27T09:08:28Z

Thanks for your input, that helped a lot to better understand which approaches are more promising than others.

@marcovarrone yes, I was using scVI; I was following your tutorial on the CosMx dataset quite closely. @LiudengZhang thanks a lot for your suggestion of passing continuous_covariate_keys=['total_counts']. I just tried it out and I think it's working much better. I really appreciate it!

