Hi everyone,

I have some doubts about the best order of Seurat functions when integrating a large number of samples. I've been exploring the new features like BPCells and Sketch, but given the scale of my dataset (~800 samples), I'm looking for a more efficient integration strategy.
Since sample identity is my main source of variation, I was thinking of the following approach:

1. Split by sample and normalize the data to remove sample-specific variance.
2. Join layers after normalization.
3. Split by study (~30 studies) and proceed with the full integration. After this, I would continue with FindVariableFeatures and would not re-normalize the data.
The idea behind this is to reduce the number of layers going into integration, correct for sequencing-depth differences at the sample level, and better handle samples with low cell counts (e.g., ~120 cells) without having to drastically adjust parameters like k.weight. However, will this affect the integrated object? I'm not sure whether NormalizeData behaves differently when layers are split by study or by sample. If it doesn't, why does Seurat calculate NormalizeData on all counts together?
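To make the order of operations concrete, here is a rough sketch of what I have in mind (assuming a Seurat v5 object called `obj` with `sample` and `study` metadata columns; the integration method and reduction names are just placeholders, not settled choices):

```r
library(Seurat)

# 1. Split by sample so NormalizeData runs on each sample's counts layer separately
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$sample)
obj <- NormalizeData(obj)

# 2. Join the ~800 per-sample layers back together after normalization
obj <- JoinLayers(obj)

# 3. Re-split by study (~30 layers) and run the usual downstream steps,
#    without re-running NormalizeData
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$study)
obj <- FindVariableFeatures(obj)
obj <- ScaleData(obj)
obj <- RunPCA(obj)
obj <- IntegrateLayers(
  object         = obj,
  method         = RPCAIntegration,   # placeholder; could be CCA, Harmony, etc.
  orig.reduction = "pca",
  new.reduction  = "integrated.rpca"
  # k.weight could still be lowered here if any study layer remains very small
)
```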
One potential issue I see is that underrepresented cell types may not integrate properly. Could this be mitigated by sketching each sample (SketchData) before integration?
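For example, something along these lines (again only a sketch; the ncells value and the leverage-score method are just starting points, and I haven't verified how SketchData handles samples with fewer cells than ncells):

```r
# Hypothetical: sketch each sample before integration so that small samples
# still contribute a representative set of cells
obj[["RNA"]] <- split(obj[["RNA"]], f = obj$sample)
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj)     # leverage scores need variable features

obj <- SketchData(
  object         = obj,
  ncells         = 500,              # per-layer target; I assume smaller samples keep all their cells
  method         = "LeverageScore",
  sketched.assay = "sketch"
)

DefaultAssay(obj) <- "sketch"
# ...then integrate on the sketched assay and project the result back to the
# full dataset (e.g., ProjectIntegration() / ProjectData()) afterwards.
```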
Does this approach make sense from a statistical and technical perspective? Are there any potential issues I should be aware of?
Looking forward to hearing your thoughts!
Thanks in advance,
Pep