Description
Hi Seurat Team (also tagging @dcollins15 because he helped shepherd #6792 a few months ago),
I've started to notice in my own analysis that the results of the new leidenbase implementation can differ significantly from those of the reticulate passthrough. This on its own isn't entirely surprising, as some parameters are bound to be slightly different.
However, what I have found in my testing is that the igraph implementation most often aligns much more closely with the output of the leiden reticulate passthrough and is considerably faster. I have been testing with the BPCells implementation, cluster_graph_leiden (https://bnprks.github.io/BPCells/reference/cluster.html).
In testing on the hcabm40K dataset, the igraph implementation was ~4x faster than the current FindClusters with leidenbase.
I know it was just changed, but I'm wondering whether a move to the igraph implementation should be considered. As a pro, it would remove a dependency.
Below is just one example, using the pbmc3k dataset, showing better alignment of igraph with the old leiden passthrough.
I wanted to put this out there for the team to evaluate for yourselves and decide whether the switch should be made. If the answer is yes, I'm happy to work on a PR for it, so just let me know.
Best,
Sam
leidenbase (default) vs. igraph
library(tidyverse)
library(Seurat)
library(scCustomize)
pbmc <- pbmc3k.SeuratData::pbmc3k.final
pbmc <- UpdateSeuratObject(pbmc)
pbmc <- NormalizeData(object = pbmc) %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()
pbmc <- FindNeighbors(pbmc, dims = 1:15)
# leidenbase (current default)
pbmc <- FindClusters(object = pbmc, resolution = 0.5, algorithm = 4)
pbmc <- RunUMAP(pbmc, dims = 1:15)
# BPCells/igraph
bpcells_leiden <- data.frame(BPCells::cluster_graph_leiden(snn = pbmc@graphs$RNA_snn, resolution = 0.5, seed = 1))
colnames(bpcells_leiden) <- c("bpcells_igraph")
pbmc <- AddMetaData(pbmc, metadata = bpcells_leiden)
# Plot
DimPlot_scCustom(pbmc) + ggtitle("leidenbase")
DimPlot_scCustom(pbmc, group.by = "bpcells_igraph", label = TRUE)
leidenalg pass through
# revert to v5.1.0 where reticulate leiden was the default
install.packages("~/Downloads/Seurat_5.1.0.tar.gz", type = "source", repos = NULL)
library(tidyverse)
library(Seurat)
library(scCustomize)
pbmc <- pbmc3k.SeuratData::pbmc3k.final
pbmc <- UpdateSeuratObject(pbmc)
pbmc <- NormalizeData(object = pbmc) %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()
pbmc <- FindNeighbors(pbmc, dims = 1:15)
# leidenalg passthrough via reticulate (default in v5.1.0)
pbmc <- FindClusters(object = pbmc, resolution = 0.5, algorithm = 4)
pbmc <- RunUMAP(pbmc, dims = 1:15)
# plot
DimPlot_scCustom(pbmc) + ggtitle("leiden_reticulate")
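To put a number on the alignment rather than relying on the UMAP plots alone, one option is the adjusted Rand index (ARI) between pairs of label vectors. The sketch below is base R only. The helper name `adjusted_rand_index` and the toy label vectors are illustrative assumptions, not part of Seurat or BPCells, and comparing against the v5.1.0 labels would require saving them from that session first.

```r
# Adjusted Rand index between two clusterings of the same cells (base R only).
# Hypothetical helper for illustration; not part of Seurat or BPCells.
adjusted_rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                        # contingency table of label pairs
  choose2 <- function(x) x * (x - 1) / 2    # number of unordered pairs
  sum_cells <- sum(choose2(tab))
  sum_rows  <- sum(choose2(rowSums(tab)))
  sum_cols  <- sum(choose2(colSums(tab)))
  total     <- choose2(length(a))
  expected  <- sum_rows * sum_cols / total  # agreement expected by chance
  max_index <- (sum_rows + sum_cols) / 2
  if (max_index == expected) return(1)      # degenerate case: trivial partitions
  (sum_cells - expected) / (max_index - expected)
}

# Toy labels standing in for real cluster assignments (made up for illustration).
toy_reference <- c(1, 1, 1, 2, 2, 2, 3, 3)
toy_relabeled <- c("b", "b", "b", "c", "c", "c", "a", "a")  # same partition, new names
toy_shifted   <- c(1, 1, 2, 2, 2, 3, 3, 3)                  # a different partition

ari_same <- adjusted_rand_index(toy_reference, toy_relabeled)
ari_diff <- adjusted_rand_index(toy_reference, toy_shifted)

# In the first session above, the equivalent call would be:
# adjusted_rand_index(pbmc$seurat_clusters, pbmc$bpcells_igraph)
```

ARI is invariant to how clusters are numbered, so it is not affected by the different label orderings the three implementations produce. A value of 1 means identical partitions, and values near 0 mean agreement no better than chance.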