Issues with leiden? Switch to igraph implementation #9800

Open
@samuel-marsh

Hi Seurat Team (also tagging @dcollins15 because he helped shepherd #6792 a few months ago),

I've started to notice in my own analysis that the results of the new leidenbase implementation can differ significantly from those of the reticulate passthrough. This on its own isn't entirely surprising, as some parameters are bound to be slightly different.

However, what I have found in my testing is that the igraph implementation most often aligns much more closely with the results of the leiden reticulate passthrough and is considerably faster. I have been testing using the BPCells implementation, cluster_graph_leiden (https://bnprks.github.io/BPCells/reference/cluster.html).

In testing on the hcabm40K dataset, the igraph implementation was ~4x faster than the current FindClusters with leidenbase.

I know it was just changed, but I'm wondering whether a move to the igraph implementation should be considered. As a pro, it would remove a dependency.

Below is just one example, using the pbmc3k dataset, showing better alignment of igraph with the old leiden passthrough.

I wanted to put this out there for the team to evaluate for yourselves and decide whether the switch should be made. If the answer is yes, I'm happy to work on a PR for it, so just let me know.

Best,
Sam

leidenbase (default) vs. igraph

library(tidyverse)
library(Seurat)
library(scCustomize)

pbmc <- pbmc3k.SeuratData::pbmc3k.final

pbmc <- UpdateSeuratObject(pbmc)

pbmc <- NormalizeData(object = pbmc) %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()

pbmc <- FindNeighbors(pbmc, dims = 1:15)

# leidenbase (current default)
pbmc <- FindClusters(object = pbmc, resolution = 0.5, algorithm = 4)
pbmc <- RunUMAP(pbmc, dims = 1:15)


# BPCells/igraph
bpcells_leiden <- data.frame(BPCells::cluster_graph_leiden(snn = pbmc@graphs$RNA_snn, resolution = 0.5, seed = 1))

colnames(bpcells_leiden) <- c("bpcells_igraph")

pbmc <- AddMetaData(pbmc, metadata = bpcells_leiden)

# Plot
DimPlot_scCustom(pbmc) + ggtitle("leidenbase")
DimPlot_scCustom(pbmc, group.by = "bpcells_igraph", label = TRUE)
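
Beyond eyeballing the UMAPs, the agreement between two sets of cluster labels can be quantified with a cross-tabulation. The following is only a minimal sketch in base R: `compare_clusterings` is a hypothetical helper (not part of Seurat or BPCells), and the toy label vectors stand in for `pbmc$seurat_clusters` and `pbmc$bpcells_igraph`.

```r
# Hypothetical helper (not part of Seurat or BPCells): cross-tabulate two
# clusterings of the same cells and summarise how well they line up.
compare_clusterings <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                 # rows: clusters in a, columns: clusters in b
  row_max <- apply(tab, 1, max)      # cells in each a-cluster's best-matching b-cluster
  list(
    table      = tab,
    best_match = row_max / rowSums(tab),  # per-cluster share in the best match
    overall    = sum(row_max) / length(a) # share of all cells in a best match
  )
}

# Toy labels standing in for the leidenbase and igraph cluster assignments
leidenbase_labels <- c(1, 1, 1, 2, 2, 3, 3, 3)
igraph_labels     <- c("a", "a", "b", "b", "b", "c", "c", "c")

res <- compare_clusterings(leidenbase_labels, igraph_labels)
print(res$table)
print(res$overall)  # 0.875 for the toy labels above
```

On the object above, the equivalent call would be `compare_clusterings(pbmc$seurat_clusters, pbmc$bpcells_igraph)`, assuming both columns are present in the metadata. The best-match share is a rough measure only; an adjusted Rand index would be a more standard choice if an extra package is acceptable.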

leidenalg pass through

# revert to v5.1.0 where reticulate leiden was the default
install.packages("~/Downloads/Seurat_5.1.0.tar.gz", type = "source", repos = NULL)

library(tidyverse)
library(Seurat)
library(scCustomize)

pbmc <- pbmc3k.SeuratData::pbmc3k.final

pbmc <- UpdateSeuratObject(pbmc)

pbmc <- NormalizeData(object = pbmc) %>%
  FindVariableFeatures() %>%
  ScaleData() %>%
  RunPCA()

pbmc <- FindNeighbors(pbmc, dims = 1:15)

# leidenalg pass through reticulate (default in v5.1.0)
pbmc <- FindClusters(object = pbmc, resolution = 0.5, algorithm = 4)
pbmc <- RunUMAP(pbmc, dims = 1:15)

# plot
DimPlot_scCustom(pbmc) +ggtitle("leiden_reticulate")

[Images: three UMAP plots (leidenbase, BPCells/igraph, leiden_reticulate)]
