Description
Motivation
We regularly run into a situation that leads to incorrectly ordered factor levels. We routinely read in aggregated outputs from cellranger, which adds -1, -2, etc. endings for each sample. CreateSeuratObject() conveniently has names.delim and names.field arguments that we can use to get these endings as orig.ident. However, with 10 or more samples the factor levels come out in the wrong order, because the endings are still character data when they are converted to a factor:
> aggr_data <- Read10X(data.dir = "results/cellranger/Aggregated/outs/count/filtered_feature_bc_matrix/")
> allcells <- CreateSeuratObject(counts = aggr_data,
+ names.delim = "-",
+ names.field = 2)
> levels(allcells$orig.ident)
[1] "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "2" "3" "4" "5" "6" "7"
[17] "8" "9"
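The same ordering can be reproduced without Seurat. A minimal sketch in base R, using made-up endings for 12 samples (the `suffixes` vector is illustrative, not taken from real data):

```r
# Hypothetical sample endings, as if extracted from barcodes like "...-10"
suffixes <- as.character(1:12)

# Character input: levels are sorted as text, so "10" comes before "2"
chr_levels <- levels(factor(suffixes))
print(chr_levels)

# Numeric input: levels are sorted by value, so "2" comes before "10"
num_levels <- levels(factor(as.numeric(suffixes)))
print(num_levels)
```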
Feature Description
This causes problems when trying to switch to sample IDs or other metadata from the aggregation.csv. The relevant code in CreateSeuratObject appears to be on lines 1349:1354 of https://github.com/satijalab/seurat-object/blob/main/R/seurat.R. Instead of
# Create identity classes
idents <- factor(x = unlist(x = lapply(
X = colnames(x = counts),
FUN = ExtractField,
field = names.field,
delim = names.delim
)))
You could check to see if they can be converted to numeric before factoring with something like:
# Create identity classes
idents <- unlist(x = lapply(
X = colnames(x = counts),
FUN = ExtractField,
field = names.field,
delim = names.delim
))
# Change to numeric if possible
if (!any(is.na(suppressWarnings(as.numeric(idents))))) {
idents <- as.numeric(idents)
}
idents <- factor(idents)
I'm not sure how this would interact with the checks on idents that follow, so I didn't submit it as a pull request.
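A variation on the idea above, as a rough sketch only: instead of converting the values to numeric before factoring, keep the original strings as labels and only order the levels numerically. This avoids changing labels such as "01" into "1". The helper name `numeric_aware_factor` is hypothetical and is not part of Seurat or SeuratObject.

```r
# Hypothetical helper (an assumption for illustration, not a SeuratObject function)
numeric_aware_factor <- function(x) {
  x <- as.character(x)
  lv <- unique(x)
  # Try to parse every distinct value as a number; failures become NA
  num <- suppressWarnings(as.numeric(lv))
  if (!anyNA(num)) {
    # Every value is numeric-like: order the levels by numeric value
    lv <- lv[order(num)]
  } else {
    # Otherwise fall back to the default text ordering
    lv <- sort(lv)
  }
  factor(x, levels = lv)
}

numeric_levels <- levels(numeric_aware_factor(c("10", "2", "1", "2")))
mixed_levels <- levels(numeric_aware_factor(c("b", "a", "10")))
print(numeric_levels)
print(mixed_levels)
```

Because the fallback branch uses the same text sort that factor() applies by default, identifiers that are not purely numeric should keep their current ordering.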
Alternatives
No response