I'm analyzing single-cell RNA-seq data in Seurat and I want to compare the global similarity between different genotypes/conditions by aggregating cells into pseudo-bulk per condition.
Currently, I aggregate the expression data (pseudo-bulk) for each condition, compute a distance matrix (Euclidean or correlation-based) between these pseudo-bulk profiles, and then visualize the result using MDS or a heatmap with hierarchical clustering.
Here is an example of my code:
# Aggregate expression data per condition
agg <- AggregateExpression(obj, group.by = "condition", return.seurat = TRUE)
mat <- GetAssayData(agg, slot = "data")
mat_t <- t(as.matrix(mat)) # conditions as rows
# Compute distance matrix
dist_mat <- dist(mat_t, method = "euclidean")
# MDS visualization
mds <- cmdscale(dist_mat, k = 2)
plot(
  mds,
  type = "n",
  main = "MDS of conditions (pseudo-bulk)",
  xlab = "Dimension 1",
  ylab = "Dimension 2"
)
text(mds, labels = rownames(mds), col = "blue", cex = 1.2)
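For the correlation-based distance and the heatmap/clustering view mentioned above, a minimal sketch could look like the following. It uses a made-up toy matrix (`toy_t`, with hypothetical condition names) in place of `mat_t` so that it runs on its own with base R. The choice of Pearson correlation and average linkage is an assumption for illustration, not a recommendation.

```r
# Toy stand-in for the pseudo-bulk matrix: conditions as rows, genes as columns.
# Condition names and values are invented for illustration only.
set.seed(1)
toy_t <- matrix(
  rnorm(4 * 200),
  nrow = 4,
  dimnames = list(c("WT", "KO1", "KO2", "Het"), paste0("gene", 1:200))
)

# cor() correlates columns, so transpose to put conditions in columns.
cor_mat <- cor(t(toy_t), method = "pearson")

# Correlation distance: 1 - correlation between condition profiles.
cor_dist <- as.dist(1 - cor_mat)

# Hierarchical clustering on the correlation distance (average linkage assumed).
hc <- hclust(cor_dist, method = "average")
plot(hc, main = "Clustering of conditions (pseudo-bulk)", xlab = "", sub = "")

# Heatmap of the pairwise correlations between conditions.
heatmap(cor_mat, symm = TRUE, main = "Correlation between conditions")
```

With the real data, `toy_t` would be replaced by `mat_t` from the code above.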
My questions:
Does this approach make sense for evaluating global similarity between conditions in single-cell/pseudo-bulk data?
Are there better or more robust alternatives or best practices in the Seurat community for this kind of assessment?
Is it preferable to use PCA/MDS on principal components, or should I work directly with the aggregated expression matrix (and with which normalization)?
Thanks a lot for any advice, references, or practical examples!