Dear CIBERSORTx team,
I'm working with TCGA bulk RNA-seq data (TPM values) from both normal and tumour tissues, and running CIBERSORTx using the LM22 signature matrix with absolute mode and batch correction enabled.
However, I'm getting unexpected results:
Most cell types are returned as 0 or near 0 across all samples.
The P-values are mostly high (median ~0.56).
The correlation values are close to zero or even negative.
Here’s the summary of my results (from R):
summary(cibersort_results$P.value)
Min: 0.13 | Median: 0.56 | Max: 0.91
summary(cibersort_results$Correlation)
Min: -0.036 | Median: -0.001 | Max: 0.07
What I’ve Tried:
I’m using TPM expression values downloaded from TCGA (via GDC).
I have verified that the gene symbols are in HGNC format, without version numbers.
I made sure the mixture file and signature matrix were formatted correctly and aligned. However, I found that only 345 genes are shared between the two.
Still, the output is not biologically meaningful — immune cells expected in tumour microenvironments are not detected, and the correlations remain very low.
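The gene-overlap check mentioned above can be sketched as follows. This is a minimal illustration in Python, assuming both the mixture file and the signature matrix are tab-delimited with a header row and gene symbols in the first column. The function names (`read_gene_ids`, `overlap_report`) and the tiny inline tables are made up for the example and are not part of CIBERSORTx.

```python
import csv
import io


def read_gene_ids(handle):
    """Return the set of gene identifiers found in the first column
    of a tab-delimited matrix, skipping the header row."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # header row holds sample or cell-type names
    return {row[0].strip() for row in reader if row and row[0].strip()}


def overlap_report(mixture_genes, signature_genes):
    """Summarise how many signature genes are present in the mixture."""
    shared = mixture_genes & signature_genes
    missing = signature_genes - mixture_genes
    n_sig = len(signature_genes)
    return {
        "n_signature": n_sig,
        "n_shared": len(shared),
        "fraction_shared": len(shared) / n_sig if n_sig else 0.0,
        "missing": sorted(missing),  # signature genes absent from the mixture
    }


# Made-up miniature inputs standing in for the real files (assumption:
# real files would be opened with open(path, newline="") instead).
mixture_txt = "Gene\tS1\tS2\nCD8A\t1.2\t0.4\nCD19\t0.0\t3.1\nACTB\t900\t850\n"
signature_txt = "Gene\tB\tT\nCD19\t5\t0\nCD8A\t0\t7\nMS4A1\t6\t0\n"

report = overlap_report(
    read_gene_ids(io.StringIO(mixture_txt)),
    read_gene_ids(io.StringIO(signature_txt)),
)
print(report)
```

Listing the missing signature genes, rather than only counting the shared ones, may help show whether the gap comes from alias or outdated symbols. LM22 is documented as containing 547 genes, so 345 shared genes would be roughly 63% of the signature.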
My Questions:
Is there a known issue when using TPM data from TCGA with CIBERSORTx or LM22?
Is there a better preprocessing approach I should try?
Could this be due to tissue type incompatibility with LM22 (i.e., my samples are not blood-derived)?
Or is this likely an issue with the input matrix quality?
I would really appreciate any advice or help in resolving this. I can provide example input files or logs if that would be helpful.
Thanks a lot in advance!
Best regards,
Pumla