Unexpected CIBERSORTx Output on TCGA RNA-seq Data (No Cell Type Detected, Correlation ~0) #6

@Poomlaaaa

Dear CIBERSORTx team,

I'm working with TCGA bulk RNA-seq data (TPM values) from both normal and tumour tissues, and running CIBERSORTx using the LM22 signature matrix with absolute mode and batch correction enabled.

However, I'm getting unexpected results:

Most cell types are returned as 0 or near 0 across all samples.

The P-values are mostly high (median ~0.56).

The correlation values are close to zero or even negative.

Here’s the summary of my results:

```r
summary(cibersort_results$P.value)
# Min: 0.13 | Median: 0.56 | Max: 0.91

summary(cibersort_results$Correlation)
# Min: -0.036 | Median: -0.001 | Max: 0.07
```

What I’ve Tried:
I’m using TPM expression values downloaded from TCGA (via GDC).

I have verified that the gene symbols are in HGNC format, without version numbers.

I made sure the mixture file and signature matrix were formatted correctly and aligned, but I found that only 345 genes are common to both.

Still, the output is not biologically meaningful: immune cells expected in tumour microenvironments are not detected, and the correlations remain very low.
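
For reference, a gene-overlap check of the kind described above can be sketched in R as follows. This is only a minimal illustration under stated assumptions: `mixture` and `signature` are toy stand-ins with genes as row names (not my actual TCGA or LM22 files), the Ensembl-style ID is a made-up placeholder, and `gene_overlap` is a hypothetical helper, not part of CIBERSORTx.

```r
# Toy stand-ins for the real inputs (hypothetical values, for illustration only).
# Rows are genes; columns are samples (mixture) or cell types (signature).
mixture <- matrix(
  c(10.2, 0.0, 5.1, 3.3,
     8.7, 1.2, 4.4, 2.9),
  nrow = 4,
  dimnames = list(
    c("CD8A", "ENSG00000000001.1", "MS4A1", "ACTB"),  # second ID is a placeholder
    c("sample_1", "sample_2")
  )
)

signature <- matrix(
  c(120, 5, 3,
      4, 2, 150),
  nrow = 3,
  dimnames = list(
    c("CD8A", "CD8B", "MS4A1"),
    c("T_cells_CD8", "B_cells_naive")
  )
)

# Hypothetical helper: summarise how many signature genes appear in the mixture.
gene_overlap <- function(mixture_genes, signature_genes) {
  shared  <- intersect(signature_genes, mixture_genes)
  missing <- setdiff(signature_genes, mixture_genes)
  list(
    n_shared     = length(shared),
    frac_shared  = length(shared) / length(signature_genes),
    missing      = missing,
    # Row names that still look like Ensembl gene IDs rather than gene symbols
    ensembl_like = grep("^ENSG[0-9]+", mixture_genes, value = TRUE)
  )
}

res <- gene_overlap(rownames(mixture), rownames(signature))
print(res$n_shared)      # number of signature genes found in the mixture
print(res$frac_shared)   # fraction of the signature covered
print(res$missing)       # signature genes absent from the mixture
print(res$ensembl_like)  # identifiers that may need mapping to symbols
```

Listing the missing signature genes and any remaining Ensembl-style identifiers may help show whether a low overlap comes from identifier mismatches or from genes that are truly absent from the mixture file.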

My Questions:
Is there a known issue when using TPM data from TCGA with CIBERSORTx or LM22?

Is there a better preprocessing approach I should try?

Could this be due to tissue-type incompatibility with LM22 (i.e., my samples are not blood-derived)?

Or is this likely an issue with the input matrix quality?

I would really appreciate any advice or help in resolving this. I can provide example input files or logs if that would be helpful.

Thanks a lot in advance!

Best regards,
Pumla
