Expression reference dataset giving high noise and tumor proportion #224

Hello,

I'm running numbat on bone marrow samples from subjects with a hematologic malignancy. As the normal reference expression, I am using a public dataset of the same tissue and technology type. I get a warning about high MSE, and the results show a very high proportion of tumor cells, far more than we would expect based on orthogonal measurements.

I'm wondering if anyone has ideas on how to improve the results with this reference expression set. I tried providing either the annotated cell types or the cluster IDs as groups, and both give similar results. My understanding is that numbat takes raw counts and that no normalization should be needed, but is there some pre-processing that would help my samples map better to this reference set? (Using harmony outside of numbat, my samples do map well to this public dataset.)
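To make the grouping idea concrete, here is a minimal Python sketch of pooling raw counts per annotated group and turning each group into a relative expression profile. It is an illustration only, not numbat's code or API: the function name `aggregate_reference`, the cell and gene names, and the toy counts are all hypothetical, and it assumes the reference is simply each gene's share of the group's total counts.

```python
from collections import defaultdict


def aggregate_reference(counts, annotations):
    """Pool raw counts per group and convert them to per-gene fractions.

    counts:      dict mapping cell ID -> dict of gene -> raw count
    annotations: dict mapping cell ID -> group label (cell type or cluster ID)
    Returns a dict mapping group -> dict of gene -> fraction of that group's counts.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts.items():
        group = annotations[cell]
        for gene, n in gene_counts.items():
            totals[group][gene] += n

    profiles = {}
    for group, gene_totals in totals.items():
        depth = sum(gene_totals.values())
        if depth == 0:
            # Skip groups with no counts at all rather than divide by zero.
            continue
        profiles[group] = {gene: n / depth for gene, n in gene_totals.items()}
    return profiles


# Hypothetical toy data: three cells, three genes, two annotated groups.
counts = {
    "cell1": {"CD34": 4, "HBB": 0, "CD3E": 0},
    "cell2": {"CD34": 2, "HBB": 2, "CD3E": 0},
    "cell3": {"CD34": 0, "HBB": 0, "CD3E": 8},
}
annotations = {"cell1": "HSC", "cell2": "HSC", "cell3": "T"}

ref = aggregate_reference(counts, annotations)
```

Swapping the cell-type labels for cluster IDs only changes the `annotations` mapping. That may be part of why both groupings give similar results when the clusters largely follow cell type.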

Could this be happening because these are bone marrow cells, which span a wide range of cell types?

I appreciate any advice or thoughts.
