Expression reference dataset giving high noise and tumor proportion #224

@katiedaco-diamondage

Hello,

I'm running numbat on bone marrow samples from subjects with a heme malignancy. As the normal reference expression, I am using a public dataset of the same tissue and technology type. I get a warning about high MSE, and the results show a very high proportion of tumor cells, far more than we would expect based on orthogonal measurements.

I'm wondering if anyone has ideas on how to improve the results with this reference expression set. I tried providing both the annotated cell types and the cluster IDs as groups, and both give similar results. My understanding is that numbat takes raw counts and that no normalization should be needed, but I'm wondering whether some pre-processing could make my samples map better to this reference set. (Using harmony outside of numbat, my samples do map well to this public dataset.)
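For illustration only, here is a minimal Python sketch of what "providing groups" amounts to conceptually: raw counts are summed across the cells in each group and scaled to per-group expression proportions, giving one reference profile per cell type or cluster. This is not numbat's own R code. The function name `aggregate_reference`, the group labels, and the toy counts are all hypothetical, and the real package may differ in its details.

```python
import numpy as np

def aggregate_reference(counts, labels):
    """Collapse a genes x cells raw count matrix into genes x groups proportions.

    Hypothetical helper for illustration; not part of numbat.
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    if counts.shape[1] != labels.shape[0]:
        raise ValueError("need exactly one group label per cell (column)")
    groups = sorted(set(labels.tolist()))
    profile = np.zeros((counts.shape[0], len(groups)))
    for j, g in enumerate(groups):
        # Sum raw counts over all cells carrying this group label
        summed = counts[:, labels == g].sum(axis=1)
        total = summed.sum()
        # Scale so each group's profile sums to 1 (left as zeros if the group is empty)
        profile[:, j] = summed / total if total > 0 else 0.0
    return groups, profile

# Toy data: 3 genes x 4 cells, with made-up group labels
counts = np.array([[4, 0, 1, 3],
                   [0, 2, 5, 1],
                   [6, 8, 0, 0]])
labels = ["HSC", "HSC", "Mono", "Mono"]
groups, profile = aggregate_reference(counts, labels)
```

Because each column is a proportion of that group's total counts, no separate per-cell normalization is applied in this sketch; differences in sequencing depth between groups are absorbed by the scaling step.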

Could this be happening because these are bone marrow cells, which span a wide range of cell types?

I appreciate any advice or thoughts.
