Determining ML score threshold for quantitative comparison of methylation levels across two datasets at the same loci

Hello!

I am attempting to compare DNA adenine methylation at the same loci across two datasets, using R10.4 data we generated under two experimental conditions. We expect the methylation levels to differ substantially between the two datasets, but we want a reasonably accurate quantitative estimate of the difference. In our analysis we apply an automatic threshold to determine "true" methylation calls.

I have been told that determining the optimal threshold is not trivial and is highly sensitive to sequencing run quality, and I have been recommended to use modkit's automatic threshold estimation. However, I am worried that this thresholding may be sensitive to the total signal in each dataset and could therefore distort comparisons between the datasets. Would it be more appropriate to threshold the data with a fixed threshold or with a data-informed threshold (specifically modkit's), especially when we expect large differences between the datasets?
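To make the concern concrete, here is a toy Python sketch of what I mean. It does not call modkit. It assumes a data-informed threshold works roughly by taking a low percentile of per-call confidence in each dataset separately (the 10% default below is my assumption and should be checked against the modkit documentation). The ML values are made up for illustration, and all function names are hypothetical.

```python
def ml_to_prob(ml):
    """Map an ML tag byte (0-255) to the midpoint of its probability bin."""
    return (ml + 0.5) / 256.0


def confidence(p_mod):
    """Confidence of a binary call: probability of the more likely class."""
    return max(p_mod, 1.0 - p_mod)


def percentile_threshold(p_mods, percentile=0.1):
    """Confidence value at the given low percentile of this dataset's calls.

    Assumed stand-in for a data-informed threshold, not modkit's actual code.
    """
    confs = sorted(confidence(p) for p in p_mods)
    idx = min(int(percentile * len(confs)), len(confs) - 1)
    return confs[idx]


def fraction_modified(p_mods, threshold):
    """Fraction of calls passing the confidence threshold that are modified."""
    passing = [p for p in p_mods if confidence(p) >= threshold]
    if not passing:
        return float("nan")
    return sum(p > 0.5 for p in passing) / len(passing)


# Made-up ML values at one locus: condition A mostly methylated, B mostly not.
ml_a = [254, 250, 245, 235, 220, 205, 180, 150, 12, 4]
ml_b = [2, 5, 8, 12, 25, 75, 115, 145, 228, 251]
probs_a = [ml_to_prob(m) for m in ml_a]
probs_b = [ml_to_prob(m) for m in ml_b]

# Data-informed: each dataset gets its own threshold.
auto_a = percentile_threshold(probs_a)
auto_b = percentile_threshold(probs_b)
diff_auto = fraction_modified(probs_a, auto_a) - fraction_modified(probs_b, auto_b)

# Fixed: the same confidence threshold is applied to both datasets.
FIXED = 0.8  # arbitrary illustrative value
diff_fixed = fraction_modified(probs_a, FIXED) - fraction_modified(probs_b, FIXED)

print(f"auto thresholds: A={auto_a:.3f}, B={auto_b:.3f}")
print(f"difference (auto):  {diff_auto:.3f}")
print(f"difference (fixed): {diff_fixed:.3f}")
```

In this toy case the two datasets end up with different per-dataset thresholds, and the estimated difference in methylation changes depending on which thresholding scheme is used. That is the distortion I am worried about.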

Determining ML score threshold for quantitative comparison of methylation levels across two datasets at the same loci #372