Description
As I understand it, if one is interested only in 5mC vs. C (and not 5hmC), collapsing is performed by splitting the 5hmC probability equally between the two remaining states:
collapsing 'h', with 'm' and canonical options, half of the probability of 'h' will be added to both 'm' and 'C'.
This only works under the assumption that a 5hmC call is equally likely to be a misclassified 5mC or a misclassified canonical C. However, 5hmC calls are reported to be more likely false positives of 5mC than of canonical C (e.g., see section "The use of more comprehensive negative controls to account for confounding DNA modifications" in https://www.biorxiv.org/content/10.1101/2024.11.19.624260v1.full.pdf).
- Do I understand correctly that with Pcanonical = 0.45, P5mC = 0.35, P5hmC = 0.2, the base would be called canonical (assuming a threshold of 0.4)?
- Is there a smarter way, i.e. to first merge P5mC + P5hmC and then call based on the combined probability?
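To make the arithmetic in my example concrete, here is a minimal Python sketch of the two options as I understand them. It is not the tool's actual code: the function names (`collapse_redistribute`, `collapse_merge`, `call`) are hypothetical, and I am assuming the call is simply the most probable remaining state, accepted if its probability is at least the threshold.

```python
def collapse_redistribute(probs, drop="h"):
    """Drop one state and split its probability equally among the remaining states."""
    kept = {k: v for k, v in probs.items() if k != drop}
    share = probs[drop] / len(kept)
    return {k: v + share for k, v in kept.items()}


def collapse_merge(probs, src="h", dst="m"):
    """Drop `src` and add all of its probability to `dst`."""
    kept = {k: v for k, v in probs.items() if k != src}
    kept[dst] += probs[src]
    return kept


def call(probs, threshold=0.4):
    """Return the most probable state, or None if it is below the threshold (assumed rule)."""
    state, p = max(probs.items(), key=lambda kv: kv[1])
    return state if p >= threshold else None


# Probabilities from the example above: canonical C, 5mC ("m"), 5hmC ("h").
probs = {"C": 0.45, "m": 0.35, "h": 0.20}

redistributed = collapse_redistribute(probs)  # C ~ 0.55, m ~ 0.45
merged = collapse_merge(probs)                # C = 0.45, m ~ 0.55

print("redistribute:", redistributed, "->", call(redistributed))
print("merge:       ", merged, "->", call(merged))
```

Under these assumptions, equal redistribution calls the base canonical, while merging 5hmC into 5mC calls it modified, which is the difference I am asking about.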
Thanks!