Description
Hi,
Thanks for this handy tool!
I ran into a problem when using adjust-mods. My aim is to keep only the C+m modifications and ignore A+a.
For a given read, the MM tag is initially:
MM = C+h?,70,26,9,1,0,4,27,29,42,104,2,13,0,112,84,34,52,8,30,48,9,4,15,19,121,38,42,0,65,73,54,31,2,19,163,168,48,20,125,63,57,20,37,179,33,8,9,103,14,48,45,5,152,37,69,30,1,154,31,122,9,20,11,6,39,14,22,2;C+m?,70,26,9,1,0,4,27,29,42,104,2,13,0,112,84,34,52,8,30,48,9,4,15,19,121,38,42,0,65,73,54,31,2,19,163,168,48,20,125,63,57,20,37,179,33,8,9,103,14,48,45,5,152,37,69,30,1,154,31,122,9,20,11,6,39,14,22,2;A+a.,6,12,2,0,0[...]
When I run this command:
modkit adjust-mods --ignore a test.bam test_ignorea.bam
the MM tag becomes:
MM = C+h?,70,26,9,1,0,4,27,29,42,104,2,13,0,112,84,34,52,8,30,48,9,4,15,19,121,38,42,0,65,73,54,31,2,19,163,168,48,20,125,63,57,20,37,179,33,8,9,103,14,48,45,5,152,37,69,30,1,154,31,122,9,20,11,6,39,14,22,2;C+m?,70,26,9,1,0,4,27,29,42,104,2,13,0,112,84,34,52,8,30,48,9,4,15,19,121,38,42,0,65,73,54,31,2,19,163,168,48,20,125,63,57,20,37,179,33,8,9,103,14,48,45,5,152,37,69,30,1,154,31,122,9,20,11,6,39,14,22,2;A+A.;
What puzzles me is the A+A. entry that appears at the end of the MM tag. For other reads (here, all reads aligning to the top strand), the same A+A. entry appears at the beginning of the MM tag instead:
MM = A+A.;C+h?,28,72,23,16,62,66,20,4,9,22,33,7,89,35,84,105,1,13,5,111,46,37,65,0,1,7,2,36,120,7,46,59,9,2,50,10,21,72,57,99,35,6,115,7,16,26,20,69,51,9,9,7,15,11,63,10,48,71,0,45,12,1,13,86,24,14,1,0,10,6,8,6,0,19,138,230,39,52,38,7,203,31,94,23,30,12,6,80,25,4,17,34,29,3,51,9,38,39,91,145,5,66,12,9,1,1,11,16,15,88,0,24,41,1,19,37,23,15,1,8,75,32,59,12,19,45,7,8,16,10,0,134,23,32,15,39,16,59,20,53,22,42,15,8,6,6,16,43,6,41,17,51,19,29,5,12,59,57;C+m?,28,72,23,16,62,66,20,4,9,22,33,7,89,35,84,105,1,13,5,111,46,37,65,0,1,7,2,36,120,7,46,59,9,2,50,10,21,72,57,99,35,6,115,7,16,26,20,69,51,9,9,7,15,11,63,10,48,71,0,45,12,1,13,86,24,14,1,0,10,6,8,6,0,19,138,230,39,52,38,7,203,31,94,23,30,12,6,80,25,4,17,34,29,3,51,9,38,39,91,145,5,66,12,9,1,1,11,16,15,88,0,24,41,1,19,37,23,15,1,8,75,32,59,12,19,45,7,8,16,10,0,134,23,32,15,39,16,59,20,53,22,42,15,8,6,6,16,43,6,41,17,51,19,29,5,12,59,57;
I do not know whether this is intended. I only noticed the entry because it interferes with a downstream program I use (wgbstools). If it is a bug, I hope this report is useful; if it is intended, I would be glad to know why it is there!
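In case it helps anyone hitting the same downstream problem, here is a minimal sketch of a possible stopgap: dropping any MM entry that lists no positions, such as A+A., from the tag string. The function name `strip_empty_mm_entries` is hypothetical (it is not part of modkit), and the input strings are shortened, made-up examples in the same shape as the tags above, not the real reads. It assumes that an entry without positions contributes no values to the ML tag, so ML would not need to change. Writing the result back into a BAM file is not shown.

```python
def strip_empty_mm_entries(mm_value: str) -> str:
    """Return the MM tag value without entries that list no positions.

    Hypothetical helper, not part of modkit. An entry such as 'A+A.'
    has a base/code header but no comma-separated skip counts.
    """
    # Entries are ';'-terminated, so the final split element is empty.
    entries = [entry for entry in mm_value.split(";") if entry]
    # Keep only entries that carry at least one position after the header.
    kept = [entry for entry in entries if "," in entry]
    # Restore the trailing ';' on every remaining entry.
    return "".join(entry + ";" for entry in kept)


# Shortened illustrative tag values (assumed, not taken from real reads).
print(strip_empty_mm_entries("C+h?,70,26;C+m?,70,26;A+A.;"))  # C+h?,70,26;C+m?,70,26;
print(strip_empty_mm_entries("A+A.;C+m?,28,72;"))             # C+m?,28,72;
```

This only works on the tag text and does not explain why the entry is emitted, so it is a workaround on my side rather than a fix.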