Validity of correlation between different RNA modification calls

Hello. I was reviewing some of my recent RNA004 direct RNA seq data to see what modifications I could call from them. I have six replicates here, and I called m6A, m5C, and pseudouridine for all of them. Then I ran modkit to generate a bedgraph for each, and compared them in IGV. I also remove samples from my bedgraph that have less than 10 total reads, to remove noise.

So for m6A, it would be:
> dorado basecaller sup,m5C,inosine_m6A,pseU ./pod5/ -r --min-qscore 8 --device cuda:0 --reference reference.fasta | samtools sort > sample.bam
> modkit pileup sample.bam --reference reference.fasta -t 18 --region REGION --max-depth 1000000 --bedgraph ./ --filter-theshold 0.9 --motif DRACH 2
> awk '$5>9' sample_a_DRACH2_positive.bedgraph > sample_m6A.bedgraph

and for pseudouridine, it would be:
> modkit pileup sample.bam --reference reference.fasta -t 18 --region REGION --max-depth 1000000 --bedgraph ./ --filter-theshold 0.9 --motif T 0
> awk '$5>9' sample_17802_T0_positive.bedgraph > sample_pseudouridine.bedgraph

This works well, but I noticed that on a couple of my DRACH motif sites, I see a pattern like this:

![Screenshot from 2024-10-14 17-31-47](https://github.com/user-attachments/assets/8eda8610-0140-441c-a824-7b027aa2e19b)

I've marked in black three DRACH sites at the top. This is followed by m6A calls in all 6 replicates, then m5C calls in all 6 replicates, and then pseudouridine calls in all 6 replicates.

In two of these three shown DRACH sites, the A, C, and U of the motif are being called with fairly high confidence as m6A, m5C, and pseudouridine, all in a row. This doesn't occur at every high-probability m6A DRACH site, but it occurs a few of them.

I guess what I'm wondering is if this is truly three modifications, right next to each other, or if the basecaller is detecting a current resistance difference in this area, and each of the individual modification-finding algorithms is believing that this modification is the one it is checking for, if that makes sense?

If it's the latter, is there something I can change in dorado or modkit to screen for this? I'd like to get as accurate a modification profile as possible.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Validity of correlation between different RNA modification calls #278

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Validity of correlation between different RNA modification calls #278

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions