Running Isoquant in transcripts mode from reads and from bams

Hi, 

Thanks for the active development of Isoquant! 
This is more of a request for documentation than a bug report. 

I'm testing the IsoQuant (version 3.7.1) feature that builds models from trusted transcripts. 
My goal is to understand:

- Whether it is better to provide a FASTA of the transcripts or to pre-map them
- Whether I should add a synthetic polyA to my transcripts


My workflow is the following. 
This is not a real use case but more of a control. 
1. I build models from ONT reads using IsoQuant in `ont` mode
2. I extract the FASTA sequences of the transcripts from the GTF
3. I add an artificial polyA tail to each transcript sequence (I also test the non-polyadenylated sequences, called "simple" below)

At this point, I have 11332 FASTA sequences that correspond to transcripts. 
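
For reference, step 3 can be sketched roughly as below. This is a minimal illustration, not the exact script used here: the function names (`read_fasta`, `add_polya`, `write_fasta`) and the tail length of 30 are assumptions chosen for the example, and it assumes the extracted sequences are already in transcript (sense) orientation so that the tail belongs at the 3' end.

```python
from typing import Iterable, Iterator, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(handle: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for raw in handle:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def add_polya(records: Iterable[Record], tail_len: int = 30) -> Iterator[Record]:
    """Append a synthetic polyA tail (tail_len is an arbitrary, assumed value)."""
    for header, seq in records:
        yield header, seq + "A" * tail_len


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

With hypothetical file names, this would be used as `write_fasta(add_polya(read_fasta(open("simple.fa"))), open("polyA.fa", "w"))`. Counting the records yielded by `read_fasta` is also a quick way to confirm the number of input sequences.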

From here, my workflow branches into mapping the sequences myself (a) or letting IsoQuant do the mapping (b). 
1a. I map the sequences (simple and polyA) with `minimap2 -ax splice -t 16 "$REF" "$READS" > "$SAM"`
2a. I run IsoQuant from these mappings `isoquant.py --reference "$REF" --bam "$BAM[simple,polyA]" --data_type transcripts`
or 
1b. I run IsoQuant from the sequences directly `isoquant.py --reference "$REF" --fastq "$FASTA[simple,polyA]" --data_type transcripts`

In the slide below, you can see the results (where I'm also testing `ont` and `pacbio` presets). 

<img width="1057" height="551" alt="Image" src="https://github.com/user-attachments/assets/53c5563e-ed36-4168-92a0-4e22696e2522" />

In brief, if I provide BAM files, there is no difference between polyA and simple. 
If I provide the transcript sequences, I do see a difference in the models. 
In all cases, I get fewer transcripts than I put in. 

My questions are:
- How, and at what stage, does IsoQuant handle polyA?
- Why don't I get the same number of transcripts as in the input, especially when a BAM is the input?
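
To narrow down the second question, one option is to compare intron chains between the input GTF and each output GTF and list the input transcripts that have no match. The sketch below is only a diagnostic under stated assumptions: the helper names (`intron_chains`, `lost_transcripts`) are made up for this example, it relies on standard 9-column GTF `exon` lines carrying a `transcript_id` attribute, and it does not attempt to reproduce IsoQuant's own logic. Mono-exonic transcripts all share an empty intron chain per chromosome and strand, so they are matched only loosely.

```python
import re
from collections import defaultdict

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def intron_chains(gtf_lines):
    """Map transcript_id -> (chrom, strand, ((intron_start, intron_end), ...))."""
    exons = defaultdict(list)
    location = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        exons[tid].append((int(fields[3]), int(fields[4])))
        location[tid] = (fields[0], fields[6])
    chains = {}
    for tid, coords in exons.items():
        coords.sort()
        # An intron spans from one exon's end + 1 to the next exon's start - 1.
        introns = tuple(
            (left_end + 1, right_start - 1)
            for (_, left_end), (right_start, _) in zip(coords, coords[1:])
        )
        chains[tid] = location[tid] + (introns,)
    return chains


def lost_transcripts(input_chains, output_chains):
    """Input transcript ids whose intron chain is absent from the output."""
    recovered = set(output_chains.values())
    return sorted(tid for tid, chain in input_chains.items() if chain not in recovered)
```

Comparing `len(input_chains)` with `len(set(input_chains.values()))` also shows how many input transcripts differ only at their ends, which is one possible (unconfirmed) reason for a lower output count.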

Thanks again for open-sourcing and maintaining IsoQuant. 

Fabio
