Inconsistency among *read_assignments.tsv, *.transcript_counts.tsv, and *.gene_counts.tsv #290

Hi,

Thanks for developing this nice tool. I ran into an issue: the counts are inconsistent among the files listed in the title.
My command line is:
./IsoQuant-3.5.0/isoquant.py -d nanopore --stranded forward --fastq fastq1 fastq2 fastq3 --reference human_genome.fa --genedb human_genome.gtf --complete_genedb --output output --prefix human --labels a b c -t 10 --model_construction_strategy default_ont --clean_start --matching_strategy default --splice_correction_strategy default_ont --model_construction_strategy default_ont --transcript_quantification unique_only --gene_quantification unique_only

Take gene ENSG00000290425 (its isoforms include ENST00000652586 and ENST00000617243) as an example. I extracted all lines related to ENSG00000290425 from the read assignments file into the attachment below.

[ENSG00000290425.read_assignments.txt](https://github.com/user-attachments/files/18876424/ENSG00000290425.read_assignments.txt)

According to *.transcript_counts.tsv, the transcript counts for ENST00000652586 and ENST00000617243 are 36 and 4, respectively. According to *.gene_counts.tsv, the gene count for ENSG00000290425 is 146.
However, I get different values when I count the reads in the extracted read_assignments.txt file, as shown below:
grep ENST00000652586 ENSG00000290425.read_assignments.txt|awk '$6~/uniq/{print}'|wc -l (this results in 19)
grep ENST00000617243 ENSG00000290425.read_assignments.txt|awk '$6~/uniq/{print}'|wc -l (this results in 3)
grep ENSG00000290425 ENSG00000290425.read_assignments.txt|awk '$6~/uniq/{print}'|wc -l (this results in 35)
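
For a stricter cross-check than grep (which matches an ID anywhere on a line), the same counting could be done per column in Python. This is only a minimal sketch under assumptions: the awk commands above confirm that the assignment type is in column 6, but the positions of the read ID (column 1), isoform ID (column 4), and gene ID (column 5) are assumed here, and the helper name `count_unique_reads` and the sample rows are made up for illustration. It counts distinct read IDs per feature, so a read listed on several lines is counted once. It does not reproduce IsoQuant's own quantification logic.

```python
import csv
import io
from collections import defaultdict

# Assumed 0-based column positions. Only the assignment-type column (6th)
# is confirmed by the awk commands above; the others are assumptions.
READ_COL, ISOFORM_COL, GENE_COL, TYPE_COL = 0, 3, 4, 5


def count_unique_reads(handle, key_col):
    """Count distinct read IDs per feature among rows whose type contains 'uniq'."""
    reads_by_feature = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#") or len(row) <= TYPE_COL:
            continue  # skip blank lines, header/comment lines, and short rows
        if "uniq" in row[TYPE_COL]:  # same filter as awk '$6~/uniq/'
            reads_by_feature[row[key_col]].add(row[READ_COL])
    return {feature: len(reads) for feature, reads in reads_by_feature.items()}


# Made-up rows for illustration only, not taken from the attached file.
sample = (
    "#read_id\tchr\tstrand\tisoform_id\tgene_id\tassignment_type\n"
    "read1\tchr1\t+\tENST00000652586\tENSG00000290425\tunique\n"
    "read1\tchr1\t+\tENST00000652586\tENSG00000290425\tunique\n"
    "read2\tchr1\t+\tENST00000617243\tENSG00000290425\tunique_minor_difference\n"
    "read3\tchr1\t+\tENST00000652586\tENSG00000290425\tambiguous\n"
    "read4\tchr1\t+\t.\tENSG00000290425\tinconsistent\n"
)

transcript_counts = count_unique_reads(io.StringIO(sample), ISOFORM_COL)
gene_counts = count_unique_reads(io.StringIO(sample), GENE_COL)
print(transcript_counts)
print(gene_counts)
```

On the real file, `io.StringIO(sample)` would be replaced by `open("ENSG00000290425.read_assignments.txt", newline="")`. Comparing the per-column result with the grep-based numbers shows whether duplicated read IDs or matches in other columns contribute to the difference.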

This is very confusing. In fact, I checked 1490 isoforms: 1279 have consistent counts and 211 have inconsistent counts.

It would be great if IsoQuant could provide the exact read IDs that correspond to the counts in *.transcript_counts.tsv and *.gene_counts.tsv.

Looking forward to hearing from you. Thanks.

Best,
Aifu




