Low accuracy on simulated reads and overlapping primary alignments #115

@xchang1

Hello,

I am comparing the performance of different mappers on long reads, and minigraph has unexpectedly low accuracy. I'm mapping 1 million simulated HiFi and R10 reads to the HPRC v1.0 CHM13 minigraph graph. Minigraph is about 1% less accurate than GraphAligner on the same graph, minimap2 on CHM13, and other mappers on the minigraph-cactus graph. I expected minigraph to perform at least as well as minimap2. Do you see anything wrong with the way I processed the graph or ran minigraph?

To get the output GAF to work with the vg tools, I edited the GFA with two sed commands (extended regular expressions, hence -E):
sed -E 's/chr([0-9]*|X|Y|M)/CHM13#0#chr\1/g' to change the reference names, and
sed -E 's/\ts([0-9]*)\t/\t\1\t/g' to strip the leading s from the segment names.
I then ran minigraph with:
minigraph --vc -N 0 -cx lr -t {threads} {input.gfa} {input.fastq} >{output.gaf}
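
For anyone who wants to reproduce the GFA edits without sed, the two substitutions above can be sketched in Python. This is only an illustration of the same two regular expressions, not a different method; the function name rewrite_gfa_line is hypothetical. Python's re picks the first matching alternative rather than the longest one, but for these patterns the rewritten line comes out the same.

```python
import re

# Same patterns as the two sed commands above.
REF_NAME = re.compile(r"chr([0-9]*|X|Y|M)")   # reference contig names
SEG_NAME = re.compile(r"\ts([0-9]*)\t")       # segment IDs such as s123


def rewrite_gfa_line(line: str) -> str:
    """Apply both substitutions to one GFA line (hypothetical helper)."""
    # Prefix reference names: chr1 -> CHM13#0#chr1
    line = REF_NAME.sub(r"CHM13#0#chr\1", line)
    # Drop the leading "s" from tab-delimited segment names: s1 -> 1
    return SEG_NAME.sub(r"\t\1\t", line)


def rewrite_gfa(lines):
    """Yield rewritten lines from any iterable of GFA lines."""
    for line in lines:
        yield rewrite_gfa_line(line)
```

Note that both patterns are applied to the whole line, exactly as with sed, so any other field that happens to contain "chr" followed by a digit, X, Y, or M is rewritten as well.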

This may be an unrelated issue, but I also noticed that minigraph produces multiple primary alignments for a read that sometimes overlap in the read or in the graph. I attached an example of such a read (S1_19235.gaf.txt). As far as I understand, these cannot be chimeric alignments, because some of them overlap in the read or the graph. Should some of them be considered secondary alignments?
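
A minimal sketch of how such overlaps can be flagged in a GAF file, assuming the standard GAF column order (query name, query length, query start, query end, strand, path, ...). The names Aln, parse_gaf_line, and overlapping_pairs are hypothetical, and "overlap in the graph" is approximated here as two alignments of the same read sharing at least one segment, which ignores the offsets within a segment.

```python
import re
from itertools import combinations
from typing import NamedTuple


class Aln(NamedTuple):
    read: str
    q_start: int          # 0-based start on the read
    q_end: int            # end on the read (exclusive)
    segments: frozenset   # segment names visited by the path


def parse_gaf_line(line: str) -> Aln:
    """Parse the first six GAF columns of one alignment line."""
    f = line.rstrip("\n").split("\t")
    # Oriented paths look like ">s1>s2<s3"; otherwise column 6 is a single name.
    segs = frozenset(re.findall(r"[<>]([^<>]+)", f[5])) or frozenset([f[5]])
    return Aln(f[0], int(f[2]), int(f[3]), segs)


def overlapping_pairs(alns):
    """Return pairs of alignments of the same read that overlap.

    Each hit is (a, b, read_overlap_bp, shared_segments).
    """
    hits = []
    for a, b in combinations(alns, 2):
        if a.read != b.read:
            continue
        read_ov = max(0, min(a.q_end, b.q_end) - max(a.q_start, b.q_start))
        shared = a.segments & b.segments
        if read_ov > 0 or shared:
            hits.append((a, b, read_ov, sorted(shared)))
    return hits
```

Running overlapping_pairs over the parsed lines of a single read lists every pair that shares read bases or graph segments, which is the set of alignments I would expect to be either chimeric (no overlap) or secondary.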

Thanks!
Xian
