Low accuracy on simulated reads and overlapping primary alignments #115

Hello,

I am comparing the performance of different mappers on long reads, and minigraph has unexpectedly low accuracy. I'm mapping 1 million simulated HiFi and R10 reads to the [HPRC](https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=pangenomes/freeze/freeze1/minigraph/) v1.0 chm13 minigraph graph with minigraph. Minigraph is about 1% less accurate than GraphAligner on the same graph, than minimap2 on chm13, and than other mappers on the minigraph-cactus graph. I expected minigraph to perform at least as well as minimap2. Do you see anything wrong with the way I processed the graph or ran minigraph?

To get the output GAF to work with the vg tools, I edited the GFA with
`sed -E 's/chr([0-9]*|X|Y|M)/CHM13#0#chr\1/g'` to change the reference names and
`sed -E 's/\ts([0-9]*)\t/\t\1\t/g'` to take the `s` out of the segment names.
I then ran minigraph with:
`minigraph --vc -N 0 -cx lr -t {threads} {input.gfa} {input.fastq} >{output.gaf}`
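
In case it helps to check the GFA edits, here is a minimal Python sketch of the same two substitutions. The regular expressions are copied from the sed expressions above (they use extended-regex grouping); the function name `rewrite_gfa_line` is only a placeholder for illustration and not part of any tool.

```python
import re

# Patterns copied from the two sed expressions in this post.
REF_NAME = re.compile(r"chr([0-9]*|X|Y|M)")      # chr1, chrX, ... -> CHM13#0#chr1, ...
SEGMENT_NAME = re.compile(r"\ts([0-9]*)\t")      # tab-delimited "s123" -> "123"


def rewrite_gfa_line(line: str) -> str:
    """Apply both renaming steps to one GFA line (hypothetical helper)."""
    line = REF_NAME.sub(r"CHM13#0#chr\1", line)
    line = SEGMENT_NAME.sub(r"\t\1\t", line)
    return line
```

Note that the second pattern only rewrites segment names that have a tab on both sides, so a name in the last column of a line would be left unchanged.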

This may be an unrelated issue, but I also noticed that minigraph produces multiple primary alignments for a read that sometimes overlap in the read or in the graph. I have attached an example of such a read: [S1_19235.gaf.txt](https://github.com/user-attachments/files/17361415/S1_19235.gaf.txt). As far as I understand, these cannot all be chimeric alignments, because some of them overlap in the read or the graph. Should some of them be considered secondary alignments?
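
To make "overlap in the read" concrete, here is a rough Python sketch of how such records could be flagged. It assumes the standard GAF column order (query name, query length, query start, query end, strand, path, ...) with 0-based, end-exclusive query coordinates. The function names are placeholders, and this is only one way to check, not how minigraph or vg classify alignments.

```python
from collections import defaultdict
from itertools import combinations


def parse_gaf_line(line):
    """Pull the read name, query interval and path out of one GAF record."""
    f = line.rstrip("\n").split("\t")
    return {"read": f[0], "qstart": int(f[2]), "qend": int(f[3]), "path": f[5]}


def query_overlap(a, b):
    """Number of read bases covered by both records (0 if they are disjoint)."""
    return max(0, min(a["qend"], b["qend"]) - max(a["qstart"], b["qstart"]))


def overlapping_pairs(lines):
    """Return (read, path_a, path_b, overlap_bp) for records of the same read
    whose query intervals overlap."""
    by_read = defaultdict(list)
    for line in lines:
        if line.strip():
            rec = parse_gaf_line(line)
            by_read[rec["read"]].append(rec)
    pairs = []
    for read, recs in by_read.items():
        for a, b in combinations(recs, 2):
            n = query_overlap(a, b)
            if n > 0:
                pairs.append((read, a["path"], b["path"], n))
    return pairs
```

Calling `overlapping_pairs(open("S1_19235.gaf.txt"))` on the attached file would list the record pairs in question. Overlap in the graph could be checked the same way using the path columns, but that needs the segment lengths from the GFA and is not shown here.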

Thanks!
Xian
