
How can I recover sequences from collapsed homologous regions? #38

@Axolotl233

Hello,

Thank you for developing such a good tool for polyploid genome assembly. I am currently assembling a tetraploid plant genome (AABB) with two closely related subgenomes, using HiFi and Hi-C data (no ONT). My goal is to obtain two complete haploid genomes (A1B1 and A2B2). Hifiasm works very well overall, but I noticed that some sequences in the unitigs are incomplete.

For example, sequences that should normally exist in four copies [A1, A2, B1, B2] end up with only two copies [A1, B1] or three copies [A1, A2, B1] in the final assembly results [p_utg], while the missing copies appear to be collapsed into the remaining sequences as chimeric regions (please see the figure I uploaded). A similar genome landscape was also reported in the newly assembled sweetpotato genome (Fig. 2c in https://doi.org/10.1038/s41477-025-02079-6).
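
To make the copy-number problem concrete, here is a minimal sketch of how collapsed unitigs might be flagged from read depth. It assumes that a unitig carrying n merged copies shows roughly n times the single-haplotype HiFi depth. The function names, the unitig labels, and the depth values (including the haploid depth of 30x) are hypothetical illustrations, not output from hifiasm or Cphasing.

```python
def collapsed_copies(unitig_depth: float, haploid_depth: float) -> int:
    """Estimate how many haplotype copies are merged into one unitig.

    Assumption: depth scales linearly with the number of collapsed copies.
    """
    if haploid_depth <= 0:
        raise ValueError("haploid_depth must be positive")
    # A unitig always represents at least one copy.
    return max(1, round(unitig_depth / haploid_depth))


def total_copies(depths: dict[str, float], haploid_depth: float) -> int:
    """Sum the estimated copies over all unitigs covering one homologous region."""
    return sum(collapsed_copies(d, haploid_depth) for d in depths.values())


# Hypothetical region where only two unitigs [A1, B1] were assembled,
# each at about twice the assumed haploid depth of 30x.
region = {"utg_A1": 62.0, "utg_B1": 58.0}
haploid_depth = 30.0

for name, depth in region.items():
    print(name, collapsed_copies(depth, haploid_depth))
print("total copies:", total_copies(region, haploid_depth))
```

In this made-up case each of the two unitigs would account for two copies, so the four expected copies [A1, A2, B1, B2] are all present in the reads but only two are represented as separate sequences.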

Image

I found that #21 provides an example of collapse, but I am not sure whether it applies to my case. I am also working on a genome with whole-chromosome collapses, and according to the user manual, the collapse subprogram is not suitable for handling collapses of entire chromosomes.

Thank you for your patience. My question is whether Cphasing could help improve genome assembly in these two situations. In addition, could you provide some suggestions on how to recover sequences from a completely collapsed chromosome?

Thank you
