Skip to content

Replacing igraph with own graph implementation #11

@nsapoval

Description

@nsapoval

Currently KOMB relies on igraph for k-core decomposition. Hence, although all the adjacency information for the graph construction is done in the functions Kgraph::readSAM, Kgraph::getEdgeInfo, and Kgraph::generateGraph we still need to create and initialize an igraph graph object. This step takes a significant amount of time, and the resulting graph data structure requires a large amount of RAM to be stored. This is limiting in the following ways:

  1. Large RAM usage prevents KOMB from being run on extremely large datasets (e.g. the whole human genome data as exemplified by HG002 300x Illumina data from GIAB consortium).
  2. Redundancy of converting already stored information into a different object comes at a large runtime cost. Current KOMB run on HG002 chr11 spends half of all processing time initializing the igraph object.
  3. Currently igraph only implements serial graph construction, and serial k-core decomposition. Recent work showed promising parallelization options for k-core decomposition, and hence it might be useful to implement these algorithms as an alternative for very large graphs.

Thus, it would be a good idea to implement an alternative graph data structure that is optimized for KOMB and can support parallel construction and k-core decomposition.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions