
Open issues #13

@yhoogstrate

  • Refactor extract_subnetworks
  • Track down the non-deterministic result glitch (probably an unsorted yield or loop somewhere)
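
A possible cause of the random result glitch, offered as an assumption and not a diagnosis: iterating over a `set` (or any container whose order depends on hashing) inside a generator. String hashes are randomized per Python process by default, so the iteration order of a set of tuples that contain strings can differ between runs. The sketch below uses hypothetical names (`iter_edges_sorted`, the `(chrom, pos, strand)` tuples) that are not taken from the code base; it only shows sorting on a fully specified key before yielding.

```python
def iter_edges_sorted(edges):
    """Yield edges in a reproducible order.

    `edges` is assumed (hypothetically) to be a set of
    (chrom, pos, strand) tuples; a plain `for edge in edges`
    would yield them in hash order, which may vary between runs.
    """
    # Sort on every field so that ties cannot reorder between runs.
    for edge in sorted(edges, key=lambda e: (e[0], e[1], e[2])):
        yield edge


edges = {("chr1", 100, "+"), ("chr1", 50, "-"), ("chr2", 10, "+")}
ordered = list(iter_edges_sorted(edges))
```

Running the tool twice with different `PYTHONHASHSEED` values and diffing the output is one way to check whether hash order is involved at all.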

Algorithmic challenges:

  • Fix alignment
  • Data structure / inserting the Graph
  • Pruning
    • Is double insert size currently possible? - if so, reduce it to the sum - fixed in 9679d29
  • Re-joining splice junctions
    • Currently they are first inserted into the main Graph, extracted after pruning and re-inserted in a second system. It makes sense to put them in a Graph in the first place, because for pruning it is necessary to use a genomic index. Suggested solution: use 2 graphs (one for SJ and one for other edges)
  • Extract subnetworks
    • Rename to something that includes by splice junctions
  • Merge overlapping subnetworks
    • Disable weird rnodes code and see if this can be taken up by next step(s)
  • Filtering / classification
    • Add more parameters, do data mining?
    • Distinguish pruned-by-SJ from normally pruned: normally pruned and discordant-only is a very rich source of information (S027 has 15 disco reads pruned normally)
  • Optional: try to align intronic and exonic bp
  • Output:
    • is_circular
    • sort by score
    • genome dist
    • Number of splice junctions
    • Test if merge_subnets also contributes to n_nodes and n_edges
    • disco / split read ratio
  • Use consistent variable for insert size
  • Figure out what happens in S055
    • Add offsets to all discordant_paired_ types
    • Add weight for _paired_s
  • Make real bi-directional edges (should reduce mem as well)
    • Start with three graphs: 1 positive, 1 negative strand and one for SJs (uni-strand)
  • Merge all the HTSeq indextree iterator functions into a few generic ones
  • Classification on discordant reads seems to fail if there is a junction from a starting exon (because transcription effectively starts there) - these are often intronic, and it is plausible that other sub-graphs to merge them with are present
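
The suggestion above to keep splice junctions in their own Graph, combined with the wish for real bi-directional edges, could be sketched roughly as follows. Everything here is hypothetical and not the existing implementation: the class names `EdgeGraph` and `SplitGraphs`, the edge-type strings, and the `(chrom, pos)` tuples are placeholders, and the genomic index needed for pruning is left out. Each edge is stored once under a canonical (sorted) key, so inserting it from either end increments the same entry.

```python
from collections import defaultdict


class EdgeGraph:
    """Hypothetical minimal graph: one entry per undirected edge."""

    def __init__(self):
        self.edges = defaultdict(int)  # canonical (pos1, pos2) -> weight

    def insert(self, pos1, pos2, weight=1):
        # Canonical key: smaller position first, so A-B and B-A
        # end up in the same entry instead of two directed copies.
        key = (pos1, pos2) if pos1 <= pos2 else (pos2, pos1)
        self.edges[key] += weight

    def neighbours(self, pos):
        # Iterate in sorted order to keep the output reproducible.
        for a, b in sorted(self.edges):
            if a == pos:
                yield b
            elif b == pos:
                yield a


class SplitGraphs:
    """Hypothetical wrapper: splice junctions and other edges kept apart."""

    def __init__(self):
        self.splice_junctions = EdgeGraph()
        self.other = EdgeGraph()

    def insert(self, pos1, pos2, edge_type, weight=1):
        if edge_type == "splice_junction":
            self.splice_junctions.insert(pos1, pos2, weight)
        else:
            self.other.insert(pos1, pos2, weight)


graphs = SplitGraphs()
graphs.insert(("chr1", 100), ("chr1", 500), "splice_junction")
graphs.insert(("chr1", 500), ("chr1", 100), "splice_junction")  # same edge, reversed
graphs.insert(("chr1", 120), ("chr2", 900), "discordant")
```

With this layout the splice junctions never need to be extracted from the main Graph and re-inserted after pruning; whether it also reduces memory in practice would have to be measured.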
