BamParser.Parse() returns null objects in rare instances #41

@RoboStephen

Today I used dotnetbio to parse a BAM file. In rare cases (3 reads out of 1 million), BamParser.Parse() returned null rather than a SAMAlignedSequence object.

I stepped through in the debugger and I think these lines, in BamParser.cs, are the cause of my issue:
if (alignedSeq.RefEndPos - 1 < start && alignedSeq.RName != "*")
{
    return null;
}

Here I'm reading the whole BAM file, so start == 0. The affected read is unmapped, but it still has Pos set to 1 and RName set because its mate was mapped with Pos == 1. For this read, alignedSeq.RefEndPos is 0, so alignedSeq.RefEndPos - 1 is -1. That is less than start == 0, and RName is not "*", so the condition is true and we return null.

This one-line change fixes the bug, and I believe it's correct in general. I confirmed the unit tests still pass:
if (alignedSeq.RefEndPos - 1 < start && alignedSeq.CIGAR != "*")
{
    return null;
}
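
To make the difference between the two conditions concrete, here is a minimal, self-contained sketch. It does not use dotnetbio: ReadStub is a hypothetical stand-in that carries only the three fields the check reads, and the method names are made up for illustration. It also assumes the unmapped read has CIGAR set to "*", which is what the proposed fix relies on.

```csharp
using System;

// Hypothetical stand-in for the fields the check reads.
// This is NOT the real SAMAlignedSequence type from dotnetbio.
public sealed class ReadStub
{
    public string RName = "*";
    public string CIGAR = "*";
    public int RefEndPos;
}

public static class RegionFilterSketch
{
    // Condition as quoted from BamParser.cs in this issue:
    // skip the read (return null) when it ends before `start`, unless RName is "*".
    public static bool SkippedByOriginalCheck(ReadStub read, int start)
    {
        return read.RefEndPos - 1 < start && read.RName != "*";
    }

    // Proposed condition: same position test, but exempt reads whose CIGAR is "*".
    public static bool SkippedByProposedCheck(ReadStub read, int start)
    {
        return read.RefEndPos - 1 < start && read.CIGAR != "*";
    }
}
```

With start == 0 and a read like new ReadStub { RName = "chr1", CIGAR = "*", RefEndPos = 0 } (the reference name "chr1" is a placeholder), SkippedByOriginalCheck returns true because 0 - 1 < 0 and RName is set, while SkippedByProposedCheck returns false because CIGAR is "*". A mapped read that really ends before the requested start, for example CIGAR "50M" with RefEndPos 50 and start 100, is skipped by both versions.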
