Mismatched UMI of a pair of reads #42

@litun-fkby

Hello, I ran into an error with gencore (version 0.17.2):
1 contigs in the bam file:
chr1: 7249 bp

Mismatched UMI of a pair of reads
Left:
0:0, M:0:0 TLEN:0 ID:0
E100030802L1C001R00101395324:umi_CA_NN
TTTTTTTTTTTTTTTTTTTTTATTTTTTTTTTATTATTAATATTATTATTAATTTTAAAATCACAACAAAATACAAAAAACAAAAAAAAAAA
GGGGGHHGGHIGGGGGHGGGIIGGGGGGGGGGIGGGGHGIHGGGGGGHGGGGGGGGIGGGGGGHGGGHIGIGGGGGIIGGHHGGGGHIGGHG
Right:
0:0, M:0:0 TLEN:0 ID:0
E100030802L1C004R03200085905:umi_CG_NN
TTAAAAAATTATAAAAAAAAAATAAAAAAATAAAAAAAATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
ERROR: The UMI of a read pair should be identical, but we got CA_ and CG_
[image: d4c5c8bb77b6f6844967b26514807c8]

However, when I look up these reads in the original BAM, each read pair appears to be correct, with both mates carrying the same UMI:
[image: 573b71043a38b526bd4de45b9a382a5]

It seems that gencore is treating two reads with different read names as one read pair.
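The manual check described above (records that share a read name should carry the same UMI) can also be scripted. The following is a minimal sketch, assuming read names follow the `<name>:umi_<UMI>` layout seen in the log above and that the input is SAM text, for example the output of `samtools view`. The function names are hypothetical, and this is not gencore's own pairing logic.

```python
from collections import defaultdict


def extract_umi(read_name, prefix="umi"):
    """Return the text after '<prefix>_' in a read name, or None if absent."""
    marker = prefix + "_"
    pos = read_name.rfind(marker)
    if pos < 0:
        return None
    return read_name[pos + len(marker):]


def base_name(read_name, prefix="umi"):
    """Return the read name with the ':<prefix>_<UMI>' suffix removed."""
    marker = ":" + prefix + "_"
    pos = read_name.rfind(marker)
    return read_name if pos < 0 else read_name[:pos]


def find_umi_mismatches(sam_lines, prefix="umi"):
    """Map each base read name to its UMIs when more than one UMI is seen."""
    umis_by_name = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        # QNAME is the first tab-separated column of a SAM record.
        qname = line.split("\t", 1)[0]
        umis_by_name[base_name(qname, prefix)].add(extract_umi(qname, prefix))
    return {name: umis for name, umis in umis_by_name.items() if len(umis) > 1}


if __name__ == "__main__":
    import sys

    # Example: samtools view input.bam | python check_umi.py
    for name, umis in find_umi_mismatches(sys.stdin).items():
        print(name, sorted(str(u) for u in umis))
```

If this prints nothing for the input BAM, every read name maps to a single UMI, which would support the idea that the mismatch comes from two differently named reads being paired rather than from the data itself.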

The command is:
gencore --umi_prefix=umi -s 3 --ref hg19.fasta --quit_after_contig 25 -i input.bam -o output.umi.bam

However, when I add the "-d" parameter, the error does not occur, so the two runs may be handling the reads differently:
gencore --umi_prefix=umi -d 0 -s 3 --ref hg19.fasta --quit_after_contig 25 -i input.bam -o output.umi.bam

Looking forward to your reply!
