Confusing Clone Definition #644

@yuyuleung

Report

Hello,

I ran Scirpy on my BCR data to define BCR clonotypes under these rules: same V gene, same J gene, and 85% sequence similarity at the nucleotide level of the junction region. This is the code I used:

ir.pp.ir_dist(mdata, metric="normalized_hamming", cutoff=15, sequence="nt", histogram=False)
ir.tl.define_clonotype_clusters(
    mdata,
    sequence="nt",
    metric="normalized_hamming",
    receptor_arms=sirpy_receptor_arms,
    dual_ir="all",
    same_v_gene=True,
    same_j_gene=True,
    partitions="fastgreedy",
    key_added="clone_id_85_similarity",
)
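
To make the intended rule concrete, here is a minimal sketch of how I read it: two contigs belong together only if they share the V gene and J gene and their junctions have equal length and differ in at most 15% of positions. The helpers `normalized_hamming_percent` and `same_clonotype` are hypothetical names written for illustration and are not Scirpy functions; the sequences and gene names are made up.

```python
from typing import Optional


def normalized_hamming_percent(a: str, b: str) -> Optional[float]:
    """Percentage of mismatching positions, or None when lengths differ."""
    if len(a) != len(b):
        # Hamming distance is only defined for equal-length sequences.
        return None
    if not a:
        return 0.0
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def same_clonotype(rec_a: dict, rec_b: dict, cutoff: float = 15.0) -> bool:
    """Apply the rule: same V gene, same J gene, junction distance <= cutoff."""
    if rec_a["v_call"] != rec_b["v_call"] or rec_a["j_call"] != rec_b["j_call"]:
        return False
    dist = normalized_hamming_percent(rec_a["junction"], rec_b["junction"])
    return dist is not None and dist <= cutoff


# Hypothetical records, not taken from test.airr.tsv.
a = {"v_call": "IGHV1-2", "j_call": "IGHJ4", "junction": "ACGTACGTACGTACGTACGT"}
b = {"v_call": "IGHV1-2", "j_call": "IGHJ4", "junction": "ACGTACGTACGTACGTACAA"}
c = {"v_call": "IGHV3-23", "j_call": "IGHJ4", "junction": "ACGTACGTACGTACGTACGT"}
d = {"v_call": "IGHV1-2", "j_call": "IGHJ4", "junction": "ACGTACGTACGTACGTACGTACG"}

print(same_clonotype(a, b))  # 2 of 20 positions differ (10%)
print(same_clonotype(a, c))  # different V gene
print(same_clonotype(a, d))  # different junction length
```

Under this reading, only the first pair should ever end up in the same clonotype.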

However, I obtained a clonotype containing sequences with different junction lengths (both amino acid and nucleotide) and also different V genes.
Image

Here is the AIRR file of the incorrectly assigned clonotype (clone 24):
test.airr.tsv

I have also tried running the same code on only this subset of the AIRR file.

It is confusing that contigs with different V genes were not compared, yet they were assigned the same clone_id. I verified this in the distance matrix: the junctions of index 0 and 1 were not compared to those of index 2 and 3.
Image
Image
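
The mismatch can also be checked without the plots by listing every clone that contains more than one V gene or more than one junction length. This is a minimal sketch, assuming the table has the AIRR columns `v_call` and `junction` plus a `clone_id` column; the function name `inconsistent_clones`, the allele stripping, and the toy rows are all hypothetical and not taken from test.airr.tsv.

```python
from collections import defaultdict


def inconsistent_clones(rows, clone_col="clone_id"):
    """Return clones whose members disagree on V gene or junction length."""
    v_genes = defaultdict(set)
    lengths = defaultdict(set)
    for row in rows:
        clone = row[clone_col]
        # Assumption: compare at gene level, so drop any allele suffix ("*01").
        v_genes[clone].add(row["v_call"].split("*")[0])
        lengths[clone].add(len(row["junction"]))
    return {
        clone: {
            "v_genes": sorted(v_genes[clone]),
            "junction_lengths": sorted(lengths[clone]),
        }
        for clone in v_genes
        if len(v_genes[clone]) > 1 or len(lengths[clone]) > 1
    }


# Hypothetical rows for illustration only.
rows = [
    {"clone_id": "24", "v_call": "IGHV1-2*02", "junction": "ACGTACGTACGT"},
    {"clone_id": "24", "v_call": "IGHV3-23*01", "junction": "ACGTACGTACGTACG"},
    {"clone_id": "7", "v_call": "IGHV4-34*01", "junction": "ACGTACGTA"},
    {"clone_id": "7", "v_call": "IGHV4-34*02", "junction": "ACGTACGTT"},
]

print(inconsistent_clones(rows))
```

For real data, the rows could be read from the AIRR TSV with `csv.DictReader(handle, delimiter="\t")`, after adding the assigned clone ID as a column.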

Thank you for your help in debugging this issue.
Yuyu

Versions


Labels: bug
