Inconsistent FinalClonotypes between two sequencing platforms on the same sample #2076

@xin-bang

Hi MiXCR developers,

I noticed that when processing the same biological sample sequenced on two different platforms, the FinalClonotypes count (number of distinct clonotypes) differs significantly between the two results, even though the processing pipeline and parameters are the same.
This leads to concerns about how platform-specific differences affect MiXCR results, especially regarding UMI handling and clone assembly.

Questions / Clarifications Needed:

  1. Could MiXCR internals explain this difference?
  2. Since I’m using the same tag pattern, library, and features, is there any MiXCR parameter recommended for cross-platform consistency?
  3. How does MiXCR handle low-frequency UMIs or borderline alignments during clonotype assembly?
  4. Are there any best practices for assessing whether the difference is biological or technical (e.g., comparing CDR3 spectrum)?
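
For question 4, one possible check is to compare the two platforms' clonotype sets directly. The sketch below is only an illustration, not a MiXCR feature. It assumes each `exportClones` table has a CDR3 nucleotide column and a count column, taken here to be `nSeqCDR3` and `readCount` (check the actual header; with UMIs a molecule-count column may be the better choice). The helper names `load_clonotypes` and `overlap_summary` are hypothetical.

```python
import csv
import io


def load_clonotypes(handle, key_column="nSeqCDR3", count_column="readCount"):
    """Return {CDR3 sequence: summed count} from a tab-separated clone table.

    The column names are assumptions; adjust them to the real export header.
    """
    counts = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        key = row[key_column]
        counts[key] = counts.get(key, 0.0) + float(row[count_column])
    return counts


def overlap_summary(a, b):
    """Compare two clonotype tables keyed by CDR3 sequence."""
    shared = a.keys() & b.keys()
    union = a.keys() | b.keys()
    total_a = sum(a.values())
    total_b = sum(b.values())
    return {
        "shared": len(shared),
        "only_a": len(a) - len(shared),
        "only_b": len(b) - len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        # Fraction of each platform's counts carried by shared clonotypes.
        "shared_fraction_a": sum(a[k] for k in shared) / total_a if total_a else 0.0,
        "shared_fraction_b": sum(b[k] for k in shared) / total_b if total_b else 0.0,
    }


if __name__ == "__main__":
    # Toy tables standing in for the two platforms' exportClones outputs.
    p1 = io.StringIO("nSeqCDR3\treadCount\nAAA\t90\nCCC\t9\nGGG\t1\n")
    p2 = io.StringIO("nSeqCDR3\treadCount\nAAA\t80\nCCC\t20\n")
    print(overlap_summary(load_clonotypes(p1), load_clonotypes(p2)))
```

If most of each platform's counts fall in shared clonotypes while the platform-specific clonotypes are low-count, that would point toward a technical rather than biological source. This is a heuristic, not a definitive test.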

Thanks in advance for your help!

Here is more detailed information

1: Despite using identical pipelines and parameters, the FinalClonotypes count differs significantly between platform 1 and platform 2.

| Batch | Sample | TotalReads | AlignedReads | Overlapped | FinalClonotypes | TRA | TRB | TRG | IGH | IGK | IGL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Platform1 | JXSCD16_TCR_251228_subseq | 12871959 | 6628998 (51.5%) | 3153316 (24.5%) | 196902 | 80664 | 107357 | 6182 | | | |
| Platform1 | JXSCD17_TCR_251228_subseq | 12871959 | 6968575 (54.14%) | 3129433 (24.31%) | 103627 | 45942 | 54515 | 2638 | | | |
| Platform1 | JXSCD19_TCR_251228_subseq | 12871959 | 6303977 (48.97%) | 3964185 (30.8%) | 229681 | 97328 | 125793 | 4939 | 1 | | |
| Platform2 | JXCSD16_TCR_251228_subseq | 12871959 | 6098187 (47.38%) | 1721450 (13.37%) | 106918 | 51833 | 50417 | 2646 | | | |
| Platform2 | JXCSD17_TCR_251228_subseq | 12871959 | 4995587 (38.81%) | 2617537 (20.34%) | 57192 | 30383 | 25365 | 1077 | | | |
| Platform2 | JXCSD19_TCR_251228_subseq | 12871959 | 5316726 (41.3%) | 2563480 (19.92%) | 103200 | 54750 | 45117 | 2193 | | | D |
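
To see whether the lower alignment rate on Platform2 alone could explain the gap, the counts can be normalized to clonotypes per million aligned reads. This is a rough worked calculation using only the AlignedReads and FinalClonotypes values from the table above. The function name is hypothetical, and the normalization is a simple ratio, not a MiXCR metric.

```python
# (batch, sample, aligned_reads, final_clonotypes), copied from the table above.
ROWS = [
    ("Platform1", "JXSCD16_TCR_251228_subseq", 6628998, 196902),
    ("Platform1", "JXSCD17_TCR_251228_subseq", 6968575, 103627),
    ("Platform1", "JXSCD19_TCR_251228_subseq", 6303977, 229681),
    ("Platform2", "JXCSD16_TCR_251228_subseq", 6098187, 106918),
    ("Platform2", "JXCSD17_TCR_251228_subseq", 4995587, 57192),
    ("Platform2", "JXCSD19_TCR_251228_subseq", 5316726, 103200),
]


def clonotypes_per_million_aligned(aligned_reads, final_clonotypes):
    """Distinct clonotypes per one million aligned reads (simple ratio)."""
    return final_clonotypes / aligned_reads * 1_000_000


for batch, sample, aligned, clonotypes in ROWS:
    rate = clonotypes_per_million_aligned(aligned, clonotypes)
    print(f"{batch}\t{sample}\t{rate:.0f}")
```

By this measure Platform1 still yields more clonotypes per aligned read than Platform2 in all three sample pairs, so the lower alignment rate does not by itself account for the difference.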

2: I used the following MiXCR pipeline for both datasets:

Align with UMI tag pattern
mixcr align -t 8 --species hsa --assemble-clonotypes-by CDR3 --library imgt.202312-3.sv8 --rna -f -p generic-amplicon-with-umi --tag-pattern "^N{0:2}aagcagtggtatcaacgcagagt(UMI:N{14})tcttgggg(R1:*) \ ^(R2:*) || ^(R1:*) ^N{0:2}aagcagtggtatcaacgcagagt(UMI:N{14})tcttgggg(R2:*)" --floating-left-alignment-boundary --floating-right-alignment-boundary c --report ${sample}.align.report.txt --json-report ${sample}.align.report.json ${fq1} ${fq2} ${sample}.vdjca

RefineTagsAndSort
mixcr -Xmx40g refineTagsAndSort --report ${sample}.refineTags.report.txt ${sample}.vdjca ${sample}.refined.vdjca

Assemble clonotypes
mixcr assemble --assemble-clonotypes-by CDR3 -OassemblingFeatures="CDR3" -OseparateByV=true -OseparateByJ=true -OseparateByC=true --report ${sample}.assemble.report.txt ${sample}.refined.vdjca ${sample}.clns

Export clones
mixcr exportClones ${sample}.clns ${sample}.clones.tsv
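
Related to question 3, a quick look at how many exported clonotypes rest on a single molecule may show whether the extra clonotypes on one platform are mostly low-support. This is a minimal sketch, assuming the export contains a UMI count column named `uniqueMoleculeCount` (an assumption; check the header of `${sample}.clones.tsv`). The function name `singleton_summary` is hypothetical.

```python
import csv
import io


def singleton_summary(handle, count_column="uniqueMoleculeCount"):
    """Count clonotypes supported by at most one molecule.

    The column name is an assumption; adjust it to the real export header.
    """
    total = 0
    singletons = 0
    for row in csv.DictReader(handle, delimiter="\t"):
        total += 1
        if float(row[count_column]) <= 1:
            singletons += 1
    return {
        "clonotypes": total,
        "singletons": singletons,
        "singleton_fraction": singletons / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy table standing in for an exportClones output; for a real file use
    # open(path, newline="") with the actual path instead of StringIO.
    toy = io.StringIO("nSeqCDR3\tuniqueMoleculeCount\nAAA\t12\nCCC\t1\nGGG\t1\nTTT\t3\n")
    print(singleton_summary(toy))
```

A much higher singleton fraction on one platform would be consistent with platform-specific errors inflating the clonotype count, although it could also reflect genuine differences in sampling depth.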
