Hi MiXCR developers,
I noticed that when processing the same biological sample sequenced on two different platforms, the FinalClonotypes count (number of distinct clonotypes) differs significantly between the two results, even though the processing pipeline and parameters are the same.
This raises the question of how platform-specific differences affect MiXCR results, especially with regard to UMI handling and clone assembly.
Questions / Clarifications Needed:
- Could MiXCR internals explain this difference?
- Since I’m using the same tag pattern, library, and features, is there any MiXCR parameter recommended for cross-platform consistency?
- How does MiXCR handle low-frequency UMIs or borderline alignments during clonotype assembly?
- Are there any best practices for assessing whether the difference is biological or technical (e.g., comparing CDR3 spectrum)?
Thanks in advance for your help!
Here is more detailed information:
1: Despite using identical pipelines and parameters, the FinalClonotypes count differs significantly between Platform1 and Platform2.
| Batch | Sample | TotalReads | AlignedReads | Overlapped | FinalClonotypes | TRA | TRB | TRG | IGH | IGK | IGL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Platform1 | JXSCD16_TCR_251228_subseq | 12871959 | 6628998 (51.5%) | 3153316 (24.5%) | 196902 | 80664 | 107357 | 6182 | | | |
| Platform1 | JXSCD17_TCR_251228_subseq | 12871959 | 6968575 (54.14%) | 3129433 (24.31%) | 103627 | 45942 | 54515 | 2638 | | | |
| Platform1 | JXSCD19_TCR_251228_subseq | 12871959 | 6303977 (48.97%) | 3964185 (30.8%) | 229681 | 97328 | 125793 | 4939 | 1 | | |
| Platform2 | JXCSD16_TCR_251228_subseq | 12871959 | 6098187 (47.38%) | 1721450 (13.37%) | 106918 | 51833 | 50417 | 2646 | | | |
| Platform2 | JXCSD17_TCR_251228_subseq | 12871959 | 4995587 (38.81%) | 2617537 (20.34%) | 57192 | 30383 | 25365 | 1077 | | | |
| Platform2 | JXCSD19_TCR_251228_subseq | 12871959 | 5316726 (41.3%) | 2563480 (19.92%) | 103200 | 54750 | 45117 | 2193 | | | D |
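Since AlignedReads also differs between the platforms, part of the gap in FinalClonotypes may simply reflect sequencing depth after alignment. Below is a rough sketch of how the numbers in the table can be normalized before comparing them. The values are copied from the table; pairing the Platform1 and Platform2 rows by the numeric sample suffix (16, 17, 19) and the helper names are assumptions made for illustration, not anything produced by MiXCR.

```python
# (AlignedReads, FinalClonotypes) copied from the table above.
# Assumption: rows are paired across platforms by the numeric suffix 16/17/19.
TABLE = {
    "16": {"Platform1": (6628998, 196902), "Platform2": (6098187, 106918)},
    "17": {"Platform1": (6968575, 103627), "Platform2": (4995587, 57192)},
    "19": {"Platform1": (6303977, 229681), "Platform2": (5316726, 103200)},
}


def clonotypes_per_1k_aligned(aligned_reads, final_clonotypes):
    """Distinct clonotypes per 1,000 aligned reads."""
    return 1000.0 * final_clonotypes / aligned_reads


def summarize(table):
    """Return (sample, P1 rate, P2 rate, raw P2/P1 ratio, normalized P2/P1 ratio)."""
    rows = []
    for sample, platforms in sorted(table.items()):
        p1_rate = clonotypes_per_1k_aligned(*platforms["Platform1"])
        p2_rate = clonotypes_per_1k_aligned(*platforms["Platform2"])
        raw_ratio = platforms["Platform2"][1] / platforms["Platform1"][1]
        rows.append((sample, p1_rate, p2_rate, raw_ratio, p2_rate / p1_rate))
    return rows


for sample, p1_rate, p2_rate, raw_ratio, norm_ratio in summarize(TABLE):
    print(
        f"sample {sample}: P1={p1_rate:.2f} P2={p2_rate:.2f} clonotypes/1k aligned reads, "
        f"raw P2/P1={raw_ratio:.3f}, normalized P2/P1={norm_ratio:.3f}"
    )
```

If the normalized ratio stays well below 1, the difference is not explained by the number of aligned reads alone. With UMI data, the number of UMI groups per sample is probably a better denominator than reads, but that value is not in the table above.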
2: I used the following MiXCR pipeline for both datasets:
Align with UMI tag pattern
mixcr align -t 8 --species hsa --assemble-clonotypes-by CDR3 --library imgt.202312-3.sv8 --rna -f -p generic-amplicon-with-umi --tag-pattern "^N{0:2}aagcagtggtatcaacgcagagt(UMI:N{14})tcttgggg(R1:*) \ ^(R2:*) || ^(R1:*) ^N{0:2}aagcagtggtatcaacgcagagt(UMI:N{14})tcttgggg(R2:*)" --floating-left-alignment-boundary --floating-right-alignment-boundary c --report ${sample}.align.report.txt --json-report ${sample}.align.report.json ${fq1} ${fq2} ${sample}.vdjca
RefineTagsAndSort
mixcr -Xmx40g refineTagsAndSort --report ${sample}.refineTags.report.txt ${sample}.vdjca ${sample}.refined.vdjca
Assemble clonotypes
mixcr assemble --assemble-clonotypes-by CDR3 -OassemblingFeatures="CDR3" -OseparateByV=true -OseparateByJ=true -OseparateByC=true --report ${sample}.assemble.report.txt ${sample}.refined.vdjca ${sample}.clns
Export clones
mixcr exportClones ${sample}.clns ${sample}.clones.tsv
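To make the last question more concrete, this is the kind of check I have in mind for telling technical from biological differences: comparing the exported clone tables of the same sample from both platforms by CDR3 sequence. It is only a minimal sketch. The column names `nSeqCDR3` and `readCount` are assumptions about the `exportClones` output and may need adjusting (for UMI data a molecule-count column is likely more appropriate); the function names are hypothetical.

```python
import csv
from collections import Counter


def load_cdr3_counts(handle, seq_col="nSeqCDR3", count_col="readCount"):
    """Sum abundance per CDR3 nucleotide sequence from a tab-separated clone table.

    `handle` is an open text file. The default column names are assumptions
    about the exportClones output; pass other names if the header differs.
    """
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[row[seq_col]] += float(row[count_col])
    return counts


def overlap_stats(a, b):
    """Compare two CDR3 -> abundance mappings.

    Returns the number of shared CDR3s, the Jaccard index over distinct CDR3s,
    and the fraction of each table's total abundance carried by shared CDR3s.
    """
    shared = set(a) & set(b)
    union = set(a) | set(b)
    total_a = sum(a.values())
    total_b = sum(b.values())
    return {
        "shared": len(shared),
        "jaccard": len(shared) / len(union) if union else 0.0,
        "abundance_shared_a": sum(a[s] for s in shared) / total_a if total_a else 0.0,
        "abundance_shared_b": sum(b[s] for s in shared) / total_b if total_b else 0.0,
    }


# Example usage (file names are placeholders):
# with open("platform1.clones.tsv") as f1, open("platform2.clones.tsv") as f2:
#     print(overlap_stats(load_cdr3_counts(f1), load_cdr3_counts(f2)))
```

My expectation is that if most of the abundance sits in shared CDR3s while the Jaccard index is low, the platform-specific clonotypes are mostly low-frequency ones, which would point to a technical rather than biological cause. Please correct me if that reasoning is wrong.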