Description
Hi,
We are running MiXCR on TCR-enriched long-read data generated with ONT and have run into an issue. We processed the reads with Pychopper, then used cutadapt to trim the TSO/Read1 sequence and the barcodes/UMIs present at the start of the reads. Given the error rates of ONT sequencing, we wanted to correct the reads, which we did with isONcorrect. Without correction, MiXCR runs to completion; with the corrected files, we get the error shown below. We correct the cutadapt-trimmed reads and then, matching on read IDs, prepend the UMIs/barcodes back onto the start of the reads so they can be used by refineTagsAndSort in MiXCR.
For reference, the structure of the reads post correction and with the prepended sequence is:
^Read1primer_CELLBARCODE{16}_UMI{12}_Read.........
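For readers who want to reproduce the re-attachment step, here is a minimal sketch of how the prefix could be put back onto corrected reads by read ID. It is an illustration under stated assumptions, not the script actually used: the function name `prepend_tags`, the tuple layout of the records, and the placeholder quality character for the prepended bases are all hypothetical. The primer sequence is taken from the tag pattern in the commands further down.

```python
from typing import Dict, Iterable, Iterator, Tuple

# (read_id, sequence, quality) for one FASTQ record; layout is an assumption.
FastqRecord = Tuple[str, str, str]

# Read1 primer as it appears in the --tag-pattern used with MiXCR.
PRIMER = "CTACACGACGCTCTTCCGATCT"


def prepend_tags(
    corrected: Iterable[FastqRecord],
    tags_by_id: Dict[str, Tuple[str, str]],
    prefix_qual_char: str = "I",  # placeholder quality for the prepended bases
) -> Iterator[FastqRecord]:
    """Yield corrected reads with primer + cell barcode + UMI restored in front."""
    for read_id, seq, qual in corrected:
        tags = tags_by_id.get(read_id)
        if tags is None:
            continue  # no barcode/UMI was recorded for this read ID
        barcode, umi = tags
        if len(barcode) != 16 or len(umi) != 12:
            continue  # would not fit the CELL{16}/UMI{12} layout
        prefix = PRIMER + barcode + umi
        # Sequence and quality must stay the same length in a FASTQ record.
        yield read_id, prefix + seq, prefix_qual_char * len(prefix) + qual


if __name__ == "__main__":
    corrected_reads = [("read1", "ACGTACGT", "!!!!!!!!")]
    tags = {"read1": ("A" * 16, "C" * 12)}
    for rid, seq, qual in prepend_tags(corrected_reads, tags):
        print(rid, seq, qual)
```

One thing this sketch makes explicit is that reads without a recorded barcode/UMI are dropped rather than written with an empty prefix; whether that matches the real pipeline is an assumption.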
Actual Result
The alignment seems to run fine, but the second step, refineTagsAndSort, is where it errors out. Even if I manually run refineTagsAndSort with --dont-correct on the .vdjca file generated by the align step, I still get the same error shown below:
>>>>>>>>>>>>>>>>>>>>>>> mixcr align <<<<<<<<<<<<<<<<<<<<<<<
Running:
mixcr align -f --report ./OX6_LN_MiXCR_test.align.report.txt --json-report ./OX6_LN_MiXCR_test.align.report.json --threads 30 --preset local:10x-ont-modified-v1 --save-output-file-names ./OX6_LN_MiXCR_test.align.list.tsv --keep-non-CDR3-alignments --rna --species hsa --tag-pattern ^CTACACGACGCTCTTCCGATCT(CELL:N{16})(UMI:N{12})(R1:*) --set-whitelist CELL=file:/well/dong/users/uap089/SPTCRseq/SPTCR-Seq-Pipeline//Reference/Barcodes/visium_bc.tsv --not-aligned-R1 ./OX6_LN_MiXCR_test_not_aligned.fastq.gz /well/dong/users/uap089/SPTCRseq/Experiment2_Oct2025/OX6_LN_MiXCR_test/isOnCorrect_OX6_cutadapt/corrected_with_prefix.fq ./OX6_LN_MiXCR_test.alignments.vdjca
The following tags and their roles will be associated with each output alignment:
Payload tags: R1
Cell tags: CELL(SQ)
Molecule tags: UMI(SQ)
Alignment: 0%
Alignment: 10.1% ETA: 00:08:48
Alignment: 20.1% ETA: 00:07:25
Alignment: 30.2% ETA: 00:06:43
Alignment: 40.5% ETA: 00:05:06
Alignment: 50.6% ETA: 00:03:40
Alignment: 60.6% ETA: 00:02:13
Alignment: 70.6% ETA: 00:01:40
Alignment: 80.8% ETA: 00:01:09
Alignment: 90.8% ETA: 00:00:38
====================== report: align ======================
Analysis time: 11.92m
Total sequencing reads: 7206039
Successfully aligned reads: 1484064 (20.59%)
Coverage (percent of successfully aligned):
CDR3: 1200381 (80.88%)
FR3_TO_FR4: 999850 (67.37%)
CDR2_TO_FR4: 969084 (65.3%)
FR2_TO_FR4: 766117 (51.62%)
CDR1_TO_FR4: 723769 (48.77%)
VDJRegion: 644098 (43.4%)
Alignment failed: no hits (not TCR/IG?): 4564144 (63.34%)
Alignment failed: absent barcode: 1157831 (16.07%)
Overlapped: 0 (0%)
Overlapped and aligned: 0 (0%)
Overlapped and not aligned: 0 (0%)
Alignment-aided overlaps, percent of overlapped and aligned: 0 (NaN%)
No CDR3 parts alignments, percent of successfully aligned: 73657 (4.96%)
Partial aligned reads, percent of successfully aligned: 210026 (14.15%)
Realigned with forced non-floating bound: 0 (0%)
Realigned with forced non-floating right bound in left read: 0 (0%)
Realigned with forced non-floating left bound in right read: 0 (0%)
TRA chains: 1335 (0.09%)
TRA non-functional: 187 (14.01%)
TRB chains: 6451 (0.43%)
TRB non-functional: 1048 (16.25%)
TRD chains: 109 (0.01%)
TRD non-functional: 17 (15.6%)
TRG chains: 9082 (0.61%)
TRG non-functional: 35 (0.39%)
IGH chains: 80515 (5.43%)
IGH non-functional: 20620 (25.61%)
TRAD chains: 1837 (0.12%)
TRAD non-functional: 0 (0%)
IGK chains: 825456 (55.62%)
IGK non-functional: 190345 (23.06%)
IGL chains: 559279 (37.69%)
IGL non-functional: 85573 (15.3%)
Tag parsing report:
Execution time: 0ns
Total reads: 7206039
Matched reads: 6048208 (83.93%)
Projection +R1: 6048208 (83.93%)
For variant 0:
For projection +R1:
CELL:Left position: 22
UMI:Left position: 38
CELL:Right position: 38
R1:Left position: 50
UMI:Right position: 50
Variants: 0
Cost: 0
CELL length: 16
UMI length: 12
R1 length:
26~205: + 910772 (15.06%) = 910772 (15.06%)
206~342: + 908487 (15.02%) = 1819259 (30.08%)
343~454: + 914904 (15.13%) = 2734163 (45.21%)
455~555: + 911746 (15.07%) = 3645909 (60.28%)
556~693: + 908949 (15.03%) = 4554858 (75.31%)
694~32061: + 1493350 (24.69%) = 6048208 (100%)
>>>>>>>>>>>>>>>>> mixcr refineTagsAndSort <<<<<<<<<<<<<<<<<
Running:
mixcr refineTagsAndSort -f --report ./OX6_LN_MiXCR_test.refine.report.txt --json-report ./OX6_LN_MiXCR_test.refine.report.json ./OX6_LN_MiXCR_test.alignments.vdjca ./OX6_LN_MiXCR_test.refined.vdjca
Sorting will be applied to the following tags: CELL, UMI
The following whitelist will be used for CELL: WhitelistFromAddress....(really long output)
Initialization: progress unknown
Initialization: 27.5%
Initialization: 62.8% ETA: 00:00:01
Initialization: 100% ETA: 00:00:00
Writing CELL: 35.4%
Writing CELL: 76.4% ETA: 00:00:00
Processing UMI: 0.6%
Processing UMI: 11.3% ETA: 00:00:41
Processing UMI: 21.3% ETA: 00:00:39
Processing UMI: 32.6% ETA: 00:00:29
Processing UMI: 43.4% ETA: 00:00:26
Processing UMI: 54% ETA: 00:00:21
Processing UMI: 64.8% ETA: 00:00:16
Processing UMI: 75.5% ETA: 00:00:11
Processing UMI: 86.8% ETA: 00:00:05
Processing UMI: 98.7% ETA: 00:00:00
Filtering: progress unknown
Final sorting: 5%
Final sorting: 18% ETA: 00:00:06
Final sorting: 29.2% ETA: 00:00:06
Final sorting: 41.5% ETA: 00:00:04
Final sorting: 53.3% ETA: 00:00:04
Final sorting: 66% ETA: 00:00:02
Final sorting: 76.9% ETA: 00:00:02
Final sorting: 88.5% ETA: 00:00:00
Please copy the following information along with the stacktrace:
Version: 4.7.0; built=Wed Aug 07 20:19:48 BST 2024; rev=976ba14139; lib=repseqio.v5.1
OS: Linux
Java: 17.0.6
Abs path: /gpfs3/well/dong/users/uap089/SPTCRseq/Experiment2_Oct2025/OX6_LN_MiXCR_test/isOnCorrect_OX6_cutadapt/tmp
Cmd args: refineTagsAndSort -f --report ./OX6_LN_MiXCR_test.refine.report.txt --json-report ./OX6_LN_MiXCR_test.refine.report.json ./OX6_LN_MiXCR_test.alignments.vdjca ./OX6_LN_MiXCR_test.refined.vdjca
picocli.CommandLine$ExecutionException: Error while running command refineTagsAndSort com.milaboratory.mixcr.cli.TagCorrectionError: Error on tag correction of []
at com.milaboratory.mixcr.cli.Main.registerExceptionHandlers$lambda-17(SourceFile:420)
at picocli.CommandLine.execute(CommandLine.java:2088)
at com.milaboratory.mixcr.cli.Main.execute(SourceFile:105)
at com.milaboratory.mixcr.cli.CommandAnalyze$Cmd$PlanBuilder.executeSteps(SourceFile:543)
at com.milaboratory.mixcr.cli.CommandAnalyze$Cmd.run0(SourceFile:500)
at com.milaboratory.mixcr.cli.MiXCRCommand.run(SourceFile:37)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
at com.milaboratory.mixcr.cli.Main.registerLogger$lambda-32(SourceFile:539)
at picocli.CommandLine.execute(CommandLine.java:2078)
at com.milaboratory.mixcr.cli.Main.execute(SourceFile:105)
at com.milaboratory.mixcr.cli.Main.main(SourceFile:101)
Caused by: com.milaboratory.mixcr.cli.TagCorrectionError: Error on tag correction of []
at com.milaboratory.mixcr.cli.CommandRefineTagsAndSort$Cmd.run1(SourceFile:375)
at com.milaboratory.mixcr.cli.MiXCRCommandWithOutputs.run0(SourceFile:69)
at com.milaboratory.mixcr.cli.MiXCRCommand.run(SourceFile:37)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
at com.milaboratory.mixcr.cli.Main.registerLogger$lambda-32(SourceFile:539)
at picocli.CommandLine.execute(CommandLine.java:2078)
... 15 more
Caused by: java.lang.IllegalStateException: Can't assemble logarithmic histogram for data values less then or equal to zero
at com.milaboratory.mitool.refinement.gfilter.HistKt.collect(SourceFile:105)
at com.milaboratory.mitool.refinement.gfilter.GroupFilter$filter$metricValues$lambda$17$$inlined$doAfterLastOrClose$1.close(SourceFile:299)
at com.milaboratory.mitool.refinement.gfilter.GroupFilter$filter$5$createPort$lambda$11$$inlined$doAfterLastOrClose$1.take(SourceFile:275)
at cc.redberry.pipe.util.CountingOutputPort$Companion$wrap$1.take(CountingOutputPort.kt:35)
at com.milaboratory.o.FA.take(SourceFile:25)
at cc.redberry.pipe.util.Chunk.readChunk(Chunk.java:78)
at cc.redberry.pipe.CUtils$2.take(CUtils.java:169)
at cc.redberry.pipe.CUtils$2.take(CUtils.java:161)
at cc.redberry.pipe.blocks.O2ITransmitter.run(O2ITransmitter.java:63)
at java.base/java.lang.Thread.run(Thread.java:833)
I understand what the error message says literally, but I can't work out what it means in the context of this step.
Exact MiXCR commands
INPUT=$outfolder/corrected_with_prefix.fq
$mixcr_ex analyze local:10x-ont-modified-v2 \
  $INPUT \
  ./${SAMPLE_NAME} \
  --set-whitelist CELL=file:bc.tsv \
  --species hsa \
  --threads 30 \
  --keep-non-CDR3-alignments \
  --not-aligned-R1 ./${SAMPLE_NAME}_not_aligned.fastq.gz \
  --rna \
  --tag-pattern "^CTACACGACGCTCTTCCGATCT(CELL:N{16})(UMI:N{12})(R1:*)" \
  -f
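As a sanity check outside MiXCR, the tag pattern above can be approximated with an ordinary regular expression to confirm that the prepended prefix survived and to see how many distinct UMIs each cell barcode carries. This is only a sketch: the regex is an assumed, exact-match stand-in for the MiXCR pattern (MiXCR's own matching may behave differently), and the helper names `parse_tags` and `umis_per_cell` are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

# Assumed regex equivalent of
#   ^CTACACGACGCTCTTCCGATCT(CELL:N{16})(UMI:N{12})(R1:*)
# The primer is 22 nt, so CELL spans 22..38 and UMI spans 38..50,
# which agrees with the positions in the tag parsing report.
TAG_RE = re.compile(r"^CTACACGACGCTCTTCCGATCT([ACGTN]{16})([ACGTN]{12})(.*)$")


def parse_tags(seq: str) -> Optional[Tuple[str, str, str]]:
    """Return (cell, umi, payload) or None if the prefix is not matched exactly."""
    m = TAG_RE.match(seq)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


def umis_per_cell(seqs: Iterable[str]) -> Dict[str, int]:
    """Count distinct UMIs observed for each cell barcode."""
    umis: Dict[str, set] = {}
    for seq in seqs:
        parsed = parse_tags(seq)
        if parsed is None:
            continue  # read does not start with the expected prefix
        cell, umi, _payload = parsed
        umis.setdefault(cell, set()).add(umi)
    return {cell: len(found) for cell, found in umis.items()}


if __name__ == "__main__":
    primer = "CTACACGACGCTCTTCCGATCT"
    reads = [
        primer + "A" * 16 + "C" * 12 + "GGGG",
        primer + "A" * 16 + "G" * 12 + "TT",
        "GGGG",  # no prefix, ignored
    ]
    print(umis_per_cell(reads))
```

Comparing these counts between the uncorrected and corrected FASTQ files may help show whether the correction step changed the tag distribution, though that is a diagnostic suggestion rather than a known cause of the error.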
We are using a custom preset that was created for: #1681
Works:
ont-modified-v2.yaml
Doesn't work:
ont-modified-v1.yaml
To get it to work with the corrected reads, I had to change lines 179, 205, and 235 of the preset from 'true' to 'false'. With them left as 'true', MiXCR runs for the uncorrected reads but fails on the corrected ones; once they are changed to 'false', the command also runs for the corrected reads. Regardless of whether I do the correction with or without the barcodes/UMIs trimmed, I need those lines set to 'false' for it to run with the isONcorrect-corrected reads.
My issue is that I'm not sure what that part of the configuration does. Is it safe to simply set those lines to 'false'?
Any help would be much appreciated, and please do let me know if you require any further information!
Thanks