Description of the bug
The pipeline stops with the error message: A USER ERROR has occurred: Contig chr1_KI270706v1_random not present in reads sequence dictionary
I have downloaded the relevant igenomes data locally:
$ ls /data/ref/igenomes/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta
/data/ref/igenomes/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta
My test.csv file contains:
$ cat test.csv
sample,bam,bai
SLPD001,/datf/sl/users/ivana/BTB/SLPD001/wgs-b/151013_ST-E00269_0029_BHGC5YCCXX/P2233_137_S9_L005_R1_001.bam,/datf/sl/users/ivana/BTB/SLPD001/wgs-b/151013_ST-E00269_0029_BHGC5YCCXX/P2233_137_S9_L005_R1_001.bam.bai
I made sure that the offending contig is absent from the .bam file. (In my first attempt, with the contig still present, I got the same error.)
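For anyone reproducing this, a contig comparison of this kind can be sketched as below. It is only a minimal illustration, not part of the pipeline: the function name missing_contigs and the header file names are hypothetical, and it assumes both inputs are plain-text SAM-style headers with tab-separated @SQ lines. It prints every contig named in the first header that is not named in the second.

```shell
# Print contigs declared in header file $1 that are absent from header file $2.
# Both files are expected to contain SAM-style "@SQ<TAB>SN:<name><TAB>..." lines.
# (missing_contigs is a hypothetical helper written for this report.)
missing_contigs() {
    awk -F'\t' '
        $1 != "@SQ" { next }                      # ignore everything except @SQ lines
        {
            name = ""
            for (i = 2; i <= NF; i++)             # find the SN: field on this line
                if ($i ~ /^SN:/) name = substr($i, 4)
            if (name == "") next
            if (pass == 1) have[name] = 1         # first file read: record known contigs
            else if (!(name in have)) print name  # second file read: report unknown ones
        }
    ' pass=1 "$2" pass=2 "$1"
}
```

The two inputs could be produced with, for example, `samtools view -H P2233_137_S9_L005_R1_001.bam > bam_header.txt` and `grep '^@' genome.interval_list > intervals_header.txt`, then compared with `missing_contigs intervals_header.txt bam_header.txt` (the .txt names are placeholders).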
Command used and terminal output
$ nextflow run nf-core/createpanelrefs -r dev -profile singularity --input test.csv --tools germlinecnvcaller --genome GATK.GRCh38 --igenomes_base /data/ref/igenomes/ --outdir test_out
.......
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/createpanelrefs] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_CREATEPANELREFS:CREATEPANELREFS:GERMLINECNVCALLER_COHORT:GATK4_COLLECTREADCOUNTS (SLPD001)'
Caused by:
Process `NFCORE_CREATEPANELREFS:CREATEPANELREFS:GERMLINECNVCALLER_COHORT:GATK4_COLLECTREADCOUNTS (SLPD001)` terminated with an error exit status (2)
Command executed:
gatk --java-options "-Xmx9830M -XX:-UsePerfData" \
CollectReadCounts \
--input P2233_137_S9_L005_R1_001.bam \
--intervals genome.interval_list \
--output SLPD001.hdf5 \
--reference Homo_sapiens_assembly38.fasta \
--tmp-dir . \
--format HDF5 --imr OVERLAPPING_ONLY
cat <<-END_VERSIONS > versions.yml
"NFCORE_CREATEPANELREFS:CREATEPANELREFS:GERMLINECNVCALLER_COHORT:GATK4_COLLECTREADCOUNTS":
gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS
Command exit status:
2
Command output:
(empty)
Command error:
INFO: Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
Using GATK jar /opt/conda/share/gatk4-4.6.2.0-0/gatk-package-4.6.2.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx9830M -XX:-UsePerfData -jar /opt/conda/share/gatk4-4.6.2.0-0/gatk-package-4.6.2.0-local.jar CollectReadCounts --input P2233_137_S9_L005_R1_001.bam --intervals genome.interval_list --output SLPD001.hdf5 --reference Homo_sapiens_assembly38.fasta --tmp-dir . --format HDF5 --imr OVERLAPPING_ONLY
08:23:35.823 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/opt/conda/share/gatk4-4.6.2.0-0/gatk-package-4.6.2.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
08:23:36.177 INFO CollectReadCounts - ------------------------------------------------------------
08:23:36.182 INFO CollectReadCounts - The Genome Analysis Toolkit (GATK) v4.6.2.0
08:23:36.182 INFO CollectReadCounts - For support and documentation go to https://software.broadinstitute.org/gatk/
08:23:36.182 INFO CollectReadCounts - Executing as peter@monod33.mbb.ki.se on Linux v4.18.0-553.53.1.el8_10.x86_64 amd64
08:23:36.182 INFO CollectReadCounts - Java runtime: OpenJDK 64-Bit Server VM v17.0.11-internal+0-adhoc..src
08:23:36.183 INFO CollectReadCounts - Start Date/Time: December 10, 2025 at 8:23:35 AM GMT
08:23:36.183 INFO CollectReadCounts - ------------------------------------------------------------
08:23:36.183 INFO CollectReadCounts - ------------------------------------------------------------
08:23:36.184 INFO CollectReadCounts - HTSJDK Version: 4.2.0
08:23:36.184 INFO CollectReadCounts - Picard Version: 3.4.0
08:23:36.184 INFO CollectReadCounts - Built for Spark Version: 3.5.0
08:23:36.187 INFO CollectReadCounts - HTSJDK Defaults.COMPRESSION_LEVEL : 2
08:23:36.187 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
08:23:36.188 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
08:23:36.188 INFO CollectReadCounts - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
08:23:36.188 INFO CollectReadCounts - Deflater: IntelDeflater
08:23:36.188 INFO CollectReadCounts - Inflater: IntelInflater
08:23:36.188 INFO CollectReadCounts - GCS max retries/reopens: 20
08:23:36.188 INFO CollectReadCounts - Requester pays: disabled
08:23:36.189 INFO CollectReadCounts - Initializing engine
08:23:36.889 INFO FeatureManager - Using codec IntervalListCodec to read file file://genome.interval_list
08:23:45.352 INFO IntervalArgumentCollection - Processing 3043969085 bp from intervals
08:23:45.462 INFO CollectReadCounts - Done initializing engine
08:23:45.470 WARN CollectReadCounts - Sequence dictionary in BAM does not match the master sequence dictionary.
08:23:45.471 INFO CollectReadCounts - Collecting read counts...
08:23:45.471 INFO ProgressMeter - Starting traversal
08:23:45.472 INFO ProgressMeter - Current Locus Elapsed Minutes Reads Processed Reads/Minute
08:23:45.566 INFO CollectReadCounts - Shutting down engine
[December 10, 2025 at 8:23:45 AM GMT] org.broadinstitute.hellbender.tools.copynumber.CollectReadCounts done. Elapsed time: 0.17 minutes.
Runtime.totalMemory()=2013265920
***********************************************************************
A USER ERROR has occurred: Contig chr1_KI270706v1_random not present in reads sequence dictionary
Relevant files
nextflow.log
System information
Nextflow version 25.10.0
Almalinux 8
nf-core/createpanelrefs 1.0.0
createpanelrefs [thirsty_varahamihira] DSL2 - revision: ab14ab0 [dev]