Parabricks BQSR runs without .tbi indexes (known_sites_indels_tbi not passed to PARABRICKS_FQ2BAM) #2193

@yambirj

Description of the bug

I get the following warnings in the .log file when running joint germline variant calling with sarek v3.8.1 and --aligner parabricks.

Here is an example:

[PB Warning ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/dbsnp_146.hg38.vcf.gz.tbi
  [PB Warning ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/dbsnp_146.hg38.vcf.tbi
  [PB Warning ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/dbsnp_146.hg38.vcf.gz.csi
  [PB Warning ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/dbsnp_146.hg38.vcf.csi

  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Homo_sapiens_assembly38.known_indels.vcf.gz.tbi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Homo_sapiens_assembly38.known_indels.vcf.tbi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Homo_sapiens_assembly38.known_indels.vcf.gz.csi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Homo_sapiens_assembly38.known_indels.vcf.csi
  
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.tbi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Mills_and_1000G_gold_standard.indels.hg38.vcf.tbi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Mills_and_1000G_gold_standard.indels.hg38.vcf.gz.csi
  [PB Warning  ][src/PBLocalFile.cpp:54] Could not find index file  .../nxf.hOmiwgq0sp/Mills_and_1000G_gold_standard.indels.hg38.vcf.csi

As a result, Parabricks loads the full VCFs into RAM --> memory grows with no alignment progress (progressMeter stuck at 0.0%) --> the process is killed with exit status 255.
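
The warnings above can be condensed to the list of affected VCFs. This is a minimal sketch, assuming the log is piped in on stdin; the function name pb_missing_index_vcfs is hypothetical and not part of sarek or Parabricks.

```shell
# Summarise "Could not find index file" warnings by VCF.
# Reads a log on stdin and prints each affected VCF once,
# with the directory and the .gz/.tbi/.csi suffixes stripped.
pb_missing_index_vcfs() {
    grep 'Could not find index file' \
        | awk '{ print $NF }' \
        | sed -E 's#.*/##; s#\.(tbi|csi)$##; s#\.gz$##' \
        | sort -u
}
```

For example, `pb_missing_index_vcfs < .command.log` run in the task work directory should list one line per known-sites VCF that Parabricks could not find an index for (the log file name is an assumption about the local setup).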

The iGenomes .tbi files exist and are correctly used by the GATK path, but not by the Parabricks path.
Evidence:

grep -n -B1 -A10 'FASTQ_PREPROCESS_PARABRICKS(\|FASTQ_PREPROCESS_GATK(\|BAM_BASERECALIBRATOR(' \
    $SAREK/workflows/sarek/main.nf
213- // PREPROCESSING WITH PARABRICKS
214: FASTQ_PREPROCESS_PARABRICKS(
215- input_fastq,
216- fasta,
217- index_alignment,
218- intervals_bed_combined,
219- known_sites_indels,
220- channel.value("cram"),
221- )
222-
223- // Gather preprocessing output
224- cram_variant_calling = channel.empty()
--
232- // PREPROCESSING
233: FASTQ_PREPROCESS_GATK(
234- input_fastq,
235- input_sample,
236- dict,
237- fasta,
238- fasta_fai,
239- index_alignment,
240- intervals_and_num_intervals,
241- intervals_for_preprocessing,
242- known_sites_indels,
243- known_sites_indels_tbi,
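
The same thing can be checked from the task side. The sketch below tests whether any of the four index names that appear in the warnings is present next to a staged VCF. The function name has_vcf_index is hypothetical, and the candidate list is taken only from the warning messages, not from the Parabricks documentation.

```shell
# Print the first index file found for a VCF and return 0;
# return 1 if none of the candidate names exists.
# Candidates mirror the names probed in the PB warnings:
#   <vcf>.tbi, <vcf without .gz>.tbi, <vcf>.csi, <vcf without .gz>.csi
has_vcf_index() {
    vcf=$1
    base=${vcf%.gz}   # strip a trailing .gz if present
    for idx in "$vcf.tbi" "$base.tbi" "$vcf.csi" "$base.csi"; do
        if [ -e "$idx" ]; then
            printf '%s\n' "$idx"
            return 0
        fi
    done
    return 1
}
```

Run inside the work directory of the failed task, a loop such as `for v in *.vcf.gz; do has_vcf_index "$v" >/dev/null || echo "no index staged: $v"; done` would be expected to report all three known-sites VCFs if the .tbi files are indeed not passed to the process.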

Command used and terminal output

nextflow run nf-core/sarek -r 3.8.1 \
    --input "$input_file" \
    --aligner parabricks \
    -profile singularity \
    --genome GATK.GRCh38 \
    -with-report \
    --trim_fastq \
    --three_prime_clip_r1 2 \
    --three_prime_clip_r2 2 \
    --clip_r1 2 \
    --clip_r2 2 \
    -c "$config" \
    --intervals "$intervals" \
    --tools 'haplotypecaller,manta,cnvkit' \
    --outdir "${dir}/sarek_output" \
    --igenomes_base <iGenomes> \
    --igenomes_ignore=false

Relevant files

No response

System information

nf-core/sarek version: 3.8.1

Container: Singularity

Parabricks: 4.6.0-1 (nf-core/modules)

Nextflow: 25.10.2

Hardware: HPC, L40S/A100 GPUs

Executor: PBS
