Bash parameters prefixed in front of commands by command_prepare are insufficient, leading to silent errors #26

@alexiswl

As seen in some of the errors in #23, bolt continues to run commands even after a previous command has failed.

In the snippet below, it looks like bcftools merge is called and prints its usage message instead of merging, yet the next command, bcftools sort, is run anyway.

Details
      "2025-12-08T03:45:58.053232052Z 2025-12-08 03:45:58,052 - bolt.util - INFO - Running bcftools merge...",
        "2025-12-08T03:45:58.053474427Z 2025-12-08 03:45:58,053 - bolt.util - INFO - Executing command: bcftools merge \\",
        "2025-12-08T03:45:58.053512058Z           -m all \\",
        "2025-12-08T03:45:58.053534348Z           -Oz \\",
        "2025-12-08T03:45:58.053553688Z           -o output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz \\",
        "2025-12-08T03:45:58.053573359Z           output/pcgr/pcgr_4/nosampleset.pcgr.grch38.pass.vcf.gz",
        "2025-12-08T03:45:58.062109158Z 2025-12-08 03:45:58,061 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.062187770Z 2025-12-08 03:45:58,062 - bolt.util - INFO - About:   Merge multiple VCF/BCF files from non-overlapping sample sets to create one multi-sample file.",
        "2025-12-08T03:45:58.062278831Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Note that only records from different files can be merged, never from the same file. For",
        "2025-12-08T03:45:58.062366943Z 2025-12-08 03:45:58,062 - bolt.util - INFO - \"vertical\" merge take a look at \"bcftools norm\" instead.",
        "2025-12-08T03:45:58.062465855Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Usage:   bcftools merge [options] <A.vcf.gz> <B.vcf.gz> [...]",
        "2025-12-08T03:45:58.062544646Z 2025-12-08 03:45:58,062 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.062638178Z 2025-12-08 03:45:58,062 - bolt.util - INFO - Options:",
        "2025-12-08T03:45:58.062751910Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --force-samples               Resolve duplicate sample names",
        "2025-12-08T03:45:58.062831162Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --print-header                Print only the merged header and exit",
        "2025-12-08T03:45:58.062936284Z 2025-12-08 03:45:58,062 - bolt.util - INFO - --use-header FILE             Use the provided header",
        "2025-12-08T03:45:58.063035006Z 2025-12-08 03:45:58,062 - bolt.util - INFO - -0  --missing-to-ref              Assume genotypes at missing sites are 0/0",
        "2025-12-08T03:45:58.063149228Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -f, --apply-filters LIST          Require at least one of the listed FILTER strings (e.g. \"PASS,.\")",
        "2025-12-08T03:45:58.063248850Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -F, --filter-logic x|+            Remove filters if some input is PASS (\"x\"), or apply all filters (\"+\") [+]",
        "2025-12-08T03:45:58.063348582Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -g, --gvcf -|REF.FA               Merge gVCF blocks, INFO/END tag is expected. Implies -i QS:sum,MinDP:min,I16:sum,IDV:max,IMF:max",
        "2025-12-08T03:45:58.063471664Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -i, --info-rules TAG:METHOD,..    Rules for merging INFO fields (method is one of sum,avg,min,max,join) or \"-\" to turn off the default [DP:sum,DP4:sum]",
        "2025-12-08T03:45:58.063585266Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -l, --file-list FILE              Read file names from the file",
        "2025-12-08T03:45:58.063685968Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -L, --local-alleles INT           EXPERIMENTAL: if more than <int> ALT alleles are encountered, drop FMT/PL and output LAA+LPL instead; 0=unlimited [0]",
        "2025-12-08T03:45:58.063772159Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -m, --merge STRING                Allow multiallelic records for <snps|indels|both|snp-ins-del|all|none|id>, see man page for details [both]",
        "2025-12-08T03:45:58.063877521Z 2025-12-08 03:45:58,063 - bolt.util - INFO - --no-index                    Merge unindexed files, the same chromosomal order is required and -r/-R are not allowed",
        "2025-12-08T03:45:58.063972873Z 2025-12-08 03:45:58,063 - bolt.util - INFO - --no-version                  Do not append version and command line to the header",
        "2025-12-08T03:45:58.064075205Z 2025-12-08 03:45:58,063 - bolt.util - INFO - -o, --output FILE                 Write output to a file [standard output]",
        "2025-12-08T03:45:58.064164837Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -O, --output-type u|b|v|z[0-9]    u/b: un/compressed BCF, v/z: un/compressed VCF, 0-9: compression level [v]",
        "2025-12-08T03:45:58.064266949Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -r, --regions REGION              Restrict to comma-separated list of regions",
        "2025-12-08T03:45:58.064364401Z 2025-12-08 03:45:58,064 - bolt.util - INFO - -R, --regions-file FILE           Restrict to regions listed in a file",
        "2025-12-08T03:45:58.064513103Z 2025-12-08 03:45:58,064 - bolt.util - INFO - --regions-overlap 0|1|2       Include if POS in the region (0), record overlaps (1), variant overlaps (2) [1]",
        "2025-12-08T03:45:58.064615915Z 2025-12-08 03:45:58,064 - bolt.util - INFO - --threads INT                 Use multithreading with <int> worker threads [0]",
        "2025-12-08T03:45:58.064719867Z 2025-12-08 03:45:58,064 - bolt.util - INFO - ",
        "2025-12-08T03:45:58.065711436Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Merged VCF written to: output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz",
        "2025-12-08T03:45:58.065780207Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Sorting merged VCF file...",
        "2025-12-08T03:45:58.065849648Z 2025-12-08 03:45:58,065 - bolt.util - INFO - Executing command: bcftools sort \\",
        "2025-12-08T03:45:58.065862558Z           -Oz \\",
        "2025-12-08T03:45:58.065869138Z           -o output/pcgr/nosampleset.pcgr.grch38.vcf.gz \\",
        "2025-12-08T03:45:58.065874879Z           output/pcgr/nosampleset.pcgr.grch38.unsorted.vcf.gz",

There are multiple commands likely affected by this, for example this command

When this code chunk is passed to util.execute_command, the command first goes through command_prepare, which prepends set -o pipefail to it. However, this means that neither set -o errexit (set -e) nor set -o nounset (set -u) is enabled, so a failing command or an unset variable does not stop the commands that follow.
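To illustrate the difference, here is a minimal sketch that runs the same two-line script under both prefixes. The `run_script` helper is hypothetical and only stands in for what command_prepare and execute_command do together; it is not bolt's actual code, and it assumes bash is available at /bin/bash.

```python
import subprocess


def run_script(script: str, prefix: str) -> subprocess.CompletedProcess:
    """Hypothetical stand-in for command_prepare + execute_command.

    Prepends the given `set` options to the script and runs it through bash.
    """
    return subprocess.run(
        f"{prefix}\n{script}",
        shell=True,
        executable="/bin/bash",  # assumption: bash is installed here
        capture_output=True,
        text=True,
    )


# The first command fails; the second should not run if errors are fatal.
script = "false\necho 'still running'"

lenient = run_script(script, "set -o pipefail")
strict = run_script(script, "set -euo pipefail")

# With only pipefail, the failure of `false` is ignored and the exit code
# of the last command (echo) is reported, so the error is silent.
print("pipefail only:", lenient.returncode, repr(lenient.stdout))

# With errexit as well, bash stops at `false` and returns a non-zero code.
print("set -euo pipefail:", strict.returncode, repr(strict.stdout))

# nounset turns a reference to an unset variable into an error as well.
unset = run_script('echo "$BOLT_DEMO_UNSET_VAR"', "set -euo pipefail")
print("unset variable:", unset.returncode)
```

With only pipefail set, the script reports success even though its first command failed, which matches the behaviour seen in the log above. Whether changing the prefix alone is enough also depends on execute_command checking the return code, which I have not verified.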

While I understand the benefits of shell=True in our subprocess.run call (it allows pipelines), it is not recommended because it can lead to shell injection vulnerabilities.
If you wish to continue using it, I would highly recommend first quoting the input variables with shlex (for example, shlex.quote).
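As a rough sketch of what quoting buys us, the example below passes a value containing shell metacharacters to a shell=True call, once unquoted and once through shlex.quote. The file name is made up for illustration, and echo stands in for a real tool such as bcftools.

```python
import shlex
import subprocess

# Hypothetical input value containing a shell metacharacter (;).
user_value = "sample.vcf.gz; echo INJECTED"

# Unquoted: the shell treats everything after ';' as a second command.
unsafe = subprocess.run(
    f"echo {user_value}",
    shell=True,
    capture_output=True,
    text=True,
)

# Quoted: shlex.quote wraps the value so the shell sees a single argument.
safe = subprocess.run(
    f"echo {shlex.quote(user_value)}",
    shell=True,
    capture_output=True,
    text=True,
)

print("unquoted:", unsafe.stdout.splitlines())
print("quoted:  ", safe.stdout.splitlines())
```

In the unquoted call the injected echo runs as its own command, whereas the quoted call prints the whole value as one literal string. Only the interpolated values should be quoted, not the full command line, otherwise intended pipes and redirections would be escaped too.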

Labels: bug (Something isn't working)

Type

No fields configured for Bug.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions