Merge pull request #1720 from RaheelSyedAhmed/bbtools-39.91

erinyoung · web-flow · commit 8a76fbdb468a · 2026-06-26T13:02:14.000-06:00
bbtools v39.91
diff --git a/README.md b/README.md
@@ -128,7 +128,7 @@ To learn more about the docker pull rate limits and the open source software pro
 | [bamtools](https://hub.docker.com/r/staphb/bamtools) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bamtools)](https://hub.docker.com/r/staphb/bamtools) | <details><summary>Click to see all versions</summary> <ul><li>[2.5.3](./build-files/bamtools/2.5.3/)</li></ul> </details> | https://github.com/pezmaster31/bamtools |
 | [bandage](https://hub.docker.com/r/staphb/bandage) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bandage)](https://hub.docker.com/r/staphb/bandage) | <details><summary>Click to see all versions</summary> <ul><li>[0.8.1](./build-files/bandage/0.8.1/)</li><li>[0.9.0](./build-files/bandage/0.9.0/)</li></ul> </details> | https://rrwick.github.io/Bandage/ |
 | [bandage-ng](https://hub.docker.com/r/staphb/bandage-ng) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bandage-ng)](https://hub.docker.com/r/staphb/bandage-ng) | <details><summary>Click to see all versions</summary> <ul><li>[2026.4.1](./build-files/bandage-ng/2026.4.1/)</li></ul> </details> | https://github.com/asl/BandageNG |
-| [BBTools](https://hub.docker.com/r/staphb/bbtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bbtools)](https://hub.docker.com/r/staphb/bbtools) | <details><summary>Click to see all versions</summary> <ul><li>[38.76](./build-files/bbtools/38.76/)</li><li>[38.86](./build-files/bbtools/38.86/)</li><li>[38.95](./build-files/bbtools/38.95/)</li><li>[38.96](./build-files/bbtools/38.96/)</li><li>[38.97](./build-files/bbtools/38.97/)</li><li>[38.98](./build-files/bbtools/38.98/)</li><li>[38.99](./build-files/bbtools/38.99/)</li><li>[39.00](./build-files/bbtools/39.00/)</li><li>[39.01](./build-files/bbtools/39.01/)</li><li>[39.06](./build-files/bbtools/39.06/)</li><li>[39.10](./build-files/bbtools/39.10/)</li><li>[39.13](./build-files/bbtools/39.13/)</li><li>[39.16](./build-files/bbtools/39.16/)</li><li>[39.23](./build-files/bbtools/39.23/)</li><li>[39.25](./build-files/bbtools/39.25/)</li><li>[39.33](./build-files/bbtools/39.33/)</li><li>[39.34](./build-files/bbtools/39.34/)</li><li>[39.38](./build-files/bbtools/39.38/)</li><li>[39.49](./build-files/bbtools/39.49/)</li><li>[39.60](./build-files/bbtools/39.60/)</li><li>[39.68](./build-files/bbtools/39.68/)</li><li>[39.75](./build-files/bbtools/39.75/)</li><li>[39.77](./build-files/bbtools/39.77/)</li><li>[39.81](./build-files/bbtools/39.81/)</li><li>[39.83](./build-files/bbtools/39.83/)</li><li>[39.84](./build-files/bbtools/39.84/)</li></ul></details> | https://bbmap.org/ |
+| [BBTools](https://hub.docker.com/r/staphb/bbtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bbtools)](https://hub.docker.com/r/staphb/bbtools) | <details><summary>Click to see all versions</summary> <ul><li>[38.76](./build-files/bbtools/38.76/)</li><li>[38.86](./build-files/bbtools/38.86/)</li><li>[38.95](./build-files/bbtools/38.95/)</li><li>[38.96](./build-files/bbtools/38.96/)</li><li>[38.97](./build-files/bbtools/38.97/)</li><li>[38.98](./build-files/bbtools/38.98/)</li><li>[38.99](./build-files/bbtools/38.99/)</li><li>[39.00](./build-files/bbtools/39.00/)</li><li>[39.01](./build-files/bbtools/39.01/)</li><li>[39.06](./build-files/bbtools/39.06/)</li><li>[39.10](./build-files/bbtools/39.10/)</li><li>[39.13](./build-files/bbtools/39.13/)</li><li>[39.16](./build-files/bbtools/39.16/)</li><li>[39.23](./build-files/bbtools/39.23/)</li><li>[39.25](./build-files/bbtools/39.25/)</li><li>[39.33](./build-files/bbtools/39.33/)</li><li>[39.34](./build-files/bbtools/39.34/)</li><li>[39.38](./build-files/bbtools/39.38/)</li><li>[39.49](./build-files/bbtools/39.49/)</li><li>[39.60](./build-files/bbtools/39.60/)</li><li>[39.68](./build-files/bbtools/39.68/)</li><li>[39.75](./build-files/bbtools/39.75/)</li><li>[39.77](./build-files/bbtools/39.77/)</li><li>[39.81](./build-files/bbtools/39.81/)</li><li>[39.83](./build-files/bbtools/39.83/)</li><li>[39.84](./build-files/bbtools/39.84/)</li><li>[39.91](./build-files/bbtools/39.91/)</li></ul></details> | https://bbmap.org/ |
 | [bcftools](https://hub.docker.com/r/staphb/bcftools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bcftools)](https://hub.docker.com/r/staphb/bcftools) | <details><summary>Click to see all versions</summary> <ul><li>[1.10.2](./build-files/bcftools/1.10.2/)</li><li>[1.11](./build-files/bcftools/1.11/)</li><li>[1.12](./build-files/bcftools/1.12/)</li><li>[1.13](./build-files/bcftools/1.13/)</li><li>[1.14](./build-files/bcftools/1.14/)</li><li>[1.15](./build-files/bcftools/1.15/)</li><li>[1.16](./build-files/bcftools/1.16/)</li><li>[1.17](./build-files/bcftools/1.17/)</li><li>[1.18](./build-files/bcftools/1.18/)</li><li>[1.19](./build-files/bcftools/1.19/)</li><li>[1.20](./build-files/bcftools/1.20/)</li><li>[1.20.c](./build-files/bcftools/1.20.c/)</li><li>[1.21](./build-files/bcftools/1.21/)</li><li>[1.22](./build-files/bcftools/1.22/)</li><li>[1.23](./build-files/bcftools/1.23/)</li><li>[1.23.1](./build-files/bcftools/1.23.1/)</li></ul> </details> | https://github.com/samtools/bcftools |
 | [bedtools](https://hub.docker.com/r/staphb/bedtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bedtools)](https://hub.docker.com/r/staphb/bedtools) | <details><summary>Click to see all versions</summary> <ul><li>[2.29.2](./build-files/bedtools/2.29.2/)</li><li>[2.30.0](./build-files/bedtools/2.30.0/)</li><li>[2.31.0](./build-files/bedtools/2.31.0/)</li><li>[2.31.1](./build-files/bedtools/2.31.1/)</li></ul> </details> | https://bedtools.readthedocs.io/en/latest/ <br/>https://github.com/arq5x/bedtools2 |
 | [bedder-rs](https://hub.docker.com/r/staphb/bedder-rs/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bedder-rs)](https://hub.docker.com/r/staphb/bedder-rs) | <details><summary>Click to see all versions</summary> <ul><li>[0.1.14](./build-files/bedder-rs/0.1.14/)</li></ul> </details> | https://brentp.github.io/bedder-docs/latest/ <br/>https://github.com/quinlan-lab/bedder-rs |
diff --git a/build-files/bbtools/39.91/Dockerfile b/build-files/bbtools/39.91/Dockerfile
@@ -0,0 +1,79 @@
+FROM staphb/samtools:1.23.1 AS samtools
+FROM staphb/htslib:1.23.1 AS htslib
+
+# As a reminder
+# https://github.com/StaPH-B/docker-builds/pull/925#issuecomment-2010553275
+# bbmap/docs/TableOfContents.txt lists additional dependencies
+
+FROM ubuntu:noble AS app
+
+ARG SAMBAMBAVER=1.0.1
+ARG BBTOOLSVER=39.91
+
+LABEL base.image="ubuntu:noble"
+LABEL dockerfile.version="1"
+LABEL software="BBTools"
+LABEL software.version=${BBTOOLSVER}
+LABEL description="A set of tools labeled as \"Bestus Bioinformaticus\""
+LABEL website="https://github.com/bbushnell/BBTools"
+LABEL documentation="https://bbmap.org/"
+LABEL license="https://github.com/bbushnell/BBTools/blob/master/license.txt"
+LABEL maintainer="Abigail Shockey"
+LABEL maintainer.email="abigail.shockey@slh.wisc.edu"
+LABEL maintainer2="Padraic Fanning"
+LABEL maintainer2.email="faninnpm AT miamioh DOT edu"
+
+RUN apt-get update && \
+    apt-get install --no-install-recommends -y \
+    openjdk-25-jre-headless \
+    pigz \
+    pbzip2 \
+    lbzip2 \
+    bzip2 \
+    libcurl4-gnutls-dev \
+    libdeflate-dev \
+    wget \
+    ca-certificates \
+    procps && \
+    apt-get autoclean && \
+    rm -rf /var/lib/apt/lists/*
+
+# copy samtools and htslib binaries, libraries, and headers to image
+COPY --from=samtools /usr/local/bin/*    /usr/local/bin/
+COPY --from=htslib   /usr/local/bin/*    /usr/local/bin/
+COPY --from=htslib   /usr/local/lib/     /usr/local/lib/
+COPY --from=htslib   /usr/local/include/ /usr/local/include/
+
+# download and install sambamba
+RUN wget -q https://github.com/biod/sambamba/releases/download/v${SAMBAMBAVER}/sambamba-${SAMBAMBAVER}-linux-amd64-static.gz && \
+    gzip -d sambamba-${SAMBAMBAVER}-linux-amd64-static.gz && \
+    mv sambamba-${SAMBAMBAVER}-linux-amd64-static /usr/local/bin/sambamba && \
+    chmod +x /usr/local/bin/sambamba
+
+# download and install bbtools
+RUN wget -q https://sourceforge.net/projects/bbmap/files/BBMap_${BBTOOLSVER}.tar.gz && \
+    tar -xzf BBMap_${BBTOOLSVER}.tar.gz && \
+    rm BBMap_${BBTOOLSVER}.tar.gz && \
+    mkdir /data
+
+
+ENV PATH=/bbmap/:$PATH \
+    LC_ALL=C
+
+SHELL ["/bin/bash", "-c"]
+
+CMD ["tail", "-n", "90", "/bbmap/docs/TableOfContents.txt"]
+
+WORKDIR /data
+
+# testing
+FROM app AS test
+
+WORKDIR /test
+
+RUN tail -n 90 /bbmap/docs/TableOfContents.txt
+
+# get test data and test one thing that uses samtools/sambamba
+RUN wget -q https://raw.githubusercontent.com/StaPH-B/docker-builds/master/tests/SARS-CoV-2/SRR13957123.primertrim.sorted.bam && \
+    streamsam.sh in='SRR13957123.primertrim.sorted.bam' out='test_SRR13957123.primertrim.sorted.fastq.gz' && \
+    test -f test_SRR13957123.primertrim.sorted.fastq.gz
diff --git a/build-files/bbtools/39.91/README.md b/build-files/bbtools/39.91/README.md
@@ -0,0 +1,119 @@
+# BBTools container
+
+Main tool: [BBTools](https://bbmap.org/)
+  
+Code repository: https://sourceforge.net/projects/bbmap/ and https://github.com/bbushnell/BBTools
+
+Additional tools:
+
+- samtools: 1.23.1
+- htslib: 1.23.1
+- sambamba: 1.0.1
+
+Basic information on how to use this tool:
+
+- executable: `*.sh`
+- help: Program descriptions and options are shown when running the shell scripts with no parameters.
+- version: `--version`
+- description: 
+> BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving.
+
+Additional information:
+
+| Script | Purpose | Comment |
+|--------|---------|---------|
+| **bbcms.sh** | Performs error correction using a Count-Min Sketch | Intended for metagenome assembly |
+| **bbcountunique.sh** | Counts unique kmers in reads | |
+| **bbduk.sh** | Trims, filters or masks reads using kmers | |
+| **bbmap.sh** | Splice-aware aligner for short reads | |
+| **bbmapskimmer.sh** | BBMap version designed for high levels of multimapping | |
+| **bbmask.sh** | Masks references based on various things, such as sequence complexity | |
+| **bbmerge.sh** | Merges overlapping paired reads | |
+| **bbmerge-auto.sh** | Same as bbmerge, but tries to allocate all memory on the node | Use this version for kmer operations like extend |
+| **bbnorm.sh** | Normalizes reads based on coverage | Mainly for use prior to single-cell assembly |
+| **bbsplit.sh** | BBMap version that maps to multiple references simultaneously | Intended for decontamination; similar to Seal |
+| **bbversion.sh** | Prints the version of BBTools | |
+| **bbwrap.sh** | Wraps BBMap to process many files using same reference | Saves time by loading the index only once |
+| **calctruequality.sh** | Allows recalibration of quality scores from mapped reads | This generates the correction matrix; BBDuk does the recalibration |
+| **callgenes.sh** | Fast prokaryotic gene caller | Integrated into BBSketch |
+| **callvariants.sh** | Fast variant caller | |
+| **callvariants2.sh** | Same as callvariants.sh with the "multisample" flag | |
+| **clumpify.sh** | Shrinks compressed fastq files, and can remove duplicate reads | Also supports error correction |
+| **comparesketch.sh** | Compares sketches locally, without using a sketch server | |
+| **crossblock.sh** | Alias for decontaminate.sh | |
+| **cutgff.sh** | Cuts out features defined by gff file | E.g., generates one fasta entry per gene from a gff and an assembly |
+| **cutprimers.sh** | Cuts out subregions of ribosomes | Mainly for 16S analysis |
+| **decontaminate.sh** | Pool-level decontamination for single-cell MDA-amplified genomes | |
+| **dedupe.sh** | Removes duplicate and fully-contained sequences | Can also be used to cluster 16S sequences |
+| **dedupe2.sh** | Version of dedupe that supports more hash keys for greater sensitivity | |
+| **dedupebymapping.sh** | Deduplicates reads based on mapping coordinates | |
+| **demuxbyname.sh** | Demultiplexes based on sequence headers | |
+| **filterbyname.sh** | Filters based on sequence headers | |
+| **filterbytaxa.sh** | Filters sequences based on taxonomic classification | Used with NCBI datasets |
+| **filterbytile.sh** | Removes reads that are in low quality areas on flowcell | |
+| **filterqc.sh** | Part of JGI's fastq filtering pipeline | |
+| **filtersam.sh** | Filters sam files to remove reads with multiple unsupported mismatches | Designed for NovaSeq |
+| **gitable.sh** | Used to process NCBI taxonomy data | |
+| **khist.sh** | Alias for bbnorm.sh with flags for making a kmer frequency histogram | |
+| **kmercountexact.sh** | Counts kmers and produces a histogram | Uses more memory than BBNorm but allows exact counts |
+| **kmercountmulti.sh** | Cardinality estimation over multiple kmer lengths | Uses LogLog; does not produce a histogram |
+| **mapPacBio.sh** | BBMap version designed for PacBio or Nanopore reads | Reads longer than 5kbp get broken into 5kbp shreds |
+| **mergesketch.sh** | Allows multiple sketches to be combined | |
+| **msa.sh** | Alignment tool | Used with cutprimers.sh to cut subsections out of 16S |
+| **mutate.sh** | Generates synthetic genomes by randomly mutating the input | |
+| **muxbyname.sh** | Multiplex multiple files, renaming sequences based on input file name | Opposite of demuxbyname.sh |
+| **partition.sh** | Splits a sequence file into multiple files | |
+| **pileup.sh** | Calculates coverage from sam files | |
+| **plotflowcell.sh** | Produces statistics about flowcell positions | |
+| **processhi-c.sh** | Custom trimming for Hi-C reads | In development |
+| **randomreads.sh** | Generates synthetic data from real genome reference | Highly customizable |
+| **readqc.sh** | Short read quality report | Alternative to fastqc |
+| **reformat.sh** | Converts sequence files to another format | Has many additional options, includes subsampling |
+| **rename.sh** | Renames sequences in various ways, such as adding a prefix | |
+| **repair.sh** | Fixes broken pairing in fastq files | |
+| **representative.sh** | Makes a smaller subset of a reference dataset by eliminating redundancy | Designed for use with BBSketch output |
+| **rqcfilter2.sh** | Filtering pipeline used at JGI | portal.nersc.gov/dna/microbial/assembly/bushnell/RQCFilterData.tar |
+| **seal.sh** | Counts kmer matches between query and reference sequences | |
+| **sendsketch.sh** | Fast taxonomic classifier using webservers at JGI | |
+| **shred.sh** | Breaks sequences into shorter, fixed-length pieces | |
+| **shuffle.sh** | Randomly reorders input file | Crashes if input doesn't fit in memory |
+| **shuffle2.sh** | Randomly reorders input file | Supports larger files, but output might be less random |
+| **sketch.sh** | Makes reference sketches on a per-TaxID basis | |
+| **sketchblacklist.sh** | Makes sketch blacklists of common kmers | |
+| **sortbyname.sh** | Sorts sequences by name, length, quality, taxa, and other things | |
+| **summarizequast.sh** | Generates box plots for multiple quast reports | |
+| **tadpipe.sh** | Preprocessing and assembly pipeline using tadpole | |
+| **tadpole.sh** | Fast short read assembler | |
+| **tadwrapper.sh** | Runs Tadpole with multiple kmer lengths to select the best assembly | |
+| **taxserver.sh** | Starts taxonomy and sketch servers | |
+| **testformat.sh** | Determines if file is fasta, fastq, interleaved, etc. by reading first few lines | |
+| **testformat2.sh** | Generates extensive statistics by reading the full file | |
+| **translate6frames.sh** | Translates nucleotide sequence into amino acid sequence in all frames | |
+| **vcf2gff.sh** | Converts vcf format to gff format | |
+
+
+Full documentation: https://bbmap.org/docs
+
+## Example Usage
+
+(adapted from `/bbmap/pipelines/covid/processCorona.sh`)
+
+Interleave a pair of FASTQ files for downstream processing:
+
+```text
+reformat.sh \
+    in1=${SAMPLE}_R1.fastq.gz \
+    in2=${SAMPLE}_R2.fastq.gz \
+    out=${SAMPLE}.fq.gz
+```
+
+Split into SARS-CoV-2 and non-SARS-CoV-2 reads:
+
+```text
+bbduk.sh ow -Xmx1g \
+    in=${SAMPLE}.fq.gz \
+    ref=REFERENCE.fasta \
+    outm=${SAMPLE}_viral.fq.gz \
+    outu=${SAMPLE}_nonviral.fq.gz \
+    k=25
+```
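
The two README steps can be chained from the host with a small wrapper. The following is a minimal sketch, not part of this PR: the image tag `staphb/bbtools:39.91`, the helper names (`bbtools_cmd`, `interleave_cmd`, `split_cmd`), and the `IMAGE` and `DATA_DIR` variables are all assumptions made for illustration. It only prints the `docker run` commands, so nothing is pulled or executed, and it uses one intermediate file name (`<sample>.fq.gz`) for both steps so the output of `reformat.sh` is what `bbduk.sh` reads.

```shell
#!/bin/sh
# Dry-run sketch: print the docker commands for the two README steps.
# Assumptions (not from the PR): the image tag and all helper names below.
set -eu

# Assumed image tag; adjust to whatever tag is actually published.
IMAGE="${IMAGE:-staphb/bbtools:39.91}"
# Host directory to mount at /data, the WORKDIR set in the Dockerfile.
DATA_DIR="${DATA_DIR:-$PWD}"

# Print one docker command line that would run a BBTools script in the container.
# The output is meant for inspection; arguments are not shell-quoted.
bbtools_cmd() {
    printf 'docker run --rm -v %s:/data %s' "$DATA_DIR" "$IMAGE"
    printf ' %s' "$@"
    printf '\n'
}

# Step 1: interleave <sample>_R1/_R2.fastq.gz into <sample>.fq.gz.
interleave_cmd() {
    bbtools_cmd reformat.sh \
        "in1=${1}_R1.fastq.gz" \
        "in2=${1}_R2.fastq.gz" \
        "out=${1}.fq.gz"
}

# Step 2: split the interleaved reads into matching (viral) and
# non-matching (nonviral) sets against a reference, using k=25.
split_cmd() {
    bbtools_cmd bbduk.sh ow -Xmx1g \
        "in=${1}.fq.gz" \
        "ref=${2}" \
        "outm=${1}_viral.fq.gz" \
        "outu=${1}_nonviral.fq.gz" \
        k=25
}

# Print both commands for a hypothetical sample name and reference file.
interleave_cmd sampleA
split_cmd sampleA REFERENCE.fasta
```

To actually run the steps, the printed lines can be reviewed and then pasted into a shell on a host where Docker is available and the input files sit in `DATA_DIR`.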