Commit 8a76fbd

Merge pull request #1720 from RaheelSyedAhmed/bbtools-39.91
bbtools v39.91
2 parents 62bdad6 + d97ef42 commit 8a76fbd

3 files changed

Lines changed: 199 additions & 1 deletion

README.md

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ To learn more about the docker pull rate limits and the open source software pro
 | [bamtools](https://hub.docker.com/r/staphb/bamtools) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bamtools)](https://hub.docker.com/r/staphb/bamtools) | <details><summary>Click to see all versions</summary> <ul><li>[2.5.3](./build-files/bamtools/2.5.3/)</li></ul> </details> | https://github.com/pezmaster31/bamtools |
 | [bandage](https://hub.docker.com/r/staphb/bandage) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bandage)](https://hub.docker.com/r/staphb/bandage) | <details><summary>Click to see all versions</summary> <ul><li>[0.8.1](./build-files/bandage/0.8.1/)</li><li>[0.9.0](./build-files/bandage/0.9.0/)</li></ul> </details> | https://rrwick.github.io/Bandage/ |
 | [bandage-ng](https://hub.docker.com/r/staphb/bandage-ng) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bandage-ng)](https://hub.docker.com/r/staphb/bandage-ng) | <details><summary>Click to see all versions</summary> <ul><li>[2026.4.1](./build-files/bandage-ng/2026.4.1/)</li></ul> </details> | https://github.com/asl/BandageNG |
-| [BBTools](https://hub.docker.com/r/staphb/bbtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bbtools)](https://hub.docker.com/r/staphb/bbtools) | <details><summary>Click to see all versions</summary> <ul><li>[38.76](./build-files/bbtools/38.76/)</li><li>[38.86](./build-files/bbtools/38.86/)</li><li>[38.95](./build-files/bbtools/38.95/)</li><li>[38.96](./build-files/bbtools/38.96/)</li><li>[38.97](./build-files/bbtools/38.97/)</li><li>[38.98](./build-files/bbtools/38.98/)</li><li>[38.99](./build-files/bbtools/38.99/)</li><li>[39.00](./build-files/bbtools/39.00/)</li><li>[39.01](./build-files/bbtools/39.01/)</li><li>[39.06](./build-files/bbtools/39.06/)</li><li>[39.10](./build-files/bbtools/39.10/)</li><li>[39.13](./build-files/bbtools/39.13/)</li><li>[39.16](./build-files/bbtools/39.16/)</li><li>[39.23](./build-files/bbtools/39.23/)</li><li>[39.25](./build-files/bbtools/39.25/)</li><li>[39.33](./build-files/bbtools/39.33/)</li><li>[39.34](./build-files/bbtools/39.34/)</li><li>[39.38](./build-files/bbtools/39.38/)</li><li>[39.49](./build-files/bbtools/39.49/)</li><li>[39.60](./build-files/bbtools/39.60/)</li><li>[39.68](./build-files/bbtools/39.68/)</li><li>[39.75](./build-files/bbtools/39.75/)</li><li>[39.77](./build-files/bbtools/39.77/)</li><li>[39.81](./build-files/bbtools/39.81/)</li><li>[39.83](./build-files/bbtools/39.83/)</li><li>[39.84](./build-files/bbtools/39.84/)</li></ul></details> | https://bbmap.org/ |
+| [BBTools](https://hub.docker.com/r/staphb/bbtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bbtools)](https://hub.docker.com/r/staphb/bbtools) | <details><summary>Click to see all versions</summary> <ul><li>[38.76](./build-files/bbtools/38.76/)</li><li>[38.86](./build-files/bbtools/38.86/)</li><li>[38.95](./build-files/bbtools/38.95/)</li><li>[38.96](./build-files/bbtools/38.96/)</li><li>[38.97](./build-files/bbtools/38.97/)</li><li>[38.98](./build-files/bbtools/38.98/)</li><li>[38.99](./build-files/bbtools/38.99/)</li><li>[39.00](./build-files/bbtools/39.00/)</li><li>[39.01](./build-files/bbtools/39.01/)</li><li>[39.06](./build-files/bbtools/39.06/)</li><li>[39.10](./build-files/bbtools/39.10/)</li><li>[39.13](./build-files/bbtools/39.13/)</li><li>[39.16](./build-files/bbtools/39.16/)</li><li>[39.23](./build-files/bbtools/39.23/)</li><li>[39.25](./build-files/bbtools/39.25/)</li><li>[39.33](./build-files/bbtools/39.33/)</li><li>[39.34](./build-files/bbtools/39.34/)</li><li>[39.38](./build-files/bbtools/39.38/)</li><li>[39.49](./build-files/bbtools/39.49/)</li><li>[39.60](./build-files/bbtools/39.60/)</li><li>[39.68](./build-files/bbtools/39.68/)</li><li>[39.75](./build-files/bbtools/39.75/)</li><li>[39.77](./build-files/bbtools/39.77/)</li><li>[39.81](./build-files/bbtools/39.81/)</li><li>[39.83](./build-files/bbtools/39.83/)</li><li>[39.84](./build-files/bbtools/39.84/)</li><li>[39.91](./build-files/bbtools/39.91/)</li></ul></details> | https://bbmap.org/ |
 | [bcftools](https://hub.docker.com/r/staphb/bcftools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bcftools)](https://hub.docker.com/r/staphb/bcftools) | <details><summary>Click to see all versions</summary> <ul><li>[1.10.2](./build-files/bcftools/1.10.2/)</li><li>[1.11](./build-files/bcftools/1.11/)</li><li>[1.12](./build-files/bcftools/1.12/)</li><li>[1.13](./build-files/bcftools/1.13/)</li><li>[1.14](./build-files/bcftools/1.14/)</li><li>[1.15](./build-files/bcftools/1.15/)</li><li>[1.16](./build-files/bcftools/1.16/)</li><li>[1.17](./build-files/bcftools/1.17/)</li><li>[1.18](./build-files/bcftools/1.18/)</li><li>[1.19](./build-files/bcftools/1.19/)</li><li>[1.20](./build-files/bcftools/1.20/)</li><li>[1.20.c](./build-files/bcftools/1.20.c/)</li><li>[1.21](./build-files/bcftools/1.21/)</li><li>[1.22](./build-files/bcftools/1.22/)</li><li>[1.23](./build-files/bcftools/1.23/)</li><li>[1.23.1](./build-files/bcftools/1.23.1/)</li></ul> </details> | https://github.com/samtools/bcftools |
 | [bedtools](https://hub.docker.com/r/staphb/bedtools/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bedtools)](https://hub.docker.com/r/staphb/bedtools) | <details><summary>Click to see all versions</summary> <ul><li>[2.29.2](./build-files/bedtools/2.29.2/)</li><li>[2.30.0](./build-files/bedtools/2.30.0/)</li><li>[2.31.0](./build-files/bedtools/2.31.0/)</li><li>[2.31.1](./build-files/bedtools/2.31.1/)</li></ul> </details> | https://bedtools.readthedocs.io/en/latest/ <br/>https://github.com/arq5x/bedtools2 |
 | [bedder-rs](https://hub.docker.com/r/staphb/bedder-rs/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bedder-rs)](https://hub.docker.com/r/staphb/bedder-rs) | <details><summary>Click to see all versions</summary> <ul><li>[0.1.14](./build-files/bedder-rs/0.1.14/)</li></ul> </details> | https://brentp.github.io/bedder-docs/latest/ <br/>https://github.com/quinlan-lab/bedder-rs |
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+FROM staphb/samtools:1.23.1 AS samtools
+FROM staphb/htslib:1.23.1 AS htslib
+
+# As a reminder
+# https://github.com/StaPH-B/docker-builds/pull/925#issuecomment-2010553275
+# bbmap/docs/TableOfContents.txt lists additional dependencies
+
+FROM ubuntu:noble AS app
+
+ARG SAMBAMBAVER=1.0.1
+ARG BBTOOLSVER=39.91
+
+LABEL base.image="ubuntu:noble"
+LABEL dockerfile.version="1"
+LABEL software="BBTools"
+LABEL software.version=${BBTOOLSVER}
+LABEL description="A set of tools labeled as \"Bestus Bioinformaticus\""
+LABEL website="https://github.com/bbushnell/BBTools"
+LABEL documentation="https://bbmap.org/"
+LABEL license="https://github.com/bbushnell/BBTools/blob/master/license.txt"
+LABEL maintainer="Abigail Shockey"
+LABEL maintainer.email="abigail.shockey@slh.wisc.edu"
+LABEL maintainer2="Padraic Fanning"
+LABEL maintainer2.email="faninnpm AT miamioh DOT edu"
+
+RUN apt-get update && \
+  apt-get install --no-install-recommends -y \
+  openjdk-25-jre-headless \
+  pigz \
+  pbzip2 \
+  lbzip2 \
+  bzip2 \
+  libcurl4-gnutls-dev \
+  libdeflate-dev \
+  wget \
+  ca-certificates \
+  procps && \
+  rm -rf /var/lib/apt/lists/* && \
+  apt-get autoclean
+
+# copy samtools to image
+COPY --from=samtools /usr/local/bin/* /usr/local/bin/
+COPY --from=htslib /usr/local/bin/* /usr/local/bin/
+COPY --from=htslib /usr/local/lib/ /usr/local/lib/
+COPY --from=htslib /usr/local/include/ /usr/local/include/
+
+# download and install sambamba
+RUN wget -q https://github.com/biod/sambamba/releases/download/v${SAMBAMBAVER}/sambamba-${SAMBAMBAVER}-linux-amd64-static.gz && \
+  gzip -d sambamba-${SAMBAMBAVER}-linux-amd64-static.gz && \
+  mv sambamba-${SAMBAMBAVER}-linux-amd64-static /usr/local/bin/sambamba && \
+  chmod +x /usr/local/bin/sambamba
+
+# download and install bbtools
+RUN wget -q https://sourceforge.net/projects/bbmap/files/BBMap_${BBTOOLSVER}.tar.gz && \
+  tar -xzf BBMap_${BBTOOLSVER}.tar.gz && \
+  rm BBMap_${BBTOOLSVER}.tar.gz && \
+  mkdir /data
+
+
+ENV PATH=/bbmap/:$PATH \
+  LC_ALL=C
+
+SHELL ["/bin/bash", "-c"]
+
+CMD ["tail", "-n", "90", "/bbmap/docs/TableOfContents.txt"]
+
+WORKDIR /data
+
+# testing
+FROM app AS test
+
+WORKDIR /test
+
+RUN tail -n 90 /bbmap/docs/TableOfContents.txt
+
+# get test data and test one thing that uses samtools/sambamba
+RUN wget -q https://raw.githubusercontent.com/StaPH-B/docker-builds/master/tests/SARS-CoV-2/SRR13957123.primertrim.sorted.bam && \
+  streamsam.sh in='SRR13957123.primertrim.sorted.bam' out='test_SRR13957123.primertrim.sorted.fastq.gz' && \
+  test -f test_SRR13957123.primertrim.sorted.fastq.gz
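
The test stage above confirms only that the output file exists (`test -f`). A slightly stricter check is sketched below as a hypothetical `check_fastq_gz` helper, which is not part of this Dockerfile. It also confirms that the file is non-empty, passes a gzip integrity test, and holds a line count divisible by four, on the assumption that each FASTQ record spans exactly four lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not part of the committed Dockerfile):
# a stricter sanity check for a gzipped FASTQ file.
check_fastq_gz() {
  local f="$1"

  # File must exist and be non-empty.
  [ -s "$f" ] || { echo "missing or empty: $f" >&2; return 1; }

  # The gzip stream must decompress cleanly.
  gzip -t "$f" 2>/dev/null || { echo "corrupt gzip: $f" >&2; return 1; }

  # Assumption: unwrapped FASTQ, so every record is exactly four lines.
  local lines
  lines=$(( $(gzip -dc "$f" | wc -l) ))
  if [ "$lines" -le 0 ] || [ $((lines % 4)) -ne 0 ]; then
    echo "unexpected line count ($lines): $f" >&2
    return 1
  fi
}
```

In the test stage this could be called as `check_fastq_gz test_SRR13957123.primertrim.sorted.fastq.gz` in place of the bare `test -f`.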
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+# BBTools container
+
+Main tool: [BBTools](https://bbmap.org/)
+
+Code repository: https://sourceforge.net/projects/bbmap/ and https://github.com/bbushnell/BBTools
+
+Additional tools:
+
+- samtools: 1.23.1
+- htslib: 1.23.1
+- sambamba: 1.0.1
+
+Basic information on how to use this tool:
+
+- executable: `*.sh`
+- help: Program descriptions and options are shown when running the shell scripts with no parameters.
+- version: `--version`
+- description:
+  > BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file formats such as fastq, fasta, sam, scarf, fasta+qual, compressed or raw, with autodetection of quality encoding and interleaving.
+
+Additional information:
+
+| Script | Purpose | Comment |
+|--------|---------|---------|
+| **bbcms.sh** | Performs error correction using a Count-Min Sketch | Intended for metagenome assembly |
+| **bbcountunique.sh** | Counts unique kmers in reads | |
+| **bbduk.sh** | Trims, filters or masks reads using kmers | |
+| **bbmap.sh** | Splice-aware aligner for short reads | |
+| **bbmapskimmer.sh** | BBMap version designed for high levels of multimapping | |
+| **bbmask.sh** | Masks references based on various things, such as sequence complexity | |
+| **bbmerge.sh** | Merges overlapping paired reads | |
+| **bbmerge-auto.sh** | Same as bbmerge, but tries to allocate all memory on the node | Use this version for kmer operations like extend |
+| **bbnorm.sh** | Normalizes reads based on coverage | Mainly for use prior to single-cell assembly |
+| **bbsplit.sh** | BBMap version that maps to multiple references simultaneously | Intended for decontamination; similar to Seal |
+| **bbversion.sh** | Prints the version of BBTools | |
+| **bbwrap.sh** | Wraps BBMap to process many files using same reference | Saves time by loading the index only once |
+| **calctruequality.sh** | Allows recalibration of quality scores from mapped reads | This generates the correction matrix; BBDuk does the recalibration |
+| **callgenes.sh** | Fast prokaryotic gene caller | Integrated into BBSketch |
+| **callvariants.sh** | Fast variant caller | |
+| **callvariants2.sh** | Same as callvariants.sh with the "multisample" flag | |
+| **clumpify.sh** | Shrinks compressed fastq files, and can remove duplicate reads | Also supports error correction |
+| **comparesketch.sh** | Compares sketches locally, without using a sketch server | |
+| **crossblock.sh** | Alias for decontaminate.sh | |
+| **cutgff.sh** | Cuts out features defined by gff file | E.g, generates one fasta entry per gene from a gff and an assembly |
+| **cutprimers.sh** | Cuts out subregions of ribosomes | Mainly for 16S analysis |
+| **decontaminate.sh** | Pool-level decontamination for single-cell MDA-amplified genomes | |
+| **dedupe.sh** | Removes duplicate and fully-contained sequences | Can also be used to cluster 16S sequences |
+| **dedupe2.sh** | Version of dedupe that supports more hash keys for greater sensitivity | |
+| **dedupebymapping.sh** | Deduplicates reads based on mapping coordinates | |
+| **demuxbyname.sh** | Demultiplexes based on sequences headers | |
+| **filterbyname.sh** | Filters based on sequence headers | |
+| **filterbytaxa.sh** | Filters sequences based on taxonomic classification | Used with NCBI datasets |
+| **filterbytile.sh** | Removes reads that are in low quality areas on flowcell | |
+| **filterqc.sh** | Part of JGI's fastq filtering pipeline | |
+| **filtersam.sh** | Filters sam files to remove reads with multiple unsupported mismatches | Designed for NovaSeq |
+| **gitable.sh** | Used to process NCBI taxonomy data | |
+| **khist.sh** | Alias for bbnorm.sh with flags for making a kmer frequency histogram | |
+| **kmercountexact.sh** | Counts kmers and produces a histogram | Uses more memory than BBNorm but allows exact counts |
+| **kmercountmulti.sh** | Cardinality estimation over multiple kmer lengths | Uses LogLog; does not produce a histogram |
+| **mapPacBio.sh** | BBMap version designed for PacBio or Nanopore reads | Reads longer than 5kbp get broken into 5kbp shreds |
+| **mergesketch.sh** | Allows multiple sketches to be combined | |
+| **msa.sh** | Alignment tool | Used with cutprimers.sh to cut subsections out of 16s |
+| **mutate.sh** | Generates synthetic genomes by randomly mutating the input | |
+| **muxbyname.sh** | Multiplex multiple files, renaming sequences based on input file name | Opposite of demuxbyname.sh |
+| **partition.sh** | Splits a sequence file into multiple files | |
+| **pileup.sh** | Calculates coverage from sam files | |
+| **plotflowcell.sh** | Produces statistics about flowcell positions | |
+| **processhi-c.sh** | Custom trimming for hi-C reads | In development |
+| **randomreads.sh** | Generates synthetic data from real genome reference | Highly customizable |
+| **readqc.sh** | Short read quality report | Alternative to fastqc |
+| **reformat.sh** | Converts sequence files to another format | Has many additional options, includes subsampling |
+| **rename.sh** | Renames sequences in various ways, such as adding a prefix | |
+| **repair.sh** | Fixes broken pairing in fastq files | |
+| **representative.sh** | Makes a smaller subset of a reference dataset by eliminating redundancy | Designed for use with BBSketch output |
+| **rqcfilter2.sh** | Filtering pipeline used at JGI | portal.nersc.gov/dna/microbial/assembly/bushnell/RQCFilterData.tar |
+| **seal.sh** | Counts kmer matches between query and reference sequences | |
+| **sendsketch.sh** | Fast taxonomic classifier using webservers at JGI | |
+| **shred.sh** | Breaks sequences into shorter, fixed-length pieces | |
+| **shuffle.sh** | Randomly reorders input file | Crashes if input doesn't fit in memory |
+| **shuffle2.sh** | Randomly reorders input file | Supports larger files, but output might be less random |
+| **sketch.sh** | Makes reference sketches on a per-TaxID basis | |
+| **sketchblacklist.sh** | Makes sketch blacklists of common kmers | |
+| **sortbyname.sh** | Sorts sequences by name, length, quality, taxa, and other things | |
+| **summarizequast.sh** | Generates box plots for multiple quast reports | |
+| **tadpipe.sh** | Preprocessing and assembly pipeline using tadpole | |
+| **tadpole.sh** | Fast short read assembler | |
+| **tadwrapper.sh** | Runs Tadpole with multiple kmer lengths to select the best assembly | |
+| **taxserver.sh** | Starts taxonomy and sketch servers | |
+| **testformat.sh** | Determines if file is fasta, fastq, interleaved, etc. by reading first few lines | |
+| **testformat2.sh** | Generates extensive statistics by reading the full file | |
+| **translate6frames.sh** | Translates nucleotide sequence into amino acid sequence in all frames | |
+| **vcf2gff.sh** | Converts vcf format to gff format | |
+
+
+Full documentation: https://bbmap.org/docs
+
+## Example Usage
+
+(adapted from `/opt/bbmap/pipelines/covid/processCorona.sh`)
+
+Interleave a pair of FASTQ files for downstream processing:
+
+```text
+reformat.sh \
+  in1=${SAMPLE}_R1.fastq.gz \
+  in2=${SAMPLE}_R2.fastq.gz \
+  out=${SAMPLE}.fastq.gz
+```
+
+Split into SARS-CoV-2 and non-SARS-CoV-2 reads:
+
+```text
+bbduk.sh ow -Xmx1g \
+  in=${SAMPLE}.fq.gz \
+  ref=REFERENCE.fasta \
+  outm=${SAMPLE}_viral.fq.gz \
+  outu=${SAMPLE}_nonviral.fq.gz \
+  k=25
+```
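
The two README steps above can be chained through the container roughly as follows. This is a sketch under stated assumptions, not the maintainers' documented invocation: the image tag `staphb/bbtools:39.91`, the `BBTOOLS_IMAGE` and `DRY_RUN` variables, and the helper names `bbtools_run` and `process_sample` are all illustrative. It relies on two things the Dockerfile does show: `WORKDIR /data`, so the current directory is bind-mounted there, and the use of `CMD` rather than `ENTRYPOINT`, so a script name given after the image replaces the default command. The README's first step writes `${SAMPLE}.fastq.gz` while its second reads `${SAMPLE}.fq.gz`; the sketch uses the `.fastq.gz` name for both so the steps connect.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, variables and the image tag are assumptions.
BBTOOLS_IMAGE="${BBTOOLS_IMAGE:-staphb/bbtools:39.91}"

# Run one BBTools script inside the container, with the current directory
# mounted at /data (the image's working directory).
bbtools_run() {
  local cmd=(docker run --rm -v "$PWD:/data" "$BBTOOLS_IMAGE" "$@")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"   # print the command instead of executing it
  else
    "${cmd[@]}"
  fi
}

# Interleave paired reads, then split them by k-mer match to the reference.
process_sample() {
  local sample="$1" ref="$2"
  bbtools_run reformat.sh \
    in1="${sample}_R1.fastq.gz" \
    in2="${sample}_R2.fastq.gz" \
    out="${sample}.fastq.gz" &&
  bbtools_run bbduk.sh ow -Xmx1g \
    in="${sample}.fastq.gz" \
    ref="$ref" \
    outm="${sample}_viral.fq.gz" \
    outu="${sample}_nonviral.fq.gz" \
    k=25
}
```

Running `DRY_RUN=1 process_sample SAMPLE REFERENCE.fasta` prints the two `docker run` commands without executing them, which is a quick way to check the paths before processing real data.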
