- Programs installed/being installed
| Property | value |
|---|---|
| prog_name | samtools |
| publication | https://www.ncbi.nlm.nih.gov/pubmed/19505943 |
| citations_num | 18873 (2019.05.07) |
| first_release_year | 2009? |
| www | http://www.htslib.org/ |
| repo | https://github.com/samtools/samtools |
| lang | C |
| obtained_from | https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2 |
| installed_version | 1.9 |
| installed_version_date | 2018.07.18 |
| newest_version | 1.12 |
| newest_version_date | 2021.03.17 |
| last_ver_check | 2021.03.18 |
| requirements_1 | cc/gcc |
| install_1 | foo-server |
| install_1_dir | /usr/local/bin/samtools |
Build htslib with libdeflate support first. Link: https://github.com/ebiggers/libdeflate
- view/sort SAM/BAM/CRAM files
- index fasta
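The `installed_version` and `newest_version` rows in these tables can be compared mechanically. A minimal sketch, assuming `sort -V` (version sort, as in GNU coreutils) is available; the function name `is_outdated` is made up for this example. A plain string comparison would wrongly rank 1.9 above 1.12.

```shell
#!/bin/sh
# is_outdated INSTALLED NEWEST
# Succeeds (exit 0) when INSTALLED sorts strictly before NEWEST in version order.
# Assumption: `sort -V` is available (GNU coreutils).
is_outdated() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# values taken from the samtools table above
if is_outdated 1.9 1.12; then
    echo "samtools: update available"
fi
```

The same call works for any pair of version strings in the tables, as long as they use plain dotted numbers.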
| Property | value |
|---|---|
| prog_name | picard |
| publication | nope |
| citations_num | ??? |
| first_release_year | ??? |
| www | http://broadinstitute.github.io/picard/ |
| repo | https://github.com/broadinstitute/picard |
| lang | java |
| obtained_from | https://github.com/broadinstitute/picard/releases/download/2.21.2/picard.jar |
| installed_version | 2.21.2 |
| installed_version_date | 2019.10.29 |
| newest_version | 2.25.5 |
| newest_version_date | 2021.05.18 |
| last_ver_check | 2021.05.18 |
| requirements_1 | java 1.8 |
| documentation | http://broadinstitute.github.io/picard/ |
| install_1 | foo-server |
| install_1_dir | /opt/soft/picard_current/ |
| install_1_admin | darked |
| install_2 | bar-server |
| install_2_dir | /opt/soft/picard_current/ (not updated recently) |
| install_2_admin | darked |
- view/sort SAM/BAM/CRAM files
- index fasta
| Property | value |
|---|---|
| prog_name | sambamba |
| publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765878/ |
| citations_num | 185 (2019.05.07) |
| first_release_year | 2012? |
| www | http://lomereiter.github.io/sambamba/ |
| repo | https://github.com/biod/sambamba |
| lang | Dlang |
| obtained_from | https://github.com/biod/sambamba/releases/download/v0.7.0/sambamba-0.7.0-linux-static.gz |
| installed_version | 0.7.0 |
| installed_version_date | 2019.05.29 |
| newest_version | 0.8.0 |
| newest_version_date | 2020.11.30 |
| last_ver_check | 2020.12.29 |
| requirements_1 | none (precompiled binary) |
| install_1 | foo-server |
| install_1_dir | /usr/local/bin/sambamba_0.7.0 |
- view/sort SAM/BAM/CRAM files
Comment: As of 2020.12, sambamba was not faster than samtools in at least some common tasks when both ran with the same number of threads.
| Property | value |
|---|---|
| prog_name | BBMap |
| publication | conference: https://www.osti.gov/biblio/1241166 |
| citations_num | 74 (2019.06.25) |
| first_release_year | earlier than 2014 |
| www_1 | https://sourceforge.net/projects/bbmap/ |
| www_2 | https://jgi.doe.gov/data-and-tools/bbtools/ |
| repo | ?? |
| lang | java/shell/? |
| obtained_from | https://sourceforge.net/projects/bbmap/files/ |
| installed_version | 38.71 |
| installed_version_date | 2019.10.30 |
| newest_version | 38.90 |
| newest_version_date | 2021.02.03 |
| last_ver_check | 2021.02.07 |
| requirements_1 | java |
| install_1 | foo-server |
| install_1_dir | /opt/soft/bbmap_38.70 |
| install_1_admin | darked |
| install_2 | bar-server |
| install_2_dir | /opt/soft/bbmap_38.67 |
| install_2_admin | darked |
#download on a command line:
curl -sSL "https://sourceforge.net/projects/bbmap/files/BBMap_38.71.tar.gz/download" > BBMap_38.71.tar.gz
tar xfv BBMap_38.71.tar.gz
mv bbmap bbmap_38.71
mv -i bbmap_38.71/ /opt/soft/
cd /opt/soft/
ln -s bbmap_38.71 bbmap_current
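On an upgrade, a plain `ln -s` does not replace an existing `bbmap_current` link. A hedged sketch of repointing the link, assuming `ln` supports `-n` (GNU and BSD versions do); `set_current` is a hypothetical helper and the demo runs in a throwaway temporary directory, not in /opt/soft.

```shell
#!/bin/sh
# set_current DIR NAME VERSION
# Points DIR/NAME_current at NAME_VERSION, replacing any older link.
set_current() {
    target="$1/${2}_${3}"
    [ -d "$target" ] || { echo "missing: $target" >&2; return 1; }
    # -n: treat an existing link to a directory as a file; -f: replace it
    ln -sfn "${2}_${3}" "$1/${2}_current"
}

# demo in a temporary directory (version numbers are placeholders)
demo=$(mktemp -d)
mkdir "$demo/bbmap_38.71" "$demo/bbmap_38.90"
set_current "$demo" bbmap 38.71
set_current "$demo" bbmap 38.90
readlink "$demo/bbmap_current"
```

The link target is kept relative, as in the original `ln -s bbmap_38.71 bbmap_current`, so the whole directory can be moved without breaking it.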
- cluster and simplify fastq read names (clumpify.sh)
#example command
/DATA/darked89/soft/bbmap_current/clumpify.sh \
in=idsc-13p_merged_r1.fq \
in2=idsc-13p_merged_r2.fq \
out=idsc-13p_merged_r1.fq.gz \
out2=idsc-13p_merged_r2.fq.gz \
reorder shortname=shrink
#fish shell
for fn in frombam.r1.fq
/opt/soft/bbmap_38.59/clumpify.sh \
in=$fn \
in2=(basename $fn r1.fq)r2.fq \
out=(basename $fn r1.fq)clump.r1.fq.gz \
out2=(basename $fn r1.fq)clump.r2.fq.gz \
reorder shortname=shrink
end
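The fish loop above derives the mate and output file names from the r1 file name with `basename`. The same derivation in POSIX shell, as a sketch; `clumpify_names` is a hypothetical helper and it only prints the names rather than running clumpify.sh.

```shell
#!/bin/sh
# clumpify_names R1_FILE
# Prints the in2, out and out2 names for an input ending in "r1.fq".
clumpify_names() {
    stem=$(basename "$1" r1.fq)   # "frombam.r1.fq" -> "frombam."
    printf '%s\n' "${stem}r2.fq" "${stem}clump.r1.fq.gz" "${stem}clump.r2.fq.gz"
}

clumpify_names frombam.r1.fq
```

For `frombam.r1.fq` this prints `frombam.r2.fq`, `frombam.clump.r1.fq.gz` and `frombam.clump.r2.fq.gz`, one per line.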
- count kmers in fastq file(s)
/opt/soft/bbmap_current/kmercountexact.sh \
in=06a_S2_L001_r1.fq \
out=06a_S2_L001_r1.kmercount_bbmap \
mincount=10000 \
k=8
# output: 06a_S2_L001_r1.kmercount_bbmap
<snip>
>11009
GAGTTGGT
>19055
GATCTGCT
>10025
GCACTCTT
>45528
GCAGCCTG
<snip>
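The output shown above stores each count on a `>` header line followed by the k-mer. A small sketch that turns such pairs into a tab-separated table sorted by descending count; `kmer_table` is a made-up name, and the input format is assumed to be exactly the two-line layout in the excerpt.

```shell
#!/bin/sh
# kmer_table: reads ">count" / "KMER" line pairs on stdin,
# prints "KMER<TAB>count" sorted by descending count.
kmer_table() {
    awk '/^>/ { count = substr($0, 2); next }
         NF   { print $0 "\t" count }' |
        sort -t "$(printf '\t')" -k2,2nr
}

# the four records from the excerpt above
printf '>11009\nGAGTTGGT\n>19055\nGATCTGCT\n>10025\nGCACTCTT\n>45528\nGCAGCCTG\n' | kmer_table
```

On a real file the call would be `kmer_table < 06a_S2_L001_r1.kmercount_bbmap`.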
| Property | value |
|---|---|
| prog_name | IGV |
| publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3346182/ |
| citations_num | |
| first_release_year | 2010?? |
| www | http://software.broadinstitute.org/software/igv/home |
| repo | https://github.com/igvteam/igv |
| lang | java |
| obtained_from | https://data.broadinstitute.org/igv/projects/downloads/2.6/IGV_Linux_2.6.3.zip |
| installed_version | 2.6.3 |
| installed_version_date | 2019.08.23 |
| newest_version | 2.9.4 |
| newest_version_date | 2021.03.17 |
| last_ver_check | 2021.03.18 |
| requirements_1 | java |
| install_1 | foo-server |
| install_1_dir | /opt/soft/igv_2.6.3 |
| install_2 | bar-server |
| install_2_dir | /opt/soft/igv_2.6.3 |
| Property | value |
|---|---|
| prog_name | FastQC |
| publication | in press(??) |
| citations_num | |
| first_release_year | ??? |
| www | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ |
| repo | https://github.com/s-andrews/FastQC |
| lang | java/ |
| obtained_from | http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip |
| installed_version | 0.11.8 |
| installed_version_date | 2018.10.04 |
| newest_version | 0.11.9 |
| newest_version_date | 2020.01.08 |
| last_ver_check | 2020.12.29 |
| requirements_1 | java |
| install_1 | foo-server |
| install_1_dir | /opt/soft/fastqc_0.11.8 |
- fastq quality check
| Property | value |
|---|---|
| prog_name | QoRTs |
| publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4506620/ |
| citations_num | 80 |
| first_release_year | 2015 |
| www | http://hartleys.github.io/QoRTs/ |
| repo | https://github.com/hartleys/QoRTs |
| lang_1 | java |
| lang_2 | R |
| obtained_from | https://github.com/hartleys/QoRTs/archive/v1.3.6.tar.gz |
| installed_version | 1.3.6 |
| installed_version_date | 2019.03.26 |
| newest_version | 1.3.6 |
| newest_version_date | see above |
| last_ver_check | 2020.12.29 |
| requirements_1 | java (put versions) |
| requirements_2 | R (put versions) |
| install_1 | foo-server |
| install_1_dir | /opt/soft/qorts_1.3.6/ |
| install_1_admin | darked |
| install_2 | bar-server |
| install_2_dir | /opt/soft/qorts_1.3.6/ |
| install_2_admin | darked |
#usage
java -jar /opt/soft/qorts_1.3.6/QoRTs.jar QC \
cbrako-fix039_PT_0_2.rna.mrgd_4.clump.r12.star_hg38p13.bam \
/opt/genome/hg38/gencode.v31.annotation.gtf \
tmp_qual_data/
#more options:
java -jar /opt/soft/qorts_1.3.6/QoRTs.jar --man QC
| Property | value |
|---|---|
| prog_name | multiqc |
| publication | https://academic.oup.com/bioinformatics/article/32/19/3047/2196507 |
| citations_num | 317 (2019.06.24) |
| first_release_year | 2016? |
| www | https://multiqc.info/ |
| repo | https://github.com/ewels/MultiQC |
| lang | python |
| obtained_from | pip |
| installed_version | 1.7 |
| installed_version_date | 2018.12.21 |
| newest_version | 1.10 |
| newest_version_date | 2021.03.08 |
| last_ver_check | 2021.03.18 |
| requirements_1 | pip / python 3 for 1.9 |
| install_1 | foo-server |
| install_1_dir | /usr/local/bin/multiqc |
# to create a summary report from e.g. fastqc data for multiple files
multiqc .
| Property | value |
|---|---|
| prog_name | Bedtools |
| publication | in press(??) |
| citations_num | |
| first_release_year | ??? |
| www | ??? |
| repo | https://github.com/arq5x/bedtools2 |
| lang | C++ |
| obtained_from | https://github.com/arq5x/bedtools2/releases/download/v2.28.0/bedtools-2.28.0.tar.gz |
| installed_version | 2.29.0 |
| installed_version_date | 2019.09.03 |
| newest_version | 2.30.0 |
| newest_version_date | 2021.01.23 |
| last_ver_check | 2021.02.07 |
| requirements_1 | g++ /?? |
| docs | https://bedtools.readthedocs.io/en/latest/ |
| tutorial | http://quinlanlab.org/tutorials/bedtools/bedtools.html |
| install_1 | foo-server |
| install_1_dir | /opt/soft/bedtools_2.29.0 |
| install_1_admin | darked |
| install_2 | bar-server |
| install_2_dir | /opt/soft/bedtools_2.29.0 |
| install_2_admin | darked |
| Property | value |
|---|---|
| prog_name | bam-readcount |
| publication | ??? |
| citations_num | ??? |
| first_release_year | 2011 |
| www | ??? |
| repo | https://github.com/genome/bam-readcount |
| lang_1 | C++ |
| obtained_from | https://github.com/genome/bam-readcount/archive/v0.8.0.tar.gz |
| version | 0.8.0 |
| version_date | 2016.10.22 |
| last_ver_check | 2019.06.24 |
| requirements_1 | cmake |
| install_1 | foo-server |
| install_1_dir | /opt/soft/bam-readcount_0.8.0/ |
- stats at a single base resolution for the selected positions
# install
cd bam-readcount_0.8.0
mkdir build
cd build
cmake ..
make
make test
Not developed since 2016
An approximate sequence pattern matcher for FASTQ/FASTA files.
| Property | value |
|---|---|
| prog_name | fqgrep |
| publication | none / https://zenodo.org/record/45105 |
| citations_num | 24? (2019.06.25) |
| first_release_year | 2011 |
| www | |
| repo | https://github.com/indraniel/fqgrep |
| lang | C |
| obtained_from | https://github.com/indraniel/fqgrep/archive/v0.4.4.tar.gz |
| version | 0.4.4 |
| version_date | 2016.01.22 |
| last_ver_check | 2019.06.25 |
| requirements_1 | libtre-dev |
| install_1 | foo-server |
| install_1_dir | /opt/soft/fqgrep_0.4.4/ |
# prerequisites (on Debian)
sudo apt install libtre5 libtre-dev
# 'make' creates the executable.
make
mkdir ./bin
mv -i fqgrep ./bin
# simple search for a given pattern
# searches for TGAAGAGA anywhere in the read, no mismatches; the colored output is visible in the 'most' pager
fqgrep -c -p 'TGAAGAGA' 06a_S2_L001_r1.fq | most
# search with reporting start/end positions of the pattern, sequence etc.
# the trailing "grep TGAAGAGA | awk '{print $7}' | sort -n | uniq -c" part counts how often each start position occurs
fqgrep -r -p 'TGAAGAGA' 06a_S2_L001_r1.fq | grep TGAAGAGA | awk '{print $7}' | sort -n | uniq -c
# with '-m 2' => two mismatches allowed
fqgrep -r -m2 -p 'TGAAGAGA' 06a_S2_L001_r2.fq | grep TGAAGAGA | most
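As a cross-check for the zero-mismatch case, the start-position distribution can also be computed without fqgrep. This is a plain awk sketch, not fqgrep's algorithm: it assumes 4-line FASTQ records, reports only the first exact occurrence per read, and uses awk's 1-based positions, which may differ from fqgrep's convention. `motif_starts` is a hypothetical name.

```shell
#!/bin/sh
# motif_starts PATTERN < reads.fq
# Prints "count position" for the first exact occurrence of PATTERN in each read.
motif_starts() {
    awk -v pat="$1" 'NR % 4 == 2 { p = index($0, pat); if (p) print p }' |
        sort -n | uniq -c
}

# four toy reads: two with the motif at position 1, one at position 3, one without
printf '@r1\nTGAAGAGAAC\n+\nIIIIIIIIII\n@r2\nACTGAAGAGA\n+\nIIIIIIIIII\n@r3\nTGAAGAGATT\n+\nIIIIIIIIII\n@r4\nACGTACGTAC\n+\nIIIIIIIIII\n' |
    motif_starts TGAAGAGA
```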
https://github.com/ngsutils/ngsutils
| Property | value |
|---|---|
| prog_name | bedops |
| publication | https://academic.oup.com/bioinformatics/article/28/14/1919/218826 |
| citations_num | 334 (2019.09.12) |
| first_release_year | 2012? |
| www | https://bedops.readthedocs.io/en/latest/ |
| repo | https://github.com/bedops/bedops |
| lang | C++ |
| obtained_from | https://github.com/bedops/bedops/releases/download/v2.4.36/bedops_linux_x86_64-v2.4.36.tar.bz2 |
| installed_version | 2.4.37 |
| installed_version_date | 2019.05.02 |
| newest_version | 2.4.39 |
| newest_version_date | 2020.04.07 |
| last_ver_check | 2020.12.31 |
| install_1 | foo-server |
| install_1_dir | /opt/soft/bedops_2.4.37/ |
| install_1_admin | darked |
| install_2 | bar-server |
| install_2_dir | /opt/soft/bedops_2.4.36/ |
| install_2_admin | darked |
# distributed as precompiled binaries
# caution: the tarball unpacks into ./bin
#example usage
awk '{ if ($0 ~ "transcript_id") print $0; else print $0" transcript_id \"\";"; }' gencode.v31.annotation.no_head.gtf | gtf2bed - > \
gencode.v31.annotation.no_head.bed
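The awk step in the example above appends an empty `transcript_id` attribute to every GTF line that lacks one (gene records, for instance) before the data reaches gtf2bed. A sketch of that step in isolation, run on two made-up GTF lines; `add_empty_transcript_id` is a hypothetical wrapper and gtf2bed itself is not called here.

```shell
#!/bin/sh
# add_empty_transcript_id: pass GTF lines through, appending
# ' transcript_id "";' to lines that have no transcript_id attribute.
add_empty_transcript_id() {
    awk '{ if ($0 ~ /transcript_id/) print; else print $0 " transcript_id \"\";" }'
}

# made-up records: a gene line (no transcript_id) and an exon line (has one)
printf 'chr1\tsrc\tgene\t1\t9\t.\t+\t.\tgene_id "G1";\n' | add_empty_transcript_id
printf 'chr1\tsrc\texon\t1\t9\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n' | add_empty_transcript_id
```

Only the first line is changed; the exon line passes through untouched.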
| Property | value |
|---|---|
| prog_name | bam |
| publication | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448687/ |
| citations_num | 56 (2019.06.17) |
| first_release_year | 2015? |
| www | https://genome.sph.umich.edu/wiki/BamUtil |
| repo | https://github.com/statgen/bamUtil |
| lang | C++ |
| obtained_from | see repo |
| version | 0.8.0 (??) |
| version_date | 2019.04.20 |
| last_ver_check | 2019.09.04 |
| requirements_1 | libStatGen |
| requirements_1_repo | https://github.com/statgen/libStatGen |
| install_1 | foo-server |
| install_1_dir | /opt/soft/bamutil_20190617/ |
| install_2 | bar-server |
| install_2_dir | /opt/soft/bamutil_20190904/ |
git clone https://github.com/statgen/bamUtil.git
git clone https://github.com/statgen/libStatGen.git
mv bamUtil/ bamutil_20190617
cd bamutil_20190617
make all
#not very informative:
make test
#it runs but reports neither passing tests nor errors
bin/bam stats --in /mnt/vdb1/darked89/proj/mongra_20190506/BWA_bam/6_S1_L001_r12.bwa.bam --basic
Number of records read = 6172826
TotalReads(e6) 6.17
MappedReads(e6) 6.15
PairedReads(e6) 6.17
ProperPair(e6) 6.09
DuplicateReads(e6) 0.00
QCFailureReads(e6) 0.00
MappingRate(%) 99.67
PairedReads(%) 100.00
ProperPair(%) 98.65
DupRate(%) 0.00
QCFailRate(%) 0.00
TotalBases(e6) 468.77
BasesInMappedReads(e6) 467.23
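A single metric can be pulled out of a report like the one above for scripted checks. A minimal sketch, assuming the report has already been captured as text (which stream `bam stats` writes it to is not checked here); `stat_value` and the threshold of 95 are illustrative choices.

```shell
#!/bin/sh
# stat_value KEY: print the value that follows KEY in a "KEY value" report on stdin.
stat_value() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# three lines copied from the report above
report='TotalReads(e6) 6.17
MappedReads(e6) 6.15
MappingRate(%) 99.67'

rate=$(printf '%s\n' "$report" | stat_value 'MappingRate(%)')
echo "MappingRate(%) = $rate"

# flag the sample when the mapping rate is below an arbitrary threshold
if awk -v r="$rate" 'BEGIN { exit !(r < 95) }'; then
    echo "low mapping rate: $rate"
fi
```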
#!/usr/bin/fish
for fn in *bam
/opt/soft/bamutil_current/bin/bam dedup --in $fn --out (basename $fn .bam).md.bam --force --oneChrom --verbose
end
| Property | value |
|---|---|
| prog_name | vcfanno |
| publication | ??? |
| citations_num | ??? |
| first_release_year | ??? |
| www | ??? |
| repo | https://github.com/brentp/vcfanno |
| lang | Go |
| obtained_from | https://github.com/brentp/vcfanno/releases/download/v0.3.1/vcfanno_linux64 |
| installed_version | 0.3.1 |
| installed_version_date | 2018.10.29 |
| newest_version | 0.3.2 |
| newest_version_date | 2019.07.30 |
| last_ver_check | 2020.12.31 |
| requirements_1 | ?? Lua ?? |
| install_1 | foo-server |
| install_1_dir | /opt/soft/vcfanno_0.3.1 |
- primary use:
vcfanno allows you to quickly annotate your VCF with any number of INFO fields from any number of VCFs or BED files.
- status: not tested
| Property | value |
|---|---|
| prog_name | jellyfish |
| publication | https://academic.oup.com/bioinformatics/article/27/6/764/234905 |
| citations_num | 999 (2019.06.25) |
| first_release_year | 2011 |
| www | http://www.genome.umd.edu/jellyfish.html |
| repo | https://github.com/gmarcais/Jellyfish |
| lang | C++ |
| obtained_from | https://github.com/gmarcais/Jellyfish/releases/download/v2.2.10/jellyfish-2.2.10.tar.gz |
| installed_version | 2.2.10 |
| installed_version_date | 2018.05.01 |
| newest_version | 2.3.0 |
| newest_version_date | 2019.07.13 |
| last_ver_check | 2020.12.31 |
| requirements_1 | ?? |
| install_1 | foo-server |
| install_1_dir | /opt/soft/jellyfish_2.2.10 |
# install from source
autoreconf -i
./configure --prefix=/opt/soft
make
make check
make install
# test run
jellyfish bc -m 8 -s 10G -t 16 -o 06a_S2_L001_r12.bc 06a_S2_L001_r1.fq 06a_S2_L001_r2.fq
jellyfish count -m 8 -s 3G -t 16 --bc 06a_S2_L001_r12.bc 06a_S2_L001_r1.fq 06a_S2_L001_r2.fq
# this creates a default mer_counts.jf file
NCBI SRA Tools: https://github.com/ncbi/sra-tools
| Property | value |
|---|---|
| prog_name | sratoolkit |
| publication | ? |
| citations_num | ? |
| first_release_year | 2011 |
| wiki | https://github.com/ncbi/sra-tools/wiki |
| repo | https://github.com/ncbi/sra-tools |
| lang | C++ |
| obtained_from | https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.10.7/sratoolkit.2.10.7-ubuntu64.tar.gz |
| installed_version | 2.10.7 |
| installed_version_date | 2020.05.27 |
| newest_version | 2.10.9 |
| newest_version_date | 2020.12.16 |
| last_ver_check | 2020.12.31 |
| requirements_1 | ?? |
| install_1 | vagrant_deb_buster |
| install_1_dir | |
Downloads are faster when the Aspera client (more precisely, the Aspera CLI software) is available. Get it from: https://downloads.asperasoft.com/
# to get the particular run:
prefetch SRR5272532