Installation
metabolisHMM was built to run on Unix machines (Ubuntu Linux or Mac OS) and runs as a series of command line programs. This tool has several dependencies and packages that must be installed before using the various workflows.
You can install metabolisHMM using pip. Because the package is written in Python 3, invoke pip as python3 -m pip (or use pip3) so that it installs for the correct interpreter.
python3 -m pip install metabolisHMM
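Before installing, it can help to confirm that python3 meets the Python 3.6 minimum stated in the conda section of this page, and that pip is attached to that interpreter. The following is a minimal sketch; `python_ok` is a hypothetical helper name, not part of metabolisHMM.

```shell
# Hypothetical helper: succeed only if the given interpreter is Python 3.6 or newer.
python_ok() {
    "${1:-python3}" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)' 2>/dev/null
}

if python_ok python3; then
    # Prints the pip version and the Python installation it belongs to.
    python3 -m pip --version || echo "pip is not available for python3"
else
    echo "python3 is missing or older than 3.6"
fi
```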
If you do not have the correct version of pip3 or don't like installing packages with pip, you can also clone the repository locally, and run the workflows from within the directory, or manually run the setup.py script to add the workflows to your path:
git clone https://github.com/elizabethmcd/metabolisHMM
cd metabolisHMM
python3 setup.py install # automatically adds python workflows to your python path
If you run the workflows from a cloned directory without installing them as a package, you will need to prefix each workflow with either python or ./. To make the heatmap figures, you will also need the make-heatmap.R script. You can get it either by cloning the repository or from the metabolisHMM_v4.0_exec/ auxiliary files of curated markers.
To download the curated markers and auxiliary script, download this tarball:
wget https://github.com/elizabethmcd/metabolisHMM/releases/download/v2.0/metabolisHMM_v4.0_ext.tgz
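Once the download finishes, the tarball needs to be unpacked. Since the layout of the archive is not described here, the sketch below lists the contents before extracting them. `unpack_tarball` is a hypothetical helper name, and the destination directory is up to you.

```shell
# Hypothetical helper: list an archive's contents, then extract it into a target directory.
unpack_tarball() {
    archive=$1
    dest=${2:-.}
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest"
    tar -tzf "$archive"              # show what will be extracted
    tar -xzf "$archive" -C "$dest"   # extract into $dest
}

# Example, after the wget command above has finished:
# unpack_tarball metabolisHMM_v4.0_ext.tgz .
```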
The following dependencies also need to be installed, either together in a single conda environment or manually, with each program added to your path.
Depending on your experience, you may really love or really hate the Anaconda python distribution for using python, corresponding packages, and/or external programs. To use metabolisHMM, all external programs, python packages, and base R + packages for visualization purposes can be installed with Anaconda. If you don't have an Anaconda distribution of python on your machine, visit the Anaconda site and download the correct distribution for your machine, and follow their online instructions. metabolisHMM specifically requires python 3.6 or above, so be sure to download the latest release of Anaconda3. After you have installed python through Anaconda and this distribution of python is added in your path, run the following:
conda create -n metabolishmm -c conda-forge -c bioconda -c defaults biopython hmmer=3.2.1 mafft=7.222 fasttree=2.1.10 prodigal=2.6.3 raxml r r-essentials
You should then be able to activate the metabolishmm environment (named without capitals for ease of typing, though it admittedly looks funny) with conda activate metabolishmm and run the workflows within the metabolisHMM repository. The advantage of installing packages into a conda environment is that they are independent of any other programs or versions on your machine, so the environment contains exactly what metabolisHMM needs to run. These conda installation steps should work on a Mac or an Ubuntu Linux machine.
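After activating the environment, you may want to confirm that the external programs resolve on your path. The sketch below assumes the conda packages install executables named prodigal, hmmsearch, mafft, FastTree, and raxmlHPC-PTHREADS; those names are assumptions and may differ on your system. `check_tools` is a hypothetical helper name.

```shell
# Hypothetical helper: report which of the named programs are on PATH.
# Returns 0 only if every program is found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Run this after `conda activate metabolishmm`.
# The executable names below are assumptions; adjust them to your install.
check_tools prodigal hmmsearch mafft FastTree raxmlHPC-PTHREADS \
    || echo "Some dependencies were not found on PATH."
```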
The manual installation steps below have only been confirmed on an Ubuntu Linux machine. For manual installation on a Mac, you can replace the sudo apt-get install steps with the manual tarball download and installation instructions for each package, or install with Homebrew. Prodigal, HMMER, MAFFT, FastTree, and RAxML are all available either from the main Homebrew distribution or from the homebrew-bio tap.
# prodigal
wget https://github.com/hyattpd/Prodigal/archive/v2.6.3.tar.gz
tar -xzf v2.6.3.tar.gz
cd Prodigal-2.6.3
make install # puts in /usr/local/bin/
# HMMER
wget http://eddylab.org/software/hmmer/hmmer.tar.gz
tar zxf hmmer.tar.gz
cd hmmer-3.2.1 # adjust if the tarball unpacks to a different version directory
./configure --prefix /your/install/path
make
make install
# MAFFT
wget https://mafft.cbrc.jp/alignment/software/mafft-7.450-gcc_fc6.x86_64.rpm
sudo rpm -Uvh mafft-7.450-gcc_fc6.x86_64.rpm # for RPM-based distributions; the file name must match the download above
# for mac download the .pkg file and run to automatically add mafft to the command line
# can also be installed with apt-get if you have sudo privileges
sudo apt-get install mafft
# FastTree
wget http://www.microbesonline.org/fasttree/FastTree.c
gcc -DNO_SSE -O3 -finline-functions -funroll-loops -Wall -o FastTree FastTree.c -lm
sudo mv FastTree /usr/local/bin
# RAxML
git clone https://github.com/stamatak/standard-RAxML
cd standard-RAxML
make -f Makefile.PTHREADS.gcc
# you will need the PTHREADS version, specifically raxmlHPC-PTHREADS in your path to run the raxml option for making phylogenies
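Programs installed to a custom prefix (such as HMMER with --prefix) or built in place (such as raxmlHPC-PTHREADS, which the make step is expected to leave in the standard-RAxML directory) still have to be added to your path. A minimal sketch follows; `prepend_path` is a hypothetical helper name, and the example directories are placeholders.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Examples (paths are placeholders; substitute your own):
# prepend_path /your/install/path/bin      # HMMER installed with --prefix
# prepend_path "$HOME/standard-RAxML"      # directory holding raxmlHPC-PTHREADS
```

To make the change persist across sessions, the same calls can be placed in your shell startup file.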
If you choose to use a python distribution outside of Anaconda, you will need to download python packages using pip:
python3 -m pip install biopython
python3 -m pip install pandas
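To confirm that both packages are importable by the interpreter you plan to use, a quick check along these lines may help. `py_has_module` is a hypothetical helper name; note that Biopython is imported under the module name Bio.

```shell
# Hypothetical helper: succeed only if python3 can import the named module.
py_has_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

for mod in Bio pandas; do      # Biopython is imported as "Bio"
    if py_has_module "$mod"; then
        echo "ok:      $mod"
    else
        echo "missing: $mod"
    fi
done
```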
If you do not have R/RStudio installed, you will need to install base R and the tidyverse, reshape2, and gridExtra packages:
sudo apt-get install r-base build-essential libcurl4-gnutls-dev libxml2-dev libssl-dev # r-base provides base R; the rest are build dependencies for the R packages
# within R
R
install.packages(c('tidyverse','reshape2', 'gridExtra'))
q()
To install R/RStudio and the corresponding packages on your own desktop (a Mac, for example), follow the CRAN instructions to get a local distribution of base R, then run install.packages(c('tidyverse','reshape2', 'gridExtra')) from within base R or RStudio.
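Whichever route you take, you can check from the command line that the R packages load. This is a sketch that assumes Rscript is installed alongside base R; `r_has_package` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if Rscript exists and the named R package is installed.
r_has_package() {
    command -v Rscript >/dev/null 2>&1 || return 2
    Rscript -e "quit(status = if (requireNamespace('$1', quietly = TRUE)) 0 else 1)" >/dev/null 2>&1
}

for pkg in tidyverse reshape2 gridExtra; do
    if r_has_package "$pkg"; then
        echo "ok:      $pkg"
    else
        echo "missing: $pkg"
    fi
done
```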