A Snakemake workflow for TotalRNA analysis from the Department of Environmental Science of Aarhus University.
conda activate snakemake
git clone https://github.com/AU-ENVS-Bioinformatics/TotalRNA-Snakemake
cd TotalRNA-Snakemake
snakemake -c1 skip_rename # or snakemake -c1 rename
snakemake -c100 --use-conda --keep-going
This pipeline manages large-scale TotalRNA meta-transcriptomic data for taxonomic analysis of SSU reads and for mRNA analysis. The steps involved are:
- Trimming reads using Trim Galore.
- Filtering SSU and LSU reads using SortMeRNA and SILVA.
- Reconstructing ribosomal genes using MetaRib.
- Checking the quality of the ribosomal assembly using QUAST.
- Mapping RNA contigs to reads using BWA and samtools.
- Classifying reads taxonomically using BLAST, SILVA and CREST.
- Assembling non-rRNA reads (Trinity) and filtering noncoding RNA using the RFam database.
- Mapping mRNA contigs to reads using BWA and samtools.
- Functional (best-hit) and taxonomic (LCA) annotation of mRNA contigs using DIAMOND and AnnoTree, which includes KEGG, Pfam and TIGRFAM annotations for over 30,000 bacterial and 1,600 archaeal genomes.
Check the Wiki of the project for more information.
It is best to pre-install Mamba before starting. All other dependencies will be installed automatically when running the pipeline for the first time.
conda activate base
mamba create -c conda-forge -c bioconda -n snakemake snakemake
Activate the conda environment:
conda activate snakemake
Clone this git repository to the location where you want to run your analysis.
git clone https://github.com/AU-ENVS-Bioinformatics/TotalRNA-Snakemake TotalRNA-Snakemake-Project
cd TotalRNA-Snakemake-Project
Copy or symlink your raw FASTQ files into the `reads` directory; see reads/README.md for more information. The next step renames those files and creates symlinks to them in the results/renamed directory. To skip renaming, either copy your files directly into results/renamed, or run snakemake -c1 skip_rename to symlink them without renaming.
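Before the rename commands, the copy-or-symlink step could look like the sketch below. It is only an illustration: the function name `link_reads`, the source path, and the `.fastq.gz` suffix are assumptions, not part of the pipeline, and reads/README.md remains the reference for the expected file layout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# link_reads SRC_DIR READS_DIR
# Symlink every *.fastq.gz file found directly in SRC_DIR into READS_DIR.
# Hypothetical helper: the name, arguments and suffix are assumptions.
link_reads() {
    local src_dir reads_dir fq
    src_dir=$(cd "$1" && pwd)   # absolute path, so the links stay valid from any working directory
    reads_dir=$2
    mkdir -p "$reads_dir"
    for fq in "$src_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue   # the glob matched nothing
        ln -sfn "$fq" "$reads_dir/$(basename "$fq")"
    done
}

# Example call with a placeholder source directory:
# link_reads /path/to/sequencing/run reads
```

Symlinking rather than copying avoids duplicating large sequencing files, at the cost of the links breaking if the source directory is later moved.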
snakemake -n rename
snakemake -c1 rename
Check that all your samples are in results/renamed:
ls results/renamed_raw_reads/
Check that the pipeline will behave as expected with a dry run; if it does not, review the configuration file.
snakemake -n --use-conda
Finally, run the whole pipeline. A useful flag is --keep-going, which lets independent jobs continue when one job fails. If you are running in a shared environment, you can keep all the conda environments in a shared location by adding --conda-prefix /path/to/shared/conda/envs.
snakemake -c100 --use-conda --keep-going
Consider re-running the AnnoTree notebook (notebook/annotree.ipynb) interactively with custom parameters.
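For the shared-environment case mentioned above, the following sketch shows one way to assemble the full run command with an optional --conda-prefix. The function name `build_run_cmd` is hypothetical and the path is the same placeholder used in the text; the function only prints the command so it can be inspected before being run by hand.

```shell
#!/usr/bin/env bash
set -euo pipefail

# build_run_cmd CORES [CONDA_PREFIX_DIR]
# Print the snakemake command line for a full run.
# Hypothetical helper: leave the second argument empty to use Snakemake's
# default location for conda environments.
build_run_cmd() {
    local cores=$1 conda_prefix_dir=${2:-}
    local cmd=(snakemake "-c${cores}" --use-conda --keep-going)
    if [ -n "$conda_prefix_dir" ]; then
        cmd+=(--conda-prefix "$conda_prefix_dir")
    fi
    printf '%s\n' "${cmd[*]}"
}

# Print the command for a 100-core run with a shared environment directory.
build_run_cmd 100 /path/to/shared/conda/envs
```

Pointing every user at the same prefix means the environments are created once and then reused, rather than rebuilt in each clone of the repository.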
Please find more information in the Wiki of the project.