Home
Josh Loecker edited this page Sep 21, 2021 · 15 revisions
This section is partially completed; please check back later.
This is a Snakemake workflow that performs the following steps, using as much parallelization as possible:
- Read a CSV file containing SRR codes, a target output name for each, and whether the reads are paired-end or single-end
- Generate genome files using STAR
- Download each SRR code in parallel using prefetch
- Unpack the `.sra` files using parallel-fastq-dump, generating `.fastq.gz` files
- Optionally trim the resulting `.fastq.gz` files (using Trim Galore)
- Perform FastQC on the parallel-fastq-dump files, and optionally on the resulting trimmed files
- Perform STAR align on the files from parallel-fastq-dump (or the trimmed files) against the generated genome files
- Perform MultiQC, using the files from parallel-fastq-dump, FastQC, and the STAR aligner
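To make the input and the per-sample steps above concrete, here is a minimal Python sketch. It is not taken from the workflow itself. The column order (SRR code, output name, layout), the `PE`/`SE` labels, the lack of a header row, and the names `Sample`, `read_samples`, and `plan_steps` are all assumptions for illustration; the real CSV layout and rule names may differ.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    srr: str      # SRR accession code
    name: str     # target output name
    paired: bool  # True for paired-end reads, False for single-end


def read_samples(handle):
    """Parse rows of (SRR code, output name, PE/SE) into Sample records.

    Assumes no header row and the hypothetical labels "PE" and "SE".
    """
    samples = []
    for row in csv.reader(handle):
        if not row or not row[0].strip():
            continue  # skip blank lines
        srr, name, layout = (field.strip() for field in row[:3])
        layout = layout.upper()
        if layout not in {"PE", "SE"}:
            raise ValueError(f"unknown layout {layout!r} for {srr}")
        samples.append(Sample(srr, name, layout == "PE"))
    return samples


def plan_steps(trim=False):
    """Return one plausible per-sample ordering of the steps listed above."""
    steps = ["prefetch", "parallel-fastq-dump", "FastQC (raw reads)"]
    if trim:
        # Trimming is optional, and so is the second FastQC pass.
        steps += ["Trim Galore", "FastQC (trimmed reads)"]
    steps.append("STAR align")
    return steps


# Placeholder accessions, not real SRR codes.
example = io.StringIO("SRR0000001,sample_a,PE\nSRR0000002,sample_b,SE\n")
samples = read_samples(example)
for sample in samples:
    print(sample.name, "->", ", ".join(plan_steps(trim=True)))
```

Genome generation and MultiQC are left out of `plan_steps` because they run once for the whole workflow rather than once per sample.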
Ultimately, this project produces a series of `.fastq.gz` files, along with reports from FastQC and MultiQC. The `.fastq.gz` files may be used in further analysis.
- [Getting started](Getting-Started)
- Downloading
- Installing
- Running
Created by Josh Loecker and Brandt Bessell