vSNP

Reference-based SNP calling. The workflow is used to continually add samples to a dataset and to organize that data in a way that allows for SNP validation.

Pros

high resolution SNP analysis
confidence in SNP calls
visualize SNP differences in tables
workflow provides a predictable and familiar data structure
handles large datasets

Cons

reference based
time intensive
- reference setup
- data validation
subjective SNP filtering

Installation

vSNP can be installed via Anaconda.

If Anaconda is not installed, follow the steps at package_manager setup.

Follow the setup and testing instructions on the vSNP3 GitHub page.

Dependency files

reference.fasta
define_filter.xlsx
metadata.xlsx

Output

Organized by reference type
Step 1 - alignments
Step 2 - VCF collection

Install

conda create -c conda-forge -c bioconda -n vsnp3 vsnp3=3.24
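
Once the environment has been created and activated (`conda activate vsnp3`), it can help to confirm that the scripts used in this walkthrough are on the PATH. The following is a minimal sketch; `check_tools` is a hypothetical helper written for this page, not something vSNP ships.

```shell
# Hypothetical helper (not part of vSNP): report which commands are on PATH.
# Returns the number of missing commands as its exit status.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# The four script names below are the ones used elsewhere on this page.
check_tools vsnp3_path_adder.py vsnp3_step1.py vsnp3_step2.py vsnp3_excel_merge_files.py \
    || echo "some scripts are missing; activate the vsnp3 environment and re-check"
```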

Download test files

cd ~; git clone https://github.com/USDA-VS/vsnp3_test_dataset.git

Add reference:

cd ~/vsnp3_test_dataset/vsnp_dependencies

vsnp3_path_adder.py -d "$(pwd)"

Run

vsnp3_path_adder.py -s

vsnp3_step1.py -r1 ERR766214_R1.fastq.gz -r2 ERR766214_R2.fastq.gz -t Mycobacterium_AF2122

Look over stats

Add VCF to database and run step 2

vsnp3_step2.py -t Mycobacterium_AF2122 -a

Run multiple

BCG samples

ERR766216
ERR766219
ERR766220
ERR766213
ERR766225
ERR766224
SRR398629
ERR766223
ERR234151
SRR7983756
ERR017778
ERR766218
ERR766215
ERR766217
ERR766214
ERR766222
ERR766221
ERR766226
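
Before packaging, it may be worth checking that both reads are present for every accession in the list. The sketch below assumes files are named `<accession>_R1.fastq.gz` and `<accession>_R2.fastq.gz`, matching the step 1 example earlier on this page; `missing_pairs` is a hypothetical helper, and downloads named differently would need the pattern adjusted.

```shell
# Hypothetical helper (not part of vSNP): print accessions that lack an R1 or
# R2 FASTQ in the given directory.
# Assumption: files are named <accession>_R1.fastq.gz / <accession>_R2.fastq.gz.
missing_pairs() {
    local dir="$1" acc
    shift
    for acc in "$@"; do
        if [ ! -e "$dir/${acc}_R1.fastq.gz" ] || [ ! -e "$dir/${acc}_R2.fastq.gz" ]; then
            printf '%s\n' "$acc"
        fi
    done
}

# Example: check three of the accessions above in the current directory.
missing_pairs . ERR766216 ERR766219 ERR766220
```

An empty result means every listed accession has both files.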

Package FASTQs

for fastq in *.fastq.gz; do name=$(echo "$fastq" | sed 's/[._].*//'); mkdir -p "$name"; mv -v "$fastq" "$name"/; done
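
The sed expression cuts each file name at the first `.` or `_`, so both reads of a pair end up in one directory named after the sample. It also means a sample name that itself contains a dot or underscore would be truncated. The sketch below shows the same name derivation as a dry run that only prints the planned moves; `sample_name` and `plan_moves` are hypothetical helpers, not part of vSNP.

```shell
# Hypothetical helper: derive the sample name the same way as the packaging
# one-liner, by cutting the file name at the first '.' or '_'.
sample_name() {
    printf '%s\n' "$1" | sed 's/[._].*//'
}

# Dry run: print where each FASTQ given as an argument would be moved.
plan_moves() {
    local fastq
    for fastq in "$@"; do
        printf '%s -> %s/\n' "$fastq" "$(sample_name "$fastq")"
    done
}

plan_moves ERR766214_R1.fastq.gz ERR766214_R2.fastq.gz
# prints:
# ERR766214_R1.fastq.gz -> ERR766214/
# ERR766214_R2.fastq.gz -> ERR766214/
```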

Loop directories

NUM_PER_CYCLE=4; count=0; for dir in ./*/; do (echo "starting: $dir"; cd "$dir" && vsnp3_step1.py -r1 *_R1*.fastq.gz -r2 *_R2*.fastq.gz) & count=$((count+1)); [[ $((count%NUM_PER_CYCLE)) -eq 0 ]] && wait; done; wait
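
The loop starts one background job per sample directory and pauses after every NUM_PER_CYCLE jobs until that batch finishes. The same pattern can be written as a reusable function; this is a sketch only, and `run_in_batches` is a hypothetical name rather than a vSNP command.

```shell
# Hypothetical helper: run a command (or shell function) once inside each
# directory, starting at most $1 background jobs per batch.
run_in_batches() {
    local per_cycle="$1" cmd="$2" count=0 dir
    shift 2
    for dir in "$@"; do
        # The subshell keeps the cd from affecting the calling shell.
        ( cd "$dir" && "$cmd" ) &
        count=$((count + 1))
        if [ $((count % per_cycle)) -eq 0 ]; then
            wait    # let the current batch finish before starting the next
        fi
    done
    wait            # catch the final, possibly partial batch
}

# Usage with the step 1 command from this page, four samples at a time:
#   step1() { vsnp3_step1.py -r1 *_R1*.fastq.gz -r2 *_R2*.fastq.gz; }
#   run_in_batches 4 step1 ./*/
```

The closing `wait` matters when the sample count is not a multiple of the batch size, because otherwise the next step could start while the last jobs are still running.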

Collect Stats

mkdir stats; cp ./*/*stats.xlsx stats; cd stats; vsnp3_excel_merge_files.py
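
A sample whose step 1 run failed will be absent from the merged file without any warning, because the copy only picks up the stats files that exist. As a rough sketch, the hypothetical helper below lists sample directories with no `*stats.xlsx` file, so they can be rerun before merging.

```shell
# Hypothetical helper (not part of vSNP): print each directory that does not
# contain a file matching *stats.xlsx.
missing_stats() {
    local dir f found
    for dir in "$@"; do
        found=no
        for f in "${dir%/}"/*stats.xlsx; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Usage from the directory holding the sample folders:
#   missing_stats ./*/
```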
