Snakemake workflow: NCF1_variant_calling

This Snakemake pipeline implements the GATK best-practices workflow for calling small genomic variants.

It is adapted from the Snakemake pipeline at https://github.com/snakemake-workflows/dna-seq-gatk-variant-calling

The pipeline was extensively modified to support variant calling in duplicated regions, such as the NCF1 gene.

Authors

Eric Karlins

Usage

You'll first need to create a reference fasta file in which the duplicated regions that match your region of interest are masked with Ns.

Step 1: Create masked reference fasta file

For NCF1, I started from the hg38 reference fasta and created a bed file with two lines covering the NCF1b and NCF1c regions to be masked. This bed file can be found in resources/NCF1_region_to_mask.bed.
Next, I used bedtools maskfasta to create the masked reference file. The command was:

bedtools maskfasta -fi Homo_sapiens_assembly38_plus.fasta -bed NCF1_region_to_mask.bed -fo Homo_sapiens_assembly38_plus_NCF1_mask.fasta
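
A hand-written bed file can easily end up with spaces instead of tabs, or with start and end coordinates swapped. The sketch below is a minimal pre-check that could be run before the maskfasta command above. check_bed is a hypothetical helper name, not part of this workflow, and it only inspects the first three columns.

```shell
# check_bed FILE
# Hypothetical helper: verifies that every data line of a BED file has at
# least three tab-separated columns and that start < end.
# Exits non-zero (and prints the offending line numbers) on any problem.
check_bed() {
    awk -F'\t' '
        # Skip comment and header lines that BED files may carry.
        /^(#|track|browser)/ { next }
        NF < 3 {
            printf "line %d: fewer than 3 tab-separated columns\n", NR
            bad = 1; next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: start and end must be non-negative integers\n", NR
            bad = 1; next
        }
        $2 + 0 >= $3 + 0 {
            printf "line %d: start must be smaller than end\n", NR
            bad = 1
        }
        END { exit (bad + 0) }
    ' "$1"
}

# Example (not executed here):
#   check_bed NCF1_region_to_mask.bed && bedtools maskfasta ...
```

Chaining the check with && means the masking step only runs when the bed file passes.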

Create the bwa index files for your new fasta reference:

bwa index Homo_sapiens_assembly38_plus_NCF1_mask.fasta
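
It can be worth confirming that the masking took effect, ideally before spending time on indexing and alignment. One rough way, sketched below, is to count N bases per sequence in the original and the masked fasta and compare the two: the chromosome carrying the masked regions should gain Ns, and all others should be unchanged. count_n is a hypothetical helper name, and the sketch assumes a plain (uncompressed) fasta file.

```shell
# count_n FILE
# Hypothetical helper: prints one line per sequence in a FASTA file,
# "<sequence name><TAB><number of N or n bases>".
count_n() {
    awk '
        /^>/ {
            # Flush the previous sequence before starting a new one.
            if (name != "") printf "%s\t%d\n", name, n
            name = substr($1, 2)   # header up to the first space, without ">"
            n = 0
            next
        }
        # gsub returns the number of substitutions, i.e. the N count on this line.
        { n += gsub(/[Nn]/, "") }
        END { if (name != "") printf "%s\t%d\n", name, n }
    ' "$1"
}

# Example (not executed here):
#   count_n Homo_sapiens_assembly38_plus.fasta           > before.tsv
#   count_n Homo_sapiens_assembly38_plus_NCF1_mask.fasta > after.tsv
#   diff before.tsv after.tsv
```

The increase reported by diff should not exceed the total length of the intervals in the bed file; it can be smaller if those intervals already contained Ns.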

Step 2: Obtain a copy of this workflow

Clone this repository to your local system, into the place where you want to perform the data analysis.
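
As a sketch of the clone step: the URL below is an assumption derived from the repository name niaid/NCF1_variant_calling, and clone_workflow and its default target directory are hypothetical. The --recurse-submodules option is included on the assumption that the repository's .gitmodules file lists submodules you will need.

```shell
# Assumed URL, derived from the repository name niaid/NCF1_variant_calling.
REPO_URL="https://github.com/niaid/NCF1_variant_calling.git"

# clone_workflow [TARGET_DIR]
# Hypothetical helper: clones the workflow into TARGET_DIR
# (default: ./NCF1_variant_calling), including any submodules.
clone_workflow() {
    target="${1:-NCF1_variant_calling}"
    git clone --recurse-submodules "$REPO_URL" "$target"
}

# Example (not executed here):
#   clone_workflow /path/to/analysis/NCF1_variant_calling
```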

Step 3: Configure workflow

Configure the workflow according to your needs by editing the file config.yaml.
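
Before moving on to execution, a quick check that the expected files are in place can save a failed run. The sketch below assumes the layout of this repository, with Snakefile, config.yaml and samples.tsv at the top level of the working directory; check_workflow_files is a hypothetical helper name. It does not validate the contents of the files.

```shell
# check_workflow_files [DIR]
# Hypothetical helper: reports any of the assumed top-level workflow files
# that are missing or empty in DIR (default: current directory).
check_workflow_files() {
    dir="${1:-.}"
    missing=0
    for f in Snakefile config.yaml samples.tsv; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (not executed here):
#   check_workflow_files . && snakemake --use-conda -n
```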

Step 4: Execute workflow

Test your configuration by performing a dry-run via

snakemake --use-conda -n

Execute the workflow locally via

snakemake --use-conda --cores $N

using $N cores or run it in a cluster environment via

snakemake --use-conda --cluster qsub --jobs 100

or

snakemake --use-conda --drmaa --jobs 100

If you want to pin not only the software stack but also the underlying OS, use

snakemake --use-conda --use-singularity

in combination with any of the modes above. See the Snakemake documentation for further details.
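
The commands above leave $N for you to set. One possible way to pick a value on a shared workstation is sketched below; pick_cores is a hypothetical helper name, and leaving one core free is a design choice of this sketch rather than a requirement of Snakemake.

```shell
# pick_cores
# Hypothetical helper: prints the number of online CPU cores minus one,
# but never less than 1.
pick_cores() {
    if command -v nproc >/dev/null 2>&1; then
        total=$(nproc)
    else
        # Fallback for systems without nproc.
        total=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    fi
    if [ "$total" -gt 1 ]; then
        echo $((total - 1))
    else
        echo 1
    fi
}

N=$(pick_cores)
# Print the command instead of running it, so the value can be reviewed first.
echo "snakemake --use-conda --cores $N"
```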
