Configuration_files

Configuration

Two configuration files are required for this pipeline:

- `config.yaml` contains command line arguments, reference paths, and system options. This is a [yaml](https://en.wikipedia.org/wiki/YAML) file.
- `design.tsv` contains the sample identifiers and the paths to their files.

We suggest that you use the provided script to build the configuration files, and then modify them if needed. Most of the time, this script will be enough. See:

- `prepare_pipeline.py`

However, you can also build them manually: every part of these files is described below.

Automatic configuration building

`prepare_pipeline.py`

The script `prepare_pipeline.py` takes care of the tedious pipeline customization step: it builds both the config file and the design file. By default, this script will not overwrite any existing file.

You can test `prepare_pipeline.py` by running `make all-unit-test`. See the "Testing" section of this documentation for more information.

You can list all available arguments of `prepare_pipeline.py` with its `--help` argument:

# Activate conda environment
conda activate vcf-annotate-snpeff-snpsift

# Read help
python3.8 prepare_pipeline.py --help

Some usage examples:

# In case I want all default parameters, and my VCF files are in vcf_dir:
python3.8 prepare_pipeline.py vcf_dir path/GWASCat.tsv path/GeneSets.gmt path/dbNSFP.tsv

# Same case as above, but
# - I want snpeff not to run with pre-installed genomes
# - I want to search recursively in vcf_dir for VCF files
python3.8 prepare_pipeline.py vcf_dir \
          path/GWASCat.tsv \
          path/GeneSets.gmt \
          path/dbNSFP.tsv \
          --snpeff-extra '-no-genome' \
          --recursive
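
To make the behaviour described above more concrete, here is a minimal sketch of how a design file could be assembled from a directory of VCF files. It is not the actual code of `prepare_pipeline.py`: the helper names `find_vcf_files` and `write_design`, the `force` parameter, the restriction to uncompressed `.vcf` files, and the use of the file name as sample identifier are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def find_vcf_files(vcf_dir, recursive=False):
    """Return the sorted list of *.vcf paths found in vcf_dir (hypothetical helper)."""
    # "**/*.vcf" also matches files located directly in vcf_dir
    pattern = "**/*.vcf" if recursive else "*.vcf"
    return sorted(Path(vcf_dir).glob(pattern))


def write_design(vcf_files, design_path, force=False):
    """Write a two-column design file; refuse to overwrite unless force is set."""
    design_path = Path(design_path)
    if design_path.exists() and not force:
        raise FileExistsError(f"{design_path} already exists")
    with design_path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["Sample_id", "VCF_File"])
        for vcf in vcf_files:
            # Assumption: the sample identifier is the file name without ".vcf"
            writer.writerow([vcf.stem, str(vcf)])
```

Calling `write_design(find_vcf_files("vcf_dir", recursive=True), "design.tsv")` would then mimic the second example above, and a second call on the same output path would raise an error instead of overwriting the file.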

Detailed content of the `config.yaml`

This is a yaml file. The following keys are required (in any order):

# As simple key: value
design: /path/to/design_file.tsv (string)
workdir: /path/to/workdir (string)
threads: maximum number of threads (integer)
singularity_docker_image: name of a docker/singularity image (string)
# As key: list of values
cold_storage:
  - /path/to/cold_storage1 (string)
  - /path/to/cold_storage2 (string)
  ...
# As nested key: key: value
ref:
  GWASCat: /path/to/gwascat.tsv
  GeneSets: /path/to/GeneSets.gmt
  dbNSFP: /path/to/dbNSFP.tsv
params:
  snpeff_extra: Extra parameters (string) for SnpEff
  snpsift_varType_extra: Extra parameters (string) for SnpSift
  snpsift_GWASCat_extra: Extra parameters (string) for SnpSift
  snpsift_GeneSets_extra: Extra parameters (string) for SnpSift
  snpsift_dbNSFP_extra: Extra parameters (string) for SnpSift
workflow:
  multiqc: whether to run MultiQC or not (boolean)

A complete config.yaml file would look like this:

design: design.tsv
workdir: .
threads: 1
singularity_docker_image: docker://continuumio/miniconda3:4.4.10
cold_storage:
  - /media
ref:
  GWASCat: /path/to/gwascat.tsv
  GeneSets: /path/to/GeneSets.gmt
  dbNSFP: /path/to/dbNSFP.tsv
workflow:
  multiqc: true
params:
  copy_extra: --parents --verbose
  snpeff_extra: -v
  snpsift_varType_extra: ""
  snpsift_GWASCat_extra: ""
  snpsift_GeneSets_extra: ""
  snpsift_dbNSFP_extra: "-v"
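
If you write `config.yaml` by hand, a quick sanity check can catch a missing key before the pipeline starts. The sketch below is only an illustration and is not part of the pipeline: `check_config`, `REQUIRED_KEYS` and `REQUIRED_REFS` are hypothetical names, and the function assumes the file has already been parsed into a dictionary (for example with `yaml.safe_load` from PyYAML).

```python
# Top-level keys listed as required in this page, with their expected Python type
REQUIRED_KEYS = {
    "design": str,
    "workdir": str,
    "threads": int,
    "singularity_docker_image": str,
    "cold_storage": list,
    "ref": dict,
    "params": dict,
    "workflow": dict,
}
REQUIRED_REFS = ("GWASCat", "GeneSets", "dbNSFP")


def check_config(config):
    """Return a list of problems found in a parsed config; empty if none."""
    problems = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif not isinstance(config[key], expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    ref = config.get("ref")
    if isinstance(ref, dict):
        for name in REQUIRED_REFS:
            if name not in ref:
                problems.append(f"missing key: ref.{name}")
    return problems


# The complete example above, as it would look once parsed
example = {
    "design": "design.tsv",
    "workdir": ".",
    "threads": 1,
    "singularity_docker_image": "docker://continuumio/miniconda3:4.4.10",
    "cold_storage": ["/media"],
    "ref": {
        "GWASCat": "/path/to/gwascat.tsv",
        "GeneSets": "/path/to/GeneSets.gmt",
        "dbNSFP": "/path/to/dbNSFP.tsv",
    },
    "workflow": {"multiqc": True},
    "params": {"copy_extra": "--parents --verbose", "snpeff_extra": "-v"},
}
print(check_config(example))
```

The check only covers the presence and the type of each key. It does not verify that the referenced paths exist.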

Detailed content of the `design.tsv`

This is a TSV (tab-separated values) file describing your analysis. The column order is not relevant. If you want to build it manually, use your favorite tabular-file editor.

It must contain the following columns:

* Sample_id: the name of each sample
* VCF_File: path to the upstream VCF file

The optional columns are:

* VCF_Index: path to the tbi index of the VCF file
* Any other information

A minimal example would be:

Sample_id	VCF_File
Sample 1	/path/to/file1.vcf
Sample 2	/path/to/file2.vcf
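
A hand-written design file can be checked with a few lines of Python before launching the pipeline. This is a minimal sketch using only the standard library; `read_design` and `REQUIRED_COLUMNS` are hypothetical names and do not come from the pipeline itself.

```python
import csv
import io

REQUIRED_COLUMNS = ("Sample_id", "VCF_File")


def read_design(handle):
    """Parse a tab-separated design file and return its rows as dictionaries."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing column(s): {', '.join(missing)}")
    return list(reader)


# Same content as the minimal example above, with explicit tab characters
example = io.StringIO(
    "Sample_id\tVCF_File\n"
    "Sample 1\t/path/to/file1.vcf\n"
    "Sample 2\t/path/to/file2.vcf\n"
)
rows = read_design(example)
for row in rows:
    print(row["Sample_id"], row["VCF_File"])
```

Because columns are looked up by name, their order does not matter and any additional column is simply carried along in each row.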

Typo corrections and issues are welcome.
