File cadd_anno_header: File containing VCF header lines describing the annotations.
- See the VCF 4.2 specification for full details.
- For example, one line could be:
##INFO=<ID=SIFTval,Number=1,Type=String,Description="SIFT score">
- It is best to set Type=String to be robust to missing values, which are coded in unpredictable formats.
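As a minimal sketch of how such header lines could be generated, the snippet below builds one INFO line per annotation, always with Type=String as recommended above. The helper name `info_header_line` and the annotation names and descriptions in `annotations` are illustrative assumptions, not part of the workflow.

```python
def info_header_line(field_id: str, description: str) -> str:
    """Return one VCF INFO header line declaring a single String-typed value."""
    # A double quote inside the description would end the quoted value early,
    # so replace it with a single quote.
    safe = description.replace('"', "'")
    return f'##INFO=<ID={field_id},Number=1,Type=String,Description="{safe}">'


# Hypothetical annotations; list the ones you actually keep from the cache.
annotations = {
    "SIFTval": "SIFT score",
    "PHRED": "CADD PHRED-scaled score",
}

header_text = "".join(
    info_header_line(field_id, text) + "\n" for field_id, text in annotations.items()
)
print(header_text, end="")
```

The printed text can be saved to a file and supplied as cadd_anno_header.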
File cadd_cache: CADD annotations in tabular format.
- For example, this file of CADD v1.6 annotations, which can be downloaded from https://cadd.gs.washington.edu.
File cadd_cache_idx: Tabix index (.tbi) file for cadd_cache.
File cadd_cols2keep: File indicating which columns of the cadd_cache to keep. See bcftools annotate -C documentation for full details, but briefly:
- Columns must be listed in the order they appear in cadd_cache.
- Columns representing chromosome, position, reference and alternate alleles must be labelled CHROM, POS, REF, ALT.
- Columns to drop are listed as -. Columns to keep are given a name.
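The rules for cadd_cols2keep can be sketched as a small generator that walks the cache columns in order and emits one entry per line. The function name `build_columns_file`, the six-column table, and the source column names `Chrom`, `Pos`, `Ref`, `Alt` are assumptions for illustration; check them against the header of your own cadd_cache.

```python
def build_columns_file(cache_columns, keep):
    """Return the columns-file text, one entry per line, in cache order.

    cache_columns: column names in the order they appear in the cache table.
    keep: set of annotation column names to retain.
    """
    # Assumed names of the four key columns in the cache, mapped to the
    # fixed labels required for them.
    key_labels = {"Chrom": "CHROM", "Pos": "POS", "Ref": "REF", "Alt": "ALT"}
    entries = []
    for name in cache_columns:
        if name in key_labels:
            entries.append(key_labels[name])
        elif name in keep:
            entries.append(name)  # kept column, written under its own name
        else:
            entries.append("-")  # dropped column
    return "\n".join(entries) + "\n"


# Hypothetical six-column cache; real CADD tables have many more columns.
cache_columns = ["Chrom", "Pos", "Ref", "Alt", "Type", "SIFTval"]
columns_text = build_columns_file(cache_columns, keep={"SIFTval"})
print(columns_text, end="")
```

Every kept name should also have a matching line in cadd_anno_header so that the annotated VCF declares the field.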
File chr_rename_file: A file with two columns of chromosome codes: one with the chromosome names in your vcfs, and the other with chromosomes named as 1,2,...22,X.
- This is used to make vcfs which have the chromosome naming scheme chr1,chr2... etc. compatible with the cadd_cache.
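A rename file of this shape, together with the reverse mapping used for chr_unrename_file, could be generated as follows. This is a sketch under two assumptions: that the first column holds the existing name and the second the new name (the convention of bcftools annotate --rename-chrs), and that the columns are tab-separated. The function name `chromosome_maps` is hypothetical.

```python
def chromosome_maps(prefix="chr"):
    """Return (rename_text, unrename_text) for chromosomes 1-22 and X."""
    bare_names = [str(n) for n in range(1, 23)] + ["X"]
    # Assumed layout: old name, tab, new name, one pair per line.
    rename_text = "".join(f"{prefix}{name}\t{name}\n" for name in bare_names)
    unrename_text = "".join(f"{name}\t{prefix}{name}\n" for name in bare_names)
    return rename_text, unrename_text


rename_text, unrename_text = chromosome_maps()
print(rename_text.splitlines()[0])     # first pair of the rename file
print(unrename_text.splitlines()[-1])  # last pair of the reverse file
```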
File chr_unrename_file: Similar to chr_rename_file, but maps the chromosome codes back to how they were before.
File gerp_bw: BigWig (.bw) file of GERP scores downloadable here (used by VEP's loftee plugin).
File human_ancestor_seq: Human ancestor sequence file downloadable here (used by VEP's loftee plugin).
File phylocsf_db: SQL database of PhyloCSF metrics downloadable here (used by VEP's loftee plugin).
File phylop100_bw: BigWig (.bw) file of phyloP100way scores, downloadable from UCSC here.
- These scores represent the degree to which variants are conserved in a collection of 100 non-human vertebrate species. For more information, see this page of the UCSC Genome Browser site.
Array[File] vcfs: VCF (or BCF) file(s) to be annotated.
- The files must contain INFO/AC and INFO/AN fields at minimum.
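Before submitting a run, the AC/AN requirement can be checked roughly by scanning the header of a text VCF. The sketch below is not part of the workflow: `missing_info_fields` is a hypothetical helper, it assumes ID= is the first key in each ##INFO line, and it only confirms that the fields are declared in the header, not that every record carries them. For BCF or compressed input, the header lines would first need to be extracted as text.

```python
def missing_info_fields(header_lines, required=("AC", "AN")):
    """Return the required INFO IDs that the VCF header does not declare."""
    prefix = "##INFO=<ID="
    declared = set()
    for line in header_lines:
        if not line.startswith("##"):
            break  # meta-information lines end at the #CHROM line
        if line.startswith(prefix):
            # The ID runs from the end of the prefix up to the first comma.
            declared.add(line[len(prefix):].split(",", 1)[0])
    return [field for field in required if field not in declared]


example_header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
print(missing_info_fields(example_header))
```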
File vep_cache: v115 of the cache file for Ensembl's Variant Effect Predictor (VEP) (downloadable here).
(Optional) File filter_regions: File of regions to filter the vcfs by, one region per line.
- See bcftools documentation about the -R option for full details.
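As an illustration of what a regions file might contain, the sketch below writes intervals in the tab-delimited form that bcftools accepts for -R: chromosome, start, end, with 1-based inclusive coordinates. The helper name `regions_file_text` and the example intervals are made up for illustration; note also that bcftools treats files ending in .bed as 0-based, half-open BED instead.

```python
def regions_file_text(regions):
    """Return tab-delimited region lines (CHROM, BEG, END), one per region."""
    lines = []
    for chrom, start, end in regions:
        # Coordinates are 1-based and inclusive in this format.
        if start < 1 or end < start:
            raise ValueError(f"invalid interval: {chrom}:{start}-{end}")
        lines.append(f"{chrom}\t{start}\t{end}")
    return "\n".join(lines) + "\n"


# Hypothetical intervals; chromosome names must match those in the vcfs.
regions_text = regions_file_text([("chr1", 1000, 2000), ("chrX", 500, 500)])
print(regions_text, end="")
```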
(Optional) File filter_samples: File of sample IDs to filter the vcfs by, one ID per line.
- See bcftools documentation about the -S option for full details.
(Optional) Int n_cpu: Number of cores to allocate. More cores will make the workflow finish more quickly, but also cost slightly more.
- For example, a run that took 3hr:45min and $1.75 on 8 cores took 1hr:30min and $2.73 on 32 cores.
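The trade-off in that example can be worked through with a few lines of arithmetic. The figures are the two runs quoted above; the variable names are only for illustration, and other inputs or platforms may scale differently.

```python
def to_minutes(hours, minutes):
    """Convert an hours:minutes duration to minutes."""
    return hours * 60 + minutes


# The two runs quoted in the n_cpu description.
small = {"cores": 8, "minutes": to_minutes(3, 45), "cost": 1.75}
large = {"cores": 32, "minutes": to_minutes(1, 30), "cost": 2.73}

speedup = small["minutes"] / large["minutes"]
cost_ratio = large["cost"] / small["cost"]
core_hours_small = small["cores"] * small["minutes"] / 60
core_hours_large = large["cores"] * large["minutes"] / 60

print(f"speedup: {speedup:.2f}x")
print(f"cost ratio: {cost_ratio:.2f}x")
print(f"core-hours: {core_hours_small:.0f} vs {core_hours_large:.0f}")
```

In this example, four times the cores gave a 2.5x speedup for about 1.56x the cost, which tracks the rise in core-hours from 30 to 48.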