Commit a565a95

Update README.md
1 parent e5e4862 commit a565a95

1 file changed

+2
-22
lines changed

README.md

Lines changed: 2 additions & 22 deletions
@@ -29,8 +29,8 @@ Additional outputs include PDFs of SNP phylogeny (ML tree generated with IQ-Tree
2929
##### Sub-wf 4: Cluster SNP barcoding [WIP]
3030
This is an experimental aspect of the workflow that aims to begin characterizing individual SNPs that are unique to a particular 5-SNP pairwise distance cluster (`--distance 5` in MTBseq). The plan with this sub-wf is to quickly identify which genomic cluster a particular genome may belong to prior to SNP clustering, with the goal of reducing computational resources and speeding up the analysis. All genomes processed in sub-wf 1 will have their SNP profiles compared to the cluster barcode SNPs and be pre-allocated to a preliminary cluster for clustering in sub-wf 2.
3131

32-
In this workflow, all genomes' SNP profiles are merged into a single VCF (grouped by lineage), and the SNP profiles of genomes belonging to the same cluster are compared to all other genomes within the same lineage to calculate the F~TS~ value (fixation index) for each SNP within the cluster population. SNPs that fulfill the following criteria are classified as cluster-specific SNPs:
33-
- F~TS~ = 1
32+
In this workflow, all genomes' SNP profiles are merged into a single VCF (grouped by lineage), and the SNP profiles of genomes belonging to the same cluster are compared to all other genomes within the same lineage to calculate the Fts value (fixation index) for each SNP within the cluster population. SNPs that fulfill the following criteria are classified as cluster-specific SNPs:
33+
- Fts = 1
3434
- Minimum of 20 reads in both strands (20X cov)
3535
- Minimum quality of 20
3636
- Not annotated as: *PE/PPE/PGRS*; *maturase*; *phage*; or *13E12 repeat family protein*
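
The four criteria above can be read as a single per-SNP filter. The following is a minimal Python sketch of that reading, not the workflow's actual implementation: the `SnpRecord` fields, the `is_cluster_specific` function, and the keyword matching on annotations are all hypothetical, and it assumes the 20-read minimum applies to each strand separately.

```python
import re
from dataclasses import dataclass

# Assumption: excluded annotations are detected by keyword matching.
# PE / PPE / PGRS are matched only as standalone tokens so that words
# such as "peptidase" are not excluded by accident.
_EXCLUDED = re.compile(
    r"(?<![A-Za-z])(?:PPE|PE|PGRS)(?![A-Za-z])"
    r"|maturase|phage|13E12 repeat family protein",
    re.IGNORECASE,
)


@dataclass
class SnpRecord:
    """Hypothetical per-SNP summary; field names are illustrative only."""
    position: int
    fts: float        # fixation index of the SNP for the cluster
    fwd_reads: int    # supporting reads on the forward strand
    rev_reads: int    # supporting reads on the reverse strand
    quality: float
    annotation: str


def is_cluster_specific(snp, min_reads=20, min_quality=20):
    """Return True if the SNP meets all four criteria listed above."""
    if snp.fts != 1:
        return False
    # Assumption: the threshold is applied to each strand independently.
    if snp.fwd_reads < min_reads or snp.rev_reads < min_reads:
        return False
    if snp.quality < min_quality:
        return False
    return _EXCLUDED.search(snp.annotation) is None
```

For example, `is_cluster_specific(SnpRecord(1000, 1.0, 25, 30, 40.0, "hypothetical protein"))` passes every check, while the same record annotated as `"PE-PGRS family protein"` is rejected.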
@@ -134,23 +134,3 @@ qsub -S /bin/bash -cwd -V -N nf-main \
134134
-profile igtp,conda_on
135135
# this specifies that the job should be submitted to the IGTP HPC using conda
136136
```
137-
138-
139-
# <u>To be done</u>
140-
- **Negative control wf**
141-
- currently commented out the compile NC results as there is a clash in the input tuples.
142-
143-
- **Summary workflow**
144-
- Write HTML Rmarkdown results for the summary workflow
145-
- Create database summary figures:
146-
- number of genomes
147-
- distribution of lineages
148-
- number of clusters
149-
- identification of new clusters
150-
- expanding clusters
151-
- merging clusters (at higher SNP levels)
152-
- position of SNPs relative to clusters
153-
- Create an Excel WB for each matrix (lineage - improve access for Vero/Elisa).
154-
-
155-
- **Manual modification app**
156-
- continue working on it
