Commit b311536

Update README (#38)
* Bump version, update kraken db
* Add details on additional analyses & outputs
1 parent 266d1e8 commit b311536

2 files changed: +25 -7 lines changed

README.md

Lines changed: 21 additions & 3 deletions
@@ -11,7 +11,9 @@ A generic pipeline that can be run routinely on all Illumina sequence runs, rega
 
 * Parse run-level QC statistics from the 'InterOp' directory and write to `.csv` and `.json` format.
 * [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/): sample-level sequence quality metrics
-* [Kraken2](https://github.com/DerrickWood/kraken2) + [Bracken](https://github.com/jenniferlu717/Bracken): Taxonomic classification
+* [`mash`](https://github.com/marbl/Mash): Estimate genome size and depth of coverage
+* [`seqtk fqchk`](https://github.com/lh3/seqtk): Measure average sequence quality score, percent of bases above Q30 and GC content.
+* [`kraken2`](https://github.com/DerrickWood/kraken2) + [`bracken`](https://github.com/jenniferlu717/Bracken): Taxonomic classification
 of reads. Estimation of relative abundances of taxonomic groups (genus, species) in each sample.
 * [MultiQC](https://github.com/ewels/MultiQC): Collect several QC metrics into a single interactive HTML report.
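
The `seqtk fqchk` bullet added in this hunk names three sample-level metrics: average quality score, percent of bases above Q30, and GC content. As a rough illustration of what those numbers mean (not the pipeline's or `seqtk`'s actual implementation), the sketch below computes them from in-memory reads. The function name `fastq_summary`, the Phred+33 offset, and the inclusive `>=` comparison against the threshold are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple


def fastq_summary(
    records: Iterable[Tuple[str, str]],
    threshold: int = 30,
    phred_offset: int = 33,
) -> Dict[str, float]:
    """Summarise (sequence, quality_string) pairs.

    Assumes Phred+33 encoded qualities and counts a base as passing
    when its score is at or above `threshold` (an assumption; the
    README only says "above Q30").
    """
    total_bases = 0
    gc_bases = 0
    passing_bases = 0
    quality_sum = 0

    for sequence, quality in records:
        total_bases += len(sequence)
        gc_bases += sum(1 for base in sequence.upper() if base in "GC")
        scores = [ord(char) - phred_offset for char in quality]
        quality_sum += sum(scores)
        passing_bases += sum(1 for score in scores if score >= threshold)

    if total_bases == 0:
        # Avoid dividing by zero for an empty sample.
        return {
            "total_bases": 0,
            "average_quality": 0.0,
            "percent_bases_at_or_above_threshold": 0.0,
            "percent_gc": 0.0,
        }

    return {
        "total_bases": total_bases,
        "average_quality": quality_sum / total_bases,
        "percent_bases_at_or_above_threshold": 100.0 * passing_bases / total_bases,
        "percent_gc": 100.0 * gc_bases / total_bases,
    }
```

The `threshold` argument plays the same role as the `--seqtk_fqchk_threshold` flag documented in the usage block, whose default in `nextflow.config` is 30.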

@@ -22,6 +24,9 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
 [--instrument_type nextseq] \
 [--kraken2_db /path/to/kraken2_db] \
 [--bracken_db /path/to/bracken_db] \
+[--seqtk_fqchk_threshold <threshold>] \
+[--mash_sketch_kmer_size <kmer_size>] \
+[--mash_sketch_minimum_copies <copies>] \
 --run_dir <your illumina run directory> \
 --outdir <output directory>
 ```
@@ -33,6 +38,8 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
 ├── abundance_top_n
 │   ├── top_3_abundances_genus.csv
 │   └── top_5_abundances_species.csv
+├── basic_qc_stats
+│   └── basic_qc_stats.csv
 ├── bracken
 │   ├── <sample_id>_Genus_bracken_abundances.tsv
 │   ├── <sample_id>_Genus_bracken.txt
@@ -45,13 +52,24 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
 │   ├── ...
 ├── interop_summary
 │   ├── interop_index-summary.csv
-│   └── interop_summary.csv
+│   ├── interop_summary.csv
+│   └── interop_summary.json
 ├── kraken2
 │   ├── <sample_id>_kraken2.txt
 │   ├── ...
+├── mash_sketch
+│   ├── <sample_id>_R1_mash_sketch.txt
+├── mash_sketch_summary
+│   └── mash_sketch_summary.csv
 ├── multiqc
 │   ├── multiqc_data
 │   └── multiqc_report.html
 └── parse_sample_sheet
-    └── sample_sheet.json
+│   └── sample_sheet.json
+├── pipeline_complete.json
+├── seqtk_fqchk
+│   ├── <sample_id>_seqtk_fqchk_all_positions.csv
+│   ├── <sample_id>_seqtk_fqchk_by_position.csv
+└── seqtk_fqchk_summary
+    └── seqtk_fqchk_summary.csv
 ```
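
The output tree in the hunks above gains several run-level files (`basic_qc_stats.csv`, `interop_summary.json`, `mash_sketch_summary.csv`, `seqtk_fqchk_summary.csv`, `pipeline_complete.json`). A downstream consumer might want to confirm they exist before reading them. The following is a minimal sketch of such a check, assuming the tree is rooted at the directory passed as `--outdir`; the helper `missing_outputs` and the constant `EXPECTED_RUN_LEVEL_OUTPUTS` are hypothetical names, not part of the pipeline.

```python
from pathlib import Path
from typing import List, Union

# Run-level files taken from the README output tree in this commit.
# Assumption: paths are relative to the directory given as --outdir.
EXPECTED_RUN_LEVEL_OUTPUTS = (
    "basic_qc_stats/basic_qc_stats.csv",
    "interop_summary/interop_summary.csv",
    "interop_summary/interop_summary.json",
    "mash_sketch_summary/mash_sketch_summary.csv",
    "seqtk_fqchk_summary/seqtk_fqchk_summary.csv",
    "pipeline_complete.json",
)


def missing_outputs(outdir: Union[str, Path]) -> List[str]:
    """Return the expected run-level files that are absent under `outdir`."""
    root = Path(outdir)
    return [
        relative_path
        for relative_path in EXPECTED_RUN_LEVEL_OUTPUTS
        if not (root / relative_path).is_file()
    ]
```

An empty list means every listed file was found. Per-sample files such as `<sample_id>_seqtk_fqchk_by_position.csv` are left out because their names depend on the sample sheet.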

nextflow.config

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 params {
-kraken2_db = "/data/ref_databases/kraken2/2020-12-02_standard"
-bracken_db = "/data/ref_databases/bracken/2020-12-02_standard"
+kraken2_db = "/data/ref_databases/kraken2/2021-05-17_standard"
+bracken_db = "/data/ref_databases/bracken/2021-05-17_standard"
 instrument_type = "miseq"
 seqtk_fqchk_threshold = 30
 mash_sketch_kmer_size = 21
@@ -28,10 +28,10 @@ process {
 
 
 manifest {
-author = 'Dan Fornika'
+author = 'Dan Fornika, Nima Farzaneh'
 description = 'Routine Sequence QC'
 mainScript = 'main.nf'
 nextflowVersion = '>=20.01.0'
-version = '0.2.3'
+version = '0.3.0'
 }
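
The `params` block in `nextflow.config` supplies defaults (for example `seqtk_fqchk_threshold = 30` and `mash_sketch_kmer_size = 21`), and the README usage block lists the matching optional flags for overriding them per run. The sketch below shows one way a wrapper script could assemble that command line. `build_command` and `OPTIONAL_PARAMS` are hypothetical names introduced for illustration, and the paths in the usage lines are placeholders.

```python
import shlex
from typing import Dict, List, Optional, Union

# Optional flags as listed in the README usage block of this commit.
OPTIONAL_PARAMS = (
    "instrument_type",
    "kraken2_db",
    "bracken_db",
    "seqtk_fqchk_threshold",
    "mash_sketch_kmer_size",
    "mash_sketch_minimum_copies",
)


def build_command(
    run_dir: str,
    outdir: str,
    overrides: Optional[Dict[str, Union[str, int]]] = None,
) -> List[str]:
    """Build an argument list for `nextflow run`, adding only overridden params."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(OPTIONAL_PARAMS))
    if unknown:
        raise ValueError(f"Unrecognised parameter(s): {', '.join(unknown)}")

    command = ["nextflow", "run", "BCCDC-PHL/routine-sequence-qc"]
    for name in OPTIONAL_PARAMS:
        if name in overrides:
            command += [f"--{name}", str(overrides[name])]
    command += ["--run_dir", run_dir, "--outdir", outdir]
    return command


if __name__ == "__main__":
    # Placeholder paths; substitute a real run directory and output directory.
    example = build_command(
        "/path/to/run_dir", "/path/to/outdir", {"seqtk_fqchk_threshold": 20}
    )
    print(shlex.join(example))
```

Parameters that are not overridden are simply omitted, so the defaults from `nextflow.config` apply.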
