@@ -11,7 +11,9 @@ A generic pipeline that can be run routinely on all Illumina sequence runs, rega
 
 * Parse run-level QC statistics from the 'InterOp' directory and write to `.csv` and `.json` format.
 * [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/): sample-level sequence quality metrics
-* [Kraken2](https://github.com/DerrickWood/kraken2) + [Bracken](https://github.com/jenniferlu717/Bracken): Taxonomic classification
+* [`mash`](https://github.com/marbl/Mash): Estimate genome size and depth of coverage
+* [`seqtk fqchk`](https://github.com/lh3/seqtk): Measure average sequence quality score, percent of bases above Q30 and GC content.
+* [`kraken2`](https://github.com/DerrickWood/kraken2) + [`bracken`](https://github.com/jenniferlu717/Bracken): Taxonomic classification
 of reads. Estimation of relative abundances of taxonomic groups (genus, species) in each sample.
 * [MultiQC](https://github.com/ewels/MultiQC): Collect several QC metrics into a single interactive HTML report.
 
@@ -22,6 +24,9 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
   [--instrument_type nextseq] \
   [--kraken2_db /path/to/kraken2_db] \
   [--bracken_db /path/to/bracken_db] \
+  [--seqtk_fqchk_threshold <threshold>] \
+  [--mash_sketch_kmer_size <kmer_size>] \
+  [--mash_sketch_minimum_copies <copies>] \
   --run_dir <your illumina run directory> \
   --outdir <output directory>
 ```
@@ -33,6 +38,8 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
 ├── abundance_top_n
 │   ├── top_3_abundances_genus.csv
 │   └── top_5_abundances_species.csv
+├── basic_qc_stats
+│   └── basic_qc_stats.csv
 ├── bracken
 │   ├── <sample_id>_Genus_bracken_abundances.tsv
 │   ├── <sample_id>_Genus_bracken.txt
@@ -45,13 +52,24 @@ nextflow run BCCDC-PHL/routine-sequence-qc \
 │   ├── ...
 ├── interop_summary
 │   ├── interop_index-summary.csv
-│   └── interop_summary.csv
+│   ├── interop_summary.csv
+│   └── interop_summary.json
 ├── kraken2
 │   ├── <sample_id>_kraken2.txt
 │   ├── ...
+├── mash_sketch
+│   ├── <sample_id>_R1_mash_sketch.txt
+├── mash_sketch_summary
+│   └── mash_sketch_summary.csv
 ├── multiqc
 │   ├── multiqc_data
 │   └── multiqc_report.html
 └── parse_sample_sheet
-    └── sample_sheet.json
+│   └── sample_sheet.json
+├── pipeline_complete.json
+├── seqtk_fqchk
+│   ├── <sample_id>_seqtk_fqchk_all_positions.csv
+│   ├── <sample_id>_seqtk_fqchk_by_position.csv
+└── seqtk_fqchk_summary
+    └── seqtk_fqchk_summary.csv
 ```