TRON-Bioinformatics/qualibam is a bioinformatics pipeline for comprehensive post-alignment quality control of RNA-seq and DNA-seq BAM files. It runs a battery of QC tools — Picard MarkDuplicates, Qualimap BAMQC, Somalier, RustQC, RNA-SeQC, and MuSiC deconvolution — on coordinate-sorted BAM files and aggregates all results into a single interactive MultiQC report.
The pipeline executes the following steps. Steps marked optional are skipped when the corresponding --skip_* flag is set:
- Mark duplicate reads (Picard MarkDuplicates); optional, --skip_markduplicates
- Alignment QC on all samples, RNA and DNA (Qualimap BAMQC); optional, --skip_qualimap
- Sample-swap and relatedness detection (Somalier); optional, --skip_somalier; requires --fasta and --somalier_sites
- All-in-one RNA-seq QC on RNA samples, covering dupradar, featureCounts, Preseq, RSeQC, and Qualimap RNA-seq (RustQC); optional, --skip_rustqc
- RNA-seq QC metrics on RNA samples (RNA-SeQC); optional, --skip_rnaseqc; requires --gtf (a collapsed GTF is auto-generated; override with --rnaseqc_gtf)
- Cell-type deconvolution of bulk RNA data (MuSiC); optional, --skip_music; requires --sc_reference and the RustQC featureCounts output, so it is also skipped when --skip_rustqc is set
- Aggregate QC report (MultiQC)
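The step list above can be read as a small decision table. The following Python sketch models it for a single sample; it is an illustration only, not code from the pipeline. The function name `planned_steps` and the step labels are hypothetical, and the sketch assumes that MarkDuplicates and Somalier run on both RNA and DNA samples, which the list does not state explicitly. It does not model what happens when a required input such as --fasta or --sc_reference is missing.

```python
# Illustrative model of the step list; names are hypothetical, not pipeline code.
STEPS = ["markduplicates", "qualimap", "somalier", "rustqc", "rnaseqc", "music"]
RNA_ONLY = {"rustqc", "rnaseqc", "music"}


def planned_steps(sample_type, skipped=()):
    """Return the QC steps expected to run for one sample.

    sample_type: "RNA" or "DNA", as in the samplesheet `type` column.
    skipped: step names whose --skip_<name> flag is set.
    """
    if sample_type not in ("RNA", "DNA"):
        raise ValueError(f"type must be RNA or DNA, got {sample_type!r}")

    skipped = set(skipped)
    # MuSiC consumes the RustQC featureCounts output, so skipping RustQC
    # implies skipping MuSiC as well.
    if "rustqc" in skipped:
        skipped.add("music")

    steps = []
    for step in STEPS:
        if step in skipped:
            continue
        if step in RNA_ONLY and sample_type != "RNA":
            continue
        steps.append(step)

    # MultiQC aggregates the results of the whole run; no skip flag is listed for it.
    steps.append("multiqc")
    return steps


if __name__ == "__main__":
    print(planned_steps("DNA"))
    print(planned_steps("RNA", skipped={"rustqc"}))
```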
Note
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
First, prepare a samplesheet with your input data that looks as follows:
samplesheet.csv:
sample,type,bam,bai
SAMPLE_RNA1,RNA,/path/to/SAMPLE_RNA1.bam,/path/to/SAMPLE_RNA1.bam.bai
SAMPLE_DNA1,DNA,/path/to/SAMPLE_DNA1.bam,/path/to/SAMPLE_DNA1.bam.bai
Each row represents one sample. The type column must be RNA or DNA; RNA-specific tools (RustQC, RNA-SeQC, MuSiC) run only on samples with type = RNA. See docs/usage.md for the full samplesheet specification.
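Before launching a run, it can help to check the samplesheet against the rules stated above. The sketch below is a standalone pre-flight check, not part of the pipeline, and the function name `check_samplesheet` is hypothetical. It verifies only what this README states: the four columns are present and type is RNA or DNA. It additionally assumes that the type value is case-sensitive and that the sample, bam, and bai fields must be non-empty; docs/usage.md remains the authoritative specification.

```python
import csv
import io

REQUIRED_COLUMNS = ["sample", "type", "bam", "bai"]
ALLOWED_TYPES = {"RNA", "DNA"}  # assumed case-sensitive


def check_samplesheet(text):
    """Return a list of problems found in samplesheet CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        sample_type = (row["type"] or "").strip()
        if sample_type not in ALLOWED_TYPES:
            problems.append(
                f"line {line_no}: type must be RNA or DNA, got {sample_type!r}"
            )
        for col in ("sample", "bam", "bai"):
            if not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col} field")
    return problems


if __name__ == "__main__":
    example = (
        "sample,type,bam,bai\n"
        "SAMPLE_RNA1,RNA,/path/to/SAMPLE_RNA1.bam,/path/to/SAMPLE_RNA1.bam.bai\n"
        "SAMPLE_DNA1,DNA,/path/to/SAMPLE_DNA1.bam,/path/to/SAMPLE_DNA1.bam.bai\n"
    )
    for problem in check_samplesheet(example):
        print(problem)
```

To check a file on disk, read it and pass its contents to the function, for example `check_samplesheet(open("samplesheet.csv").read())`.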
Now, you can run the pipeline using:
nextflow run TRON-Bioinformatics/qualibam \
-profile <docker/singularity/.../institute> \
--input samplesheet.csv \
--gtf /path/to/genes.gtf \
--fasta /path/to/genome.fa \
--somalier_sites /path/to/sites.vcf.gz \
--outdir <OUTDIR>
Use -profile docker or -profile singularity to run with containers (recommended).
Warning
Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.
See docs/usage.md for full documentation on all parameters.
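As an alternative to passing every parameter on the command line, the same values can be kept in a file and supplied through the Nextflow -params-file option, which accepts JSON or YAML. The sketch below writes such a file in JSON. The helper name `render_params`, the file name params.json, and the output directory results are placeholders chosen for illustration; the parameter keys are the ones used in the command above, and the paths must be replaced with real ones.

```python
import json


def render_params(params):
    """Serialize pipeline parameters as JSON text for Nextflow's -params-file."""
    return json.dumps(params, indent=2) + "\n"


# Keys mirror the CLI options shown above, without the leading "--".
# All values are placeholders.
EXAMPLE_PARAMS = {
    "input": "samplesheet.csv",
    "gtf": "/path/to/genes.gtf",
    "fasta": "/path/to/genome.fa",
    "somalier_sites": "/path/to/sites.vcf.gz",
    "outdir": "results",
}

if __name__ == "__main__":
    with open("params.json", "w") as handle:
        handle.write(render_params(EXAMPLE_PARAMS))
```

The resulting file would then be passed as `nextflow run TRON-Bioinformatics/qualibam -profile docker -params-file params.json`. Whether the --skip_* flags are expressed in the file as booleans (for example `"skip_music": true`) is an assumption to confirm against docs/usage.md.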
TRON-Bioinformatics/qualibam was originally written by Ivan Baksic.
We thank the following people for their extensive assistance in the development of this pipeline:
- The nf-core community for providing the pipeline template and shared infrastructure.
If you would like to contribute to this pipeline, please see the contributing guidelines.
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
