Minimal end-to-end RNA-seq pipeline: FASTQ -> FastQC -> cutadapt -> STAR -> sorted BAM -> featureCounts -> MultiQC -> DESeq2
This repository includes a small test setup so you can run an end-to-end check locally.
Requirements:
- Nextflow (tested with 25.10.3)
- Docker
- Git
From the repo root:
# (Optional) build the local containers (only needed if you use local Docker builds)
docker build -t rnaseq-tools:0.1.0 containers/rnaseq-tools
docker build -t rnaseq-r:0.1.0 containers/rnaseq-r
# run the test
nextflow run main.nf -profile docker \
--samplesheet tests/samplesheet.test.csv \
--genome_fasta tests/ref/genome.fasta \
--genome_gtf tests/ref/genes.gtf \
--design tests/design.test.tsv \
--outdir results_test \
-with-trace "trace-$(date +%Y%m%d-%H%M%S).txt" \
-with-report "report-$(date +%Y%m%d-%H%M%S).html" \
-with-timeline "timeline-$(date +%Y%m%d-%H%M%S).html" \
-resume

If you reuse the same report.html / timeline.html, Nextflow will refuse to overwrite them.
Using timestamped names (as above) avoids that.
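Each $(date ...) in the command above is evaluated separately, so the three file names can carry different timestamps if the second changes between them. A minimal sketch that takes the timestamp once and reuses it; the helper name run_artifact_names is illustrative and not part of the pipeline:

```shell
# Hypothetical helper: print the trace/report/timeline file names for one run,
# all sharing the same timestamp.
run_artifact_names() {
    # Use the stamp passed as $1 if given, otherwise read the clock once.
    local ts="${1:-$(date +%Y%m%d-%H%M%S)}"
    printf '%s\n' "trace-${ts}.txt" "report-${ts}.html" "timeline-${ts}.html"
}
```

In practice the same effect comes from setting ts="$(date +%Y%m%d-%H%M%S)" before the nextflow run command and passing "trace-${ts}.txt", "report-${ts}.html" and "timeline-${ts}.html" to the three -with-* options.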
Samplesheet (--samplesheet): CSV with columns:
- sample - unique sample ID
- fastq_1 - path to R1 FASTQ.gz
- fastq_2 - path to R2 FASTQ.gz
Example: tests/samplesheet.test.csv
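A quick sanity check of a samplesheet before launching a run could look like the sketch below. It assumes the file starts with the header row sample,fastq_1,fastq_2 in exactly that order, which this README does not state explicitly. The function name check_samplesheet is hypothetical and is not shipped with the repository.

```shell
# Hypothetical checker: expects the header "sample,fastq_1,fastq_2" (assumed),
# three non-empty fields per data row, and unique sample IDs.
# Prints one message per problem and returns non-zero if any were found.
check_samplesheet() {
    awk -F',' '
        { sub(/\r$/, "") }                     # tolerate Windows line endings
        NR == 1 {
            if ($0 != "sample,fastq_1,fastq_2") { print "bad header: " $0; bad = 1 }
            next
        }
        $0 == "" { next }                      # skip blank lines
        NF != 3 || $1 == "" || $2 == "" || $3 == "" {
            print "line " NR ": expected 3 non-empty fields"; bad = 1; next
        }
        seen[$1]++ { print "line " NR ": duplicate sample " $1; bad = 1 }
        END {
            if (NR == 0) { print "empty file"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

Usage: check_samplesheet tests/samplesheet.test.csv && echo "samplesheet looks OK". The check only inspects the text of the file; it does not test whether the FASTQ paths exist.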
Design file (--design): TSV with columns:
- sample
- condition (e.g. WT/KO)
Example: tests/design.test.tsv
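Since the design file assigns a condition to each sample, every sample ID in the samplesheet should also appear in the design file. The sketch below cross-checks the two files under the assumption that both start with a header row and that the sample ID is the first column in each. The function name check_design_covers_samples is hypothetical.

```shell
# Hypothetical cross-check: every sample ID in the samplesheet (CSV, column 1)
# must also appear in the design file (TSV, column 1).
# Header rows are assumed and skipped. Returns non-zero if any ID is missing.
check_design_covers_samples() {
    local samplesheet="$1"
    local design="$2"
    awk '
        { sub(/\r$/, "") }                     # tolerate Windows line endings
        FNR == 1 { next }                      # skip the header row of each file
        FILENAME == ARGV[1] {                  # first file: the design TSV
            split($0, f, "\t")
            if (f[1] != "") in_design[f[1]] = 1
            next
        }
        {                                      # second file: the samplesheet CSV
            split($0, f, ",")
            if (f[1] != "" && !(f[1] in in_design)) {
                print "missing from design: " f[1]; bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$design" "$samplesheet"
}
```

Usage: check_design_covers_samples tests/samplesheet.test.csv tests/design.test.tsv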
- --genome_fasta - reference genome FASTA
- --genome_gtf - annotation GTF
Test reference in: tests/ref/
Output folder: --outdir (example: results_test/)
Key outputs:
- counts.tsv - gene counts (featureCounts)
- qc/multiqc_report.html - aggregated QC report
- deseq2_results.tsv - DESeq2 differential expression results
- deseq2_ma_plot.png - MA plot
- deseq2_pca_plot.png - PCA plot
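After a test run, a short check that the key outputs exist and are non-empty can confirm the workflow wiring. The sketch below assumes the paths listed above are relative to the --outdir folder, which is an assumption about the layout. The function name check_outputs is hypothetical.

```shell
# Hypothetical post-run check: report any key output that is missing or empty.
# Paths are assumed to be relative to the --outdir folder.
check_outputs() {
    local outdir="${1:-results_test}"
    local missing=0
    local f
    for f in counts.tsv qc/multiqc_report.html deseq2_results.tsv \
             deseq2_ma_plot.png deseq2_pca_plot.png; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${outdir}/${f}" ]; then
            echo "missing or empty: ${outdir}/${f}"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: check_outputs results_test && echo "all key outputs present"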
Repository layout:
- main.nf - Nextflow pipeline (DSL2)
- nextflow.config - pipeline configuration
- bin/deseq2.R - DESeq2 script
- containers/ - container build context(s)
- tests/ - test samplesheet/design/reference and helper scripts
- The test reference and test FASTQs are intentionally small; they are for validating the workflow wiring, not biological interpretation.
- For real datasets, update references, resources (cpus/memory), and output locations accordingly.
