Workflow Benchmark

This repository starts from a published GEO count matrix, so the benchmark below focuses on differential expression and reproducibility workflow quality, not on FASTQ-level alignment or quantification.

Reference workflows

DESeq2 vignette: recommends effect-size shrinkage for ranking and visualization.
nf-core/rnaseq: exemplifies a mature end-to-end RNA-seq workflow, especially standardized QC and reproducible execution from raw reads.
targets user manual: exemplifies dependency-aware skipping and pipeline orchestration.
workflowr: exemplifies research provenance via git-aware reporting and session/environment capture.

Current alignment

DE effect estimation: inference uses the standard (unshrunken) DESeq2 results, while apeglm-shrunken log2 fold changes are used for ranking, MA plots, and volcano plots.
Robustness: balanced-subset analysis is benchmarked against the full QC-passed cohort, with overlap and concordance written to results/tables/analysis_summary.csv.
Reproducibility: renv-pinned environment, deterministic seeds, GitHub Actions rebuilds, and explicit git/session provenance in results/session_info.txt.
Artifact validation: tracked tables are rebuilt in CI and compared against committed results; key figures are checked for successful regeneration.
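
The robustness comparison above can be illustrated with a minimal sketch. The repository's own metric definitions are not stated here, so this example assumes that "overlap" is the Jaccard index of the significant gene sets and that "concordance" is the fraction of shared significant genes whose log2 fold change has the same sign in both analyses. The gene names, the 0.05 threshold, and all function names are hypothetical.

```python
# Hypothetical sketch: compare DE calls from the balanced subset with the full cohort.
# Metric definitions, threshold, and toy values are illustrative assumptions.

def significant(results, alpha=0.05):
    """Return the set of genes whose adjusted p-value is below alpha."""
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if padj is not None and padj < alpha
    }


def jaccard_overlap(a, b):
    """Jaccard index of two gene sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sign_concordance(results_a, results_b, genes):
    """Fraction of the given genes with the same fold-change sign in both analyses."""
    genes = [g for g in genes if g in results_a and g in results_b]
    if not genes:
        return 0.0
    same = sum(1 for g in genes if (results_a[g][0] > 0) == (results_b[g][0] > 0))
    return same / len(genes)


# Toy inputs: gene -> (log2 fold change, adjusted p-value)
subset = {"G1": (2.1, 0.001), "G2": (-1.4, 0.01), "G3": (0.3, 0.60), "G4": (1.0, 0.04)}
full = {"G1": (1.8, 0.0005), "G2": (-1.1, 0.02), "G3": (0.5, 0.03), "G4": (-0.2, 0.70)}

sig_subset = significant(subset)
sig_full = significant(full)
shared = sig_subset & sig_full

overlap = jaccard_overlap(sig_subset, sig_full)
concordance = sign_concordance(subset, full, shared)
print(f"overlap={overlap:.2f} concordance={concordance:.2f}")
```

Restricting sign concordance to genes significant in both analyses keeps the metric from being dominated by near-zero effects, where the sign is mostly noise. That restriction is a design choice of this sketch, not a documented property of the repository.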

Remaining gaps relative to broader workflows

Upstream RNA-seq processing: unlike nf-core/rnaseq, this repo does not perform raw-read QC, alignment, quantification, or MultiQC because the starting point is the GEO count matrix.
Pipeline engine: the workflow remains a sequential R-script orchestrator rather than a declarative DAG of the kind targets provides, so every step reruns on each build.
Model complexity: the primary DE model remains ~ condition; covariates such as age, sex, viral load, or inferred cell composition are not yet modeled explicitly.
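
To make the pipeline-engine gap concrete, the following is a minimal sketch of dependency-aware skipping, the behavior a declarative DAG engine adds over a sequential script. It is not the targets API and not this repository's code: the `run_pipeline` function, the version strings, and the toy targets are all hypothetical, and the sketch assumes targets are declared in dependency order.

```python
import hashlib


def digest(*parts):
    """Stable hash of the repr of each part, used as a cache key."""
    h = hashlib.sha256()
    for part in parts:
        h.update(repr(part).encode("utf-8"))
        h.update(b"\x00")
    return h.hexdigest()


def run_pipeline(targets, cache):
    """Build targets in declaration order, skipping those that are up to date.

    targets: dict of name -> (version, dependency names, function),
             assumed to be listed in dependency order
    cache:   dict of name -> (key, value), updated in place
    Returns the names of the targets that were actually rebuilt.
    """
    rebuilt = []
    for name, (version, deps, func) in targets.items():
        inputs = [cache[d][1] for d in deps]
        key = digest(version, *inputs)
        if name in cache and cache[name][0] == key:
            continue  # same step version and same upstream values: skip
        cache[name] = (key, func(*inputs))
        rebuilt.append(name)
    return rebuilt


# Toy three-step pipeline (hypothetical step names and values)
targets = {
    "counts": ("v1", [], lambda: [10, 0, 5, 3]),
    "filtered": ("v1", ["counts"], lambda counts: [c for c in counts if c > 0]),
    "total": ("v1", ["filtered"], lambda filtered: sum(filtered)),
}
cache = {}

first = run_pipeline(targets, cache)   # everything is built
second = run_pipeline(targets, cache)  # nothing has changed, so nothing reruns

# Changing one step invalidates that step and whatever depends on its output.
targets["filtered"] = ("v2", ["counts"], lambda counts: [c for c in counts if c > 3])
third = run_pipeline(targets, cache)
print(first, second, third)
```

In this sketch a step is identified by a manually bumped version string. A real engine would typically derive that identity from the step's code and inputs automatically, which is part of what moving to a tool such as targets would provide.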

Scope and rationale

The codebase is intentionally small and reviewable for a focused secondary analysis.
The robustness layer addresses the main biological risk introduced by the balanced-subset design without requiring a full re-architecture.
The remaining gaps are explicit, documented, and suitable targets for future extension.
