Deep dive into all 8 RNA-seq analysis pipelines.
| ID | Name | Type | Speed | Accuracy | Memory |
|---|---|---|---|---|---|
| 1 | STAR-RSEM-DESeq2 | Alignment | ⚫⚫⚪⚪⚪ | ⚫⚫⚫⚫⚫ | High |
| 2 | HISAT2-StringTie-Ballgown | Alignment | ⚫⚫⚫⚪⚪ | ⚫⚫⚫⚫⚪ | Medium |
| 3 | Salmon-edgeR | Pseudo-align | ⚫⚫⚫⚫⚫ | ⚫⚫⚫⚫⚪ | Low |
| 4 | Kallisto-Sleuth | Pseudo-align | ⚫⚫⚫⚫⚫ | ⚫⚫⚫⚪⚪ | Low |
| 5 | STAR-HTSeq-limma | Alignment | ⚫⚫⚪⚪⚪ | ⚫⚫⚫⚫⚪ | High |
| 6 | STAR-featureCounts-NOISeq | Alignment | ⚫⚫⚪⚪⚪ | ⚫⚫⚫⚪⚪ | High |
| 7 | Bowtie2-RSEM-EBSeq | Alignment | ⚫⚫⚪⚪⚪ | ⚫⚫⚫⚪⚪ | Medium |
| 8 | HISAT2-Cufflinks-Cuffdiff | Alignment | ⚫⚪⚪⚪⚪ | ⚫⚫⚪⚪⚪ | Medium |
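
The ratings in the table can be turned into a small lookup for scripting. The sketch below is illustrative and not part of raptor: it copies the Speed and Accuracy dot counts and the Memory class from the table into a here-document and lists the pipelines that meet a minimum rating on both. The names `pipeline_table` and `filter_pipelines` are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: ratings are the number of filled dots (1-5) copied from
# the comparison table. pipeline_table and filter_pipelines are hypothetical
# helpers, not raptor commands.

pipeline_table() {
  # id|name|type|speed|accuracy|memory
  cat <<'EOF'
1|STAR-RSEM-DESeq2|Alignment|2|5|High
2|HISAT2-StringTie-Ballgown|Alignment|3|4|Medium
3|Salmon-edgeR|Pseudo-align|5|4|Low
4|Kallisto-Sleuth|Pseudo-align|5|3|Low
5|STAR-HTSeq-limma|Alignment|2|4|High
6|STAR-featureCounts-NOISeq|Alignment|2|3|High
7|Bowtie2-RSEM-EBSeq|Alignment|2|3|Medium
8|HISAT2-Cufflinks-Cuffdiff|Alignment|1|2|Medium
EOF
}

# Print "id name" for every pipeline with speed >= $1 and accuracy >= $2.
filter_pipelines() {
  local min_speed=$1 min_accuracy=$2
  pipeline_table | awk -F'|' -v s="$min_speed" -v a="$min_accuracy" \
    '$4 >= s && $5 >= a { print $1, $2 }'
}

filter_pipelines 3 4
```

With the ratings above, `filter_pipelines 3 4` lists pipelines 2 and 3.
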

Pipeline 1: STAR-RSEM-DESeq2 (Gold Standard - Highest Accuracy)

Components:
- Alignment: STAR (splice-aware)
- Quantification: RSEM (EM algorithm)
- Statistics: DESeq2 (negative binomial)

Best for:
- Publication-quality results
- Small sample sizes (n<6)
- Complex experimental designs
- When accuracy is paramount

Performance:
- Runtime: ~2-6 hours (typical dataset)
- Memory: 32-48 GB
- Accuracy: 95%
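
The Pipeline 1 command that follows takes a FASTQ directory, a STAR index, and a GTF annotation. A minimal pre-flight check, sketched under the assumption that you want to confirm those paths exist before committing to a multi-hour run, might look like this. `check_inputs` is a hypothetical helper, not a raptor feature, and it only tests for existence, not for valid contents.

```bash
#!/usr/bin/env bash
# Illustrative pre-flight check; check_inputs is a hypothetical helper,
# not part of raptor. It only verifies that the paths exist.

check_inputs() {
  local data_dir=$1 reference=$2 annotation=$3 status=0
  [ -d "$data_dir" ]   || { echo "missing FASTQ directory: $data_dir" >&2; status=1; }
  [ -e "$reference" ]  || { echo "missing STAR index: $reference" >&2; status=1; }
  [ -f "$annotation" ] || { echo "missing annotation file: $annotation" >&2; status=1; }
  return "$status"
}

# Example: only start the run when every input is present.
if check_inputs fastq/ /path/to/star_index genes.gtf; then
  echo "inputs look complete; ready to run pipeline 1"
else
  echo "fix the paths above before running pipeline 1" >&2
fi
```
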

Usage:

```bash
raptor run --pipeline 1 \
  --data fastq/ \
  --reference /path/to/star_index \
  --annotation genes.gtf \
  --output pipeline1_results/
```

Pipeline 2: HISAT2-StringTie-Ballgown (Transcript Assembly & Novel Discovery)

Components:
- Alignment: HISAT2
- Assembly: StringTie
- Statistics: Ballgown

Best for:
- Novel transcript discovery
- Isoform-level analysis
- Non-model organisms
- When the reference is incomplete

Performance:
- Runtime: ~1-4 hours
- Memory: 16-24 GB
- Accuracy: 88%

Pipeline 3: Salmon-edgeR (Best Balance - Fast & Accurate)

Components:
- Quantification: Salmon (quasi-mapping)
- Statistics: edgeR (quasi-likelihood)

Best for:
- Most RNA-seq experiments
- Large datasets (>20 samples)
- Quick turnaround needed
- Good balance of all metrics

Performance:
- Runtime: ~0.5-2 hours
- Memory: 8-16 GB
- Accuracy: 90%

Usage:

```bash
raptor run --pipeline 3 \
  --data fastq/ \
  --transcriptome /path/to/salmon_index \
  --output pipeline3_results/
```

Pipeline 4: Kallisto-Sleuth (Ultra-Fast - Large Studies)

Components:
- Quantification: Kallisto
- Statistics: Sleuth (bootstrap-based)

Best for:
- Very large datasets (>50 samples)
- Exploratory analysis
- Minimal resources
- Speed is critical

Performance:
- Runtime: ~0.3-1 hour
- Memory: 4-8 GB
- Accuracy: 88%

Pipeline 5: STAR-HTSeq-limma (Flexible Modeling - Complex Designs)

Components:
- Alignment: STAR
- Counting: HTSeq
- Statistics: limma-voom

Best for:
- Complex experimental designs
- Multi-factor analysis
- Batch correction needed
- Repeated measures

Performance:
- Runtime: ~2-7 hours
- Memory: 32-40 GB
- Accuracy: 92%

Pipeline 6: STAR-featureCounts-NOISeq (Non-Parametric - Small Samples)

Components:
- Alignment: STAR
- Counting: featureCounts
- Statistics: NOISeq

Best for:
- Very small samples (n=2-3)
- No replicates
- Non-normal distributions

Performance:
- Runtime: ~2-7 hours
- Memory: 32-36 GB
- Accuracy: 85%

Pipeline 7: Bowtie2-RSEM-EBSeq (Memory-Efficient Alternative)

Components:
- Alignment: Bowtie2
- Quantification: RSEM
- Statistics: EBSeq

Best for:
- Moderate resource environments
- Isoform-level analysis
- Two-condition comparisons

Performance:
- Runtime: ~3-8 hours
- Memory: 16-24 GB
- Accuracy: 87%

Pipeline 8: HISAT2-Cufflinks-Cuffdiff (Legacy Pipeline)

Components:
- Alignment: HISAT2
- Assembly: Cufflinks
- Statistics: Cuffdiff

Best for:
- Reproducing legacy analyses
- When the Cufflinks ecosystem is required

Performance:
- Runtime: ~4-12 hours
- Memory: 20-32 GB
- Accuracy: 82%

Note: Newer methods are preferred for new projects.
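
The runtime, memory, and accuracy figures quoted for each pipeline can be combined into a simple budget filter. The sketch below is illustrative and not part of raptor: it takes the upper end of each quoted Memory and Runtime range as a worst case and lists the pipelines that fit a given budget, highest quoted accuracy first. `pipeline_resources` and `pipelines_within_budget` are hypothetical names.

```bash
#!/usr/bin/env bash
# Illustrative only: figures are the upper ends of the Memory and Runtime
# ranges quoted for each pipeline, plus the quoted Accuracy.
# pipeline_resources and pipelines_within_budget are hypothetical helpers,
# not raptor commands.

pipeline_resources() {
  # id|max_memory_gb|max_runtime_h|accuracy_pct
  cat <<'EOF'
1|48|6|95
2|24|4|88
3|16|2|90
4|8|1|88
5|40|7|92
6|36|7|85
7|24|8|87
8|32|12|82
EOF
}

# List pipelines whose worst-case memory and runtime fit the given budget,
# highest quoted accuracy first (ties broken by pipeline ID).
pipelines_within_budget() {
  local mem_gb=$1 hours=$2
  pipeline_resources \
    | awk -F'|' -v m="$mem_gb" -v h="$hours" '$2 <= m && $3 <= h' \
    | sort -t'|' -k4,4nr -k1,1n \
    | awk -F'|' '{ printf "pipeline %s: <=%s GB, <=%s h, %s%%\n", $1, $2, $3, $4 }'
}

pipelines_within_budget 24 4
```

For a 24 GB, 4-hour budget this lists pipeline 3 first, followed by pipelines 2 and 4.
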

To choose a pipeline, use this decision guide:
- Need highest accuracy? → Pipeline 1
- Large dataset? → Pipeline 3 or 4
- Novel transcripts? → Pipeline 2
- Complex design? → Pipeline 5
- Small samples? → Pipeline 6
- Not sure? → Use `raptor profile`
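
The decision guide can also be written as a small lookup for use in scripts. This is a sketch, not a raptor feature: `recommend_pipeline` and its keyword arguments are made up for illustration, and the mapping simply mirrors the guide.

```bash
#!/usr/bin/env bash
# Illustrative only: recommend_pipeline is a hypothetical helper that mirrors
# the decision guide; the keyword names are invented for this sketch.

recommend_pipeline() {
  case "$1" in
    accuracy)        echo "1" ;;
    large-dataset)   echo "3 4" ;;
    novel)           echo "2" ;;
    complex-design)  echo "5" ;;
    small-samples)   echo "6" ;;
    *)               echo "unsure: run 'raptor profile'" ;;
  esac
}

# Print the suggested pipeline IDs for a few example needs.
for need in accuracy large-dataset small-samples unknown; do
  printf '%-14s -> %s\n' "$need" "$(recommend_pipeline "$need")"
done
```

Any keyword the function does not recognise falls through to the `raptor profile` suggestion, matching the last entry of the guide.
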
See configuration details in config/pipelines.yaml