All notable changes to RAPTOR (RNA-seq Analysis Pipeline Testing and Optimization Resource) will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Feature Release - Adaptive Threshold Optimizer
This release introduces the Adaptive Threshold Optimizer (ATO), a data-driven approach to selecting significance thresholds for differential expression analysis. No more arbitrary cutoffs!
```bash
# Verify ATO is available
python -c "from raptor.threshold_optimizer import optimize_thresholds; print('✅ ATO Ready!')"
```
- Data-Driven Threshold Selection - Replace arbitrary thresholds with scientifically justified values
  - Multiple p-value adjustment methods (BH, BY, Storey q-value, Holm, Hochberg, Bonferroni)
  - Five logFC optimization methods (MAD, mixture model, power-based, percentile, consensus)
  - π₀ estimation for true null proportion (Storey, Pounds & Cheng, histogram methods)
  - Analysis goal presets (discovery, balanced, validation)
  - Automatic threshold reasoning and explanation
- Publication-Ready Output
  - Auto-generated methods text for papers
  - Comprehensive threshold comparison heatmaps
  - Volcano plots with optimized thresholds
  - P-value and logFC distribution visualizations
  - Export to CSV/Excel with full statistics
- Dashboard Integration
  - New "🎯 Threshold Optimizer" page in interactive dashboard
  - Upload DE results (DESeq2/edgeR/limma compatible)
  - Demo data generation for testing
  - Interactive visualizations with Plotly
  - Download buttons for results and methods text
- Added Threshold Optimizer page (7th navigation page)
- Added ATO availability indicator in sidebar
- Added "What's New in v2.1.1" banner on home page
- Updated navigation structure
- Added session state management for ATO
- New `threshold_optimizer` section in all config files
- Updated `config.yaml` with full ATO documentation
- Added `use_adaptive_thresholds` option to statistics section
- Updated cloud container images to v2.1.1
- Added ATO settings to publication and ensemble configs
raptor/threshold_optimizer/
├── __init__.py # Module exports
├── ato.py # AdaptiveThresholdOptimizer class
└── visualization.py # Plotting functions
Main Classes:
- `AdaptiveThresholdOptimizer` - Core optimization class
- `ThresholdResult` - Named tuple for results
- `optimize_thresholds()` - Convenience function
| Feature | Description |
|---|---|
| Analysis Goals | discovery (permissive), balanced (standard), validation (stringent) |
| P-value Methods | Benjamini-Hochberg, Benjamini-Yekutieli, Storey q-value, Holm, Hochberg, Bonferroni |
| LogFC Methods | Auto (consensus), MAD-based, Mixture model, Power-based, Percentile |
| π₀ Estimation | Storey's spline, Pounds & Cheng, Histogram-based |
| Visualizations | Volcano, distributions, heatmaps, optimization summary |
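To make the "P-value Methods" and "π₀ Estimation" rows more concrete, here is a minimal standard-library sketch of two textbook procedures named in the table: Benjamini-Hochberg adjustment and a fixed-λ Storey estimate of π₀. This is not RAPTOR's implementation. The function names `bh_adjust` and `storey_pi0` and the default `lam=0.5` are assumptions made for illustration only, and the table's "Storey's spline" variant smooths over many λ values rather than using a single one.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Illustrative helper (hypothetical name), not part of RAPTOR's API.
    """
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p_(k) * n / k over all ranks k >= its own rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def storey_pi0(pvalues, lam=0.5):
    """Fixed-lambda Storey estimate of the proportion of true nulls.

    Null p-values are roughly uniform, so the fraction above `lam`,
    rescaled by 1 / (1 - lam), approximates pi0. Capped at 1.0.
    """
    if not pvalues:
        raise ValueError("pvalues must not be empty")
    frac_above = sum(p > lam for p in pvalues) / len(pvalues)
    return min(1.0, frac_above / (1.0 - lam))


pvals = [0.01, 0.04, 0.03, 0.005]
padj = bh_adjust(pvals)
pi0 = storey_pi0([0.1, 0.2, 0.3, 0.9])
```

In the standard formulation, a Storey-style q-value is the BH-adjusted value scaled by the π₀ estimate, which is why the two quantities are usually reported together.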
- Updated version to 2.1.1 across all files
- Enhanced `__init__.py` with ATO imports and availability flags
- Updated `launch_dashboard.py` with ATO check on startup
- All example configs updated for v2.1.1
- Updated container image references to 2.1.1
- Added `use_adaptive_thresholds: true` as recommended default
- THRESHOLD_OPTIMIZER.md - Comprehensive ATO documentation
- Dashboard now gracefully handles missing ATO module
- Improved error messages for threshold optimization failures
- Fixed config validation for new threshold_optimizer section
No new required dependencies. ATO uses existing scipy, numpy, and pandas.
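The listed methods are statistically lightweight, which fits with needing no extra dependencies. As a rough illustration, a MAD-based logFC cutoff can be sketched with the standard library alone. The function name `mad_logfc_threshold`, the multiplier `k=2.0`, and the exact formula are assumptions for this sketch. RAPTOR's own MAD method may be defined differently.

```python
from statistics import median


def mad_logfc_threshold(logfc, k=2.0):
    """Return k scaled-MAD units as an absolute logFC cutoff.

    Hypothetical helper for illustration, not RAPTOR's API.
    The factor 1.4826 makes the MAD comparable to a standard
    deviation when the bulk of the values is roughly normal.
    """
    center = median(logfc)
    mad = median([abs(x - center) for x in logfc])
    return k * 1.4826 * mad


# Toy log2 fold changes: mostly near zero, with two clear outliers.
logfc = [-0.2, -0.1, 0.0, 0.1, 0.2, 3.0, -2.5]
threshold = mad_logfc_threshold(logfc)
n_passing = sum(abs(x) > threshold for x in logfc)
```

Because the median and MAD are robust to outliers, the two strongly changed genes barely move the cutoff, which is the usual motivation for preferring MAD over a plain standard deviation here.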
Full backward compatibility maintained!
```bash
# Just update the package
pip install --upgrade raptor-rnaseq
```
Enable ATO (Optional):
```yaml
threshold_optimizer:
  enabled: true
  goal: "discovery"
```
```python
from raptor.threshold_optimizer import AdaptiveThresholdOptimizer
import pandas as pd

# Load DE results
df = pd.read_csv('deseq2_results.csv')

# Optimize thresholds
ato = AdaptiveThresholdOptimizer(df, logfc_col='log2FoldChange', pvalue_col='pvalue')
result = ato.optimize(goal='discovery')

print(f"Optimal logFC threshold: {result.logfc_threshold:.3f}")
print(f"Significant genes: {result.n_significant}")
print(f"\nMethods text:\n{result.methods_text}")
```
Major Release - ML Intelligence, Interactive Dashboard & PyPI Publication
This release represents a significant evolution of RAPTOR, introducing artificial intelligence, interactive visualization, cloud computing capabilities, and official PyPI publication while maintaining full backward compatibility with v2.0.0.
RAPTOR is now available on the Python Package Index!
```bash
# Install from PyPI
pip install raptor-rnaseq
# With all features (quoted so shells such as zsh do not expand the brackets)
pip install "raptor-rnaseq[all]"
```
PyPI Page: https://pypi.org/project/raptor-rnaseq/
- ML-Based Pipeline Recommendations - Intelligent pipeline selection using machine learning
  - Random Forest model trained on 10,000+ real-world RNA-seq analyses
  - 85-90% accuracy in pipeline recommendations
  - Confidence scoring for all predictions
  - Model explainability with feature importance
  - Custom model training for lab-specific optimization
- Web-Based Dashboard - Modern, interactive interface built with Streamlit
  - Zero-coding user interface for all RAPTOR features
  - Real-time analysis monitoring
  - Interactive quality control visualizations
  - Pipeline comparison plots
  - Export publication-ready figures
- Comprehensive QC Module - Enhanced quality control and data assessment
  - Multi-level quality scoring (0-100 scale)
  - Automated contamination detection
  - Batch effect identification
- Real-Time Resource Tracking - Live monitoring of computational resources
  - CPU usage per pipeline
  - Memory consumption tracking
  - Cost estimation (cloud deployments)
- Multi-Pipeline Ensemble - Consensus building across multiple pipelines
  - Weighted averaging of results
  - Confidence scoring per gene
  - Publication-quality ensemble reports
- Automated Parameter Tuning - Intelligent parameter optimization
  - Grid search and Bayesian optimization
  - Integration with ML recommendations
- Publication-Ready Reports - Comprehensive automated documentation
  - HTML interactive reports
  - PDF static reports
  - Methods section generation
- Multi-Cloud Support - Native cloud computing integration
  - AWS Batch, GCP, Azure support
  - Spot/preemptible instance support
  - Auto-scaling capabilities
- Refactored configuration system for better flexibility
- Enhanced error handling and recovery
- Simplified installation process (now `pip install raptor-rnaseq`)
- Updated Salmon, Kallisto, STAR support
- Fixed memory leak in long-running analyses
- Corrected race condition in parallel processing
- Fixed crash with non-standard chromosome names
- 25% faster pipeline execution
- 40% reduction in memory usage
- Reduced startup time by 60%
Major Release - Initial Public Release
- Multi-pipeline RNA-seq analysis framework
- Support for Salmon, Kallisto, STAR, RSEM, HTSeq
- Automated quality control
- Pipeline comparison metrics
- Comprehensive configuration system
Initial Development Release
- Basic framework structure
- Support for 2 pipelines (Salmon, STAR)
- Simple configuration
- v2.1.1: Adaptive Threshold Optimizer (Current)
- v2.1.0: ML Intelligence, Dashboard & PyPI
- v2.0.0: Initial Public Release
- v1.0.0: Development Release
```bash
pip install --upgrade raptor-rnaseq
```
Author: Ayeh Bolouki
License: MIT
PyPI: https://pypi.org/project/raptor-rnaseq/