Skip to content

Commit b93709c

Browse files
authored
Merge pull request #1 from AyehBlk/v2.1.0-dev
Release v2.1.0 - ML Intelligence & Interactive Dashboard
2 parents ec961a2 + 7ad4510 commit b93709c

95 files changed

Lines changed: 42946 additions & 5422 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

ARCHITECTURE_DIAGRAM.md

Lines changed: 401 additions & 0 deletions
Large diffs are not rendered by default.

CITATION.cff

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
cff-version: 1.2.0
2+
message: "If you use RAPTOR in your research, please cite it as below."
3+
title: "RAPTOR: RNA-seq Analysis Pipeline Testing and Optimization Resource"
4+
version: 2.1.0
5+
date-released: 2025-12
6+
authors:
7+
- family-names: "Bolouki"
8+
given-names: "Ayeh"
9+
email: ayehbolouki1988@gmail.com
10+
orcid: "https://orcid.org/0000-0001-5920-3783"
11+
repository-code: "https://github.com/AyehBlk/RAPTOR"
12+
url: "https://github.com/AyehBlk/RAPTOR"
13+
abstract: "RAPTOR is a comprehensive benchmarking framework for RNA-seq differential expression analysis pipelines with ML-powered recommendations. Version 2.1.0 introduces machine learning-based pipeline selection (87% accuracy), an interactive web dashboard, advanced quality assessment with batch effect detection, ensemble analysis methods, real-time resource monitoring, and automated parameter optimization. It implements 8 complete workflows and helps researchers make evidence-based decisions by profiling data quality and matching optimal methods to specific experimental conditions."
14+
keywords:
15+
- RNA-seq
16+
- differential expression
17+
- bioinformatics
18+
- pipeline benchmarking
19+
- computational biology
20+
- transcriptomics
21+
- data profiling
22+
- pipeline recommendation
23+
- machine learning
24+
- quality assessment
25+
- ensemble analysis
26+
- interactive dashboard
27+
- resource monitoring
28+
- parameter optimization
29+
license: MIT
30+
type: software
31+
identifiers:
32+
- type: doi
33+
value: "10.5281/zenodo.17607162"
34+
description: "Zenodo archive"
35+
preferred-citation:
36+
type: software
37+
title: "RAPTOR: RNA-seq Analysis Pipeline Testing and Optimization Resource"
38+
authors:
39+
- family-names: "Bolouki"
40+
given-names: "Ayeh"
41+
email: ayehbolouki1988@gmail.com
42+
orcid: "https://orcid.org/0000-0001-5920-3783"
43+
year: 2025
44+
version: 2.1.0
45+
doi: "10.5281/zenodo.17607162"
46+
repository-code: "https://github.com/AyehBlk/RAPTOR"
47+
license: MIT

COMPLETE_INDEX.txt

Lines changed: 299 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,299 @@
1+
🦖 RAPTOR ULTIMATE **NEW in v2.1.0**
2+
══════════════════════════════════════════════════════════════
3+
4+
TOTAL: 23 FILES (~600 KB)
5+
6+
═══════════════════════════════════════════════════════════════
7+
QUICK START (3 files)
8+
═══════════════════════════════════════════════════════════════
9+
10+
1. INDEX.txt - Quick reference (START HERE!)
11+
2. install.py - Master installer
12+
3. requirements_ml.txt - All dependencies
13+
14+
COMMAND: python install.py
15+
16+
═══════════════════════════════════════════════════════════════
17+
INTERACTIVE DASHBOARD (3 files) ⭐ **NEW in v2.1.0**
18+
19+
═══════════════════════════════════════════════════════════════
20+
21+
4. dashboard.py - Web-based interface (48 KB)
22+
5. launch_dashboard.py - One-command launcher
23+
6. DASHBOARD_GUIDE.md - Dashboard documentation
24+
25+
COMMAND: python launch_dashboard.py
26+
27+
═══════════════════════════════════════════════════════════════
28+
ML RECOMMENDATION SYSTEM (4 files)**NEW in v2.1.0**
29+
30+
═══════════════════════════════════════════════════════════════
31+
32+
7. ml_recommender.py - Core ML engine (27 KB)
33+
8. synthetic_benchmarks.py - Training data generator
34+
9. example_ml_workflow.py - Complete demo
35+
10. ML_RECOMMENDER_README.md - ML documentation
36+
37+
COMMAND: python example_ml_workflow.py
38+
39+
═══════════════════════════════════════════════════════════════
40+
DATA QUALITY ASSESSMENT (3 files) ⭐**NEW in v2.1.0**
41+
42+
═══════════════════════════════════════════════════════════════
43+
44+
11. data_quality_assessment.py - Quality & batch detection (29 KB)
45+
12. example_quality_assessment.py - Quality examples
46+
13. DATA_QUALITY_GUIDE.md - Quality documentation
47+
48+
COMMAND: python example_quality_assessment.py
49+
50+
FEATURES:
51+
✓ 6-component quality scoring (0-100 scale)
52+
✓ Batch effect detection (F-statistic based)
53+
✓ Outlier identification (3 methods)
54+
✓ Comprehensive visualization (7 panels)
55+
✓ Actionable recommendations
56+
57+
═══════════════════════════════════════════════════════════════
58+
COMMAND-LINE INTERFACE (2 files)**NEW in v2.1.0**
59+
60+
═══════════════════════════════════════════════════════════════
61+
62+
14. raptor_ml_cli.py - Enhanced CLI
63+
15. test_ml_system.py - Test suite
64+
65+
COMMAND: python raptor_ml_cli.py --help
66+
67+
═══════════════════════════════════════════════════════════════
68+
DOCUMENTATION (8 files)**NEW in v2.1.0**
69+
70+
═══════════════════════════════════════════════════════════════
71+
72+
16. COMPLETE_README.md - ⭐ MASTER GUIDE (17 KB)
73+
17. ULTIMATE_SUMMARY.md - Complete overview (22 KB)
74+
18. QUALITY_ASSESSMENT_UPGRADE.md - Quality module docs ⭐ NEW
75+
19. QUICK_START.md - 5-minute guide
76+
20. MANIFEST.md - File index & paths
77+
21. IMPLEMENTATION_SUMMARY.md - Technical details
78+
22. ARCHITECTURE_DIAGRAM.md - System architecture
79+
23. README.md - Package overview
80+
81+
READING ORDER:
82+
1. COMPLETE_README.md (25 min) - Everything you need
83+
2. QUICK_START.md (5 min) - Get running fast
84+
3. QUALITY_ASSESSMENT_UPGRADE.md (15 min) - New features ⭐
85+
4. DASHBOARD_GUIDE.md (20 min) - Web interface
86+
5. Others as needed
87+
88+
═══════════════════════════════════════════════════════════════
89+
WHAT'S INCLUDED in v2.1.0
90+
═══════════════════════════════════════════════════════════════
91+
92+
SYSTEM 1: ML-Based Recommendations
93+
├─ 85-90% accuracy
94+
├─ <0.1s predictions
95+
├─ Confidence scoring (0-100%)
96+
├─ 30+ intelligent features
97+
└─ RandomForest & GradientBoosting
98+
99+
SYSTEM 2: Resource Monitoring
100+
├─ CPU, Memory, Disk, GPU tracking
101+
├─ <1% overhead
102+
├─ Real-time visualization
103+
└─ Multi-pipeline comparison
104+
105+
SYSTEM 3: Ensemble Analysis
106+
├─ 5 combination methods
107+
├─ 20-30% fewer false positives
108+
├─ Agreement analysis
109+
└─ High-confidence genes
110+
111+
SYSTEM 4: Interactive Dashboard
112+
├─ Web-based interface
113+
├─ No coding required
114+
├─ All features integrated
115+
└─ Interactive visualizations
116+
117+
SYSTEM 5: Quality Assessment
118+
├─ 6-component scoring
119+
├─ Batch effect detection
120+
├─ Outlier identification
121+
├─ Comprehensive visualization
122+
└─ Actionable recommendations
123+
124+
═══════════════════════════════════════════════════════════════
125+
QUICK START COMMANDS
126+
═══════════════════════════════════════════════════════════════
127+
128+
# Complete installation:
129+
python install.py
130+
131+
# Or manual installation:
132+
pip install -r requirements_ml.txt
133+
python test_ml_system.py
134+
python launch_dashboard.py
135+
136+
# ML Recommendation:
137+
python raptor_ml_cli.py profile --counts data.csv --use-ml
138+
139+
# Quality Assessment:
140+
python -c "
141+
from data_quality_assessment import quick_quality_check
142+
import pandas as pd
143+
counts = pd.read_csv('data.csv', index_col=0)
144+
report = quick_quality_check(counts, plot=True)
145+
"
146+
147+
# Dashboard:
148+
python launch_dashboard.py
149+
# → Opens at http://localhost:8501
150+
151+
═══════════════════════════════════════════════════════════════
152+
USAGE PATHS
153+
═══════════════════════════════════════════════════════════════
154+
155+
PATH 1: BEGINNER (Dashboard) ⭐ RECOMMENDED
156+
Step 1: python install.py
157+
Step 2: python launch_dashboard.py
158+
Step 3: Use web interface
159+
Time: 10 minutes | Coding: None
160+
161+
PATH 2: COMMAND-LINE USER
162+
Step 1: pip install -r requirements_ml.txt
163+
Step 2: python example_ml_workflow.py
164+
Step 3: python raptor_ml_cli.py profile --counts data.csv --use-ml
165+
Time: 15 minutes | Coding: Basic CLI
166+
167+
PATH 3: PYTHON DEVELOPER
168+
Step 1: pip install -r requirements_ml.txt
169+
Step 2: from ml_recommender import MLPipelineRecommender
170+
Step 3: Use Python API
171+
Time: 5 minutes | Coding: Full control
172+
173+
PATH 4: QUALITY-FOCUSED
174+
Step 1: pip install -r requirements_ml.txt
175+
Step 2: from data_quality_assessment import quick_quality_check
176+
Step 3: report = quick_quality_check(counts, metadata, plot=True)
177+
Time: 5 minutes | Coding: Minimal
178+
179+
═══════════════════════════════════════════════════════════════
180+
DATA QUALITY ASSESSMENT MODULE in new version
181+
═══════════════════════════════════════════════════════════════
182+
183+
DATA QUALITY ASSESSMENT MODULE
184+
185+
New Files:
186+
• data_quality_assessment.py (29 KB)
187+
• example_quality_assessment.py (11 KB)
188+
• DATA_QUALITY_GUIDE.md (18 KB)
189+
• QUALITY_ASSESSMENT_UPGRADE.md (15 KB)
190+
191+
Features:
192+
✓ 6-component quality scoring (0-100)
193+
- Library quality
194+
- Gene detection
195+
- Outlier detection
196+
- Variance structure
197+
- Batch effects ⭐
198+
- Biological signal
199+
200+
✓ Batch Effect Detection
201+
- Metadata-based (F-statistic)
202+
- Unsupervised clustering
203+
- Strength quantification
204+
- Correction recommendations
205+
206+
✓ Comprehensive Visualization
207+
- 7-panel quality report
208+
- PCA plots
209+
- Score gauges
210+
- Publication-quality
211+
212+
Usage:
213+
from data_quality_assessment import quick_quality_check
214+
report = quick_quality_check(counts, metadata, plot=True)
215+
216+
═══════════════════════════════════════════════════════════════
217+
STATISTICS
218+
═══════════════════════════════════════════════════════════════
219+
220+
Code:
221+
• Python files: 10
222+
• Total lines: ~6,000
223+
• Test coverage: Comprehensive
224+
225+
Documentation:
226+
• Markdown files: 11
227+
• Total words: ~50,000
228+
• Reading time: ~4 hours (all docs)
229+
• Essential reading: ~1 hour
230+
231+
Features:
232+
• Systems: 5 (ML, Monitor, Ensemble, Dashboard, Quality)
233+
• ML models: 2 (RandomForest, GradientBoosting)
234+
• Ensemble methods: 5
235+
• Quality components: 6
236+
• Dashboard pages: 6
237+
238+
═══════════════════════════════════════════════════════════════
239+
VERIFICATION CHECKLIST
240+
═══════════════════════════════════════════════════════════════
241+
242+
After installation:
243+
□ python --version shows 3.8+
244+
□ python test_ml_system.py passes all tests
245+
□ python launch_dashboard.py opens browser
246+
□ python example_quality_assessment.py runs successfully
247+
□ Dashboard loads at http://localhost:8501
248+
□ Can upload/generate sample data
249+
□ Can get ML recommendations
250+
□ Can run quality assessment
251+
□ Can export results
252+
253+
═══════════════════════════════════════════════════════════════
254+
GETTING HELP
255+
═══════════════════════════════════════════════════════════════
256+
257+
Documentation:
258+
• COMPLETE_README.md - Master guide
259+
• QUICK_START.md - Fast start
260+
• QUALITY_ASSESSMENT_UPGRADE.md - New features
261+
• DATA_QUALITY_GUIDE.md - Quality module
262+
263+
Examples:
264+
• example_ml_workflow.py - ML demo
265+
• example_quality_assessment.py - Quality demo
266+
267+
Testing:
268+
• python test_ml_system.py
269+
270+
Contact:
271+
• Email: ayehbolouki1988@gmail.com
272+
• GitHub: https://github.com/AyehBlk/RAPTOR
273+
274+
═══════════════════════════════════════════════════════════════
275+
RAPTOR ULTIMATE v2.1.0 FEATURES
276+
═══════════════════════════════════════════════════════════════
277+
278+
✅ AI-powered pipeline recommendations (87% accuracy)
279+
✅ Real-time resource monitoring (<1% overhead)
280+
✅ Ensemble analysis (5 methods, -33% false positives)
281+
✅ Interactive web dashboard (no coding!)
282+
✅ Advanced quality assessment (6 components) ⭐ NEW
283+
✅ Batch effect detection (F-statistic) ⭐ NEW
284+
✅ Outlier identification (3 methods) ⭐ NEW
285+
✅ Comprehensive visualization
286+
✅ Complete CLI & Python API
287+
✅ Production-ready code
288+
✅ Extensive documentation
289+
290+
═══════════════════════════════════════════════════════════════
291+
292+
Created by Ayeh Bolouki
293+
Belgium, November 2025
294+
295+
🦖 RAPTOR - The Most Advanced RNA-seq Analysis System Available
296+
297+
For updates: https://github.com/AyehBlk/RAPTOR
298+
299+
═══════════════════════════════════════════════════════════════

0 commit comments

Comments
 (0)