Thank you for your interest in contributing to RAPTOR! This project thrives on community input and we welcome contributions from researchers, bioinformaticians, and developers worldwide.
Found a bug? Please help us fix it!
Before reporting:
- Check if the issue already exists in GitHub Issues
- Make sure you're using the latest version
What to include:
- Clear description of the problem
- Steps to reproduce
- Expected vs actual behavior
- Your environment (OS, Python version, RAPTOR version)
- Error messages or logs
- Example data if possible (or synthetic example)
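When real data cannot be shared, a small synthetic count matrix is often enough to reproduce a bug. The sketch below is one way to build one with the standard library only. The helper names `make_synthetic_counts` and `counts_to_csv`, the gene and sample labels, and the default zero fraction are illustrative assumptions, not part of RAPTOR.

```python
import csv
import io
import random


def make_synthetic_counts(n_genes=5, n_samples=4, zero_fraction=0.3, seed=0):
    """Return (header, rows) for a small genes x samples count matrix.

    Hypothetical helper for bug reports; not a RAPTOR function.
    """
    rng = random.Random(seed)  # fixed seed so maintainers get the same matrix
    header = ["gene_id"] + [f"S{j + 1}" for j in range(n_samples)]
    rows = []
    for i in range(n_genes):
        counts = [
            0 if rng.random() < zero_fraction else rng.randint(1, 1000)
            for _ in range(n_samples)
        ]
        rows.append([f"gene_{i + 1}"] + counts)
    return header, rows


def counts_to_csv(header, rows):
    """Serialize the matrix as CSV text that can be pasted into an issue."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header, rows = make_synthetic_counts()
print(counts_to_csv(header, rows))
```

Stating the seed in the report lets a maintainer regenerate exactly the same matrix.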
Template:
**Bug Description:**
Brief description
**Steps to Reproduce:**
1. Step one
2. Step two
3. Step three
**Expected Behavior:**
What should happen
**Actual Behavior:**
What actually happens
**Environment:**
- OS: Ubuntu 22.04
- Python: 3.10
- RAPTOR: 2.0.0
**Error Message:**
Paste error here
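The environment fields in the template can be collected programmatically. This is a minimal sketch using only the standard library; the distribution name `raptor` passed to `importlib.metadata` is an assumption and may differ from the name the package is actually installed under.

```python
import platform
from importlib import metadata


def collect_environment(package="raptor"):
    """Gather the environment details requested in the bug report template.

    The default distribution name "raptor" is an assumption; adjust it to
    whatever name the package was installed under.
    """
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        package_version = "not installed"
    return {
        "OS": f"{platform.system()} {platform.release()}",
        "Python": platform.python_version(),
        "RAPTOR": package_version,
    }


# Print the fields in the same bullet format as the template.
for key, value in collect_environment().items():
    print(f"- {key}: {value}")
```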
Have an idea to improve RAPTOR?
Good feature requests include:
- Clear description of the feature
- Why it would be useful
- How it relates to RNA-seq analysis
- Example use case
- Any references or papers supporting the idea
Open a GitHub Discussion or an Issue with the `enhancement` label.
Documentation improvements are always welcome!
Areas that need help:
- Clarifying existing documentation
- Adding examples
- Fixing typos
- Adding tutorials
- Translating documentation
- Creating video walkthroughs
How to contribute:
- Fork the repository
- Edit files in the `docs/` folder
- Submit a Pull Request
Want to add a new RNA-seq pipeline to RAPTOR?
Requirements:
- Complete workflow (alignment/quantification + statistics)
- Widely used or novel method with publication
- Reproducible implementation
- Tests demonstrating it works
Process:
- Open an Issue to discuss the pipeline
- Create a new folder in `pipelines/`
- Follow the pipeline template structure
- Add documentation
- Include test data
- Submit a Pull Request
See the Pipeline Development Guide for details.
Ran RAPTOR on your data? Share your results!
What to share:
- Dataset characteristics (size, organism, design)
- Pipelines compared
- Performance results
- Any insights or surprises
- Publication reference if applicable
This helps improve recommendations for the community!
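One possible way to structure a shared result is a small record that mirrors the list above. The `BenchmarkReport` class and all of its field names are hypothetical, and the values in the usage example are placeholders rather than real results.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BenchmarkReport:
    """Hypothetical record of a RAPTOR run; field names are illustrative."""

    organism: str
    n_samples: int
    design: str
    pipelines_compared: list = field(default_factory=list)
    performance: dict = field(default_factory=dict)  # free-form metrics per pipeline
    insights: str = ""
    publication: str = ""  # DOI or citation, if applicable

    def to_json(self):
        """Serialize the report as indented JSON for pasting into a post."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; replace them with your own dataset details.
report = BenchmarkReport(
    organism="<organism>",
    n_samples=6,
    design="<experimental design>",
    pipelines_compared=["<pipeline A>", "<pipeline B>"],
    performance={"<pipeline A>": {"runtime_minutes": None}},
    insights="<anything surprising>",
)
print(report.to_json())
```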
Want to fix an existing issue?
Good first issues:
- Look for the `good first issue` label
- Issues labeled `help wanted`
- Documentation improvements
- Test coverage
Before starting:
- Comment on the issue to claim it
- Ask questions if anything is unclear
- Discuss approach if it's a big change
# 1. Fork and clone
git clone https://github.com/YOUR-USERNAME/RAPTOR.git
cd RAPTOR
# 2. Create development environment
conda env create -f environment_dev.yml
conda activate raptor-dev
# 3. Install in development mode
pip install -e .
# 4. Run tests to verify setup
pytest tests/
Create a branch:
git checkout -b feature/your-feature-name
# or
git checkout -b fix/bug-description
Make your changes:
- Write clean, readable code
- Follow existing code style
- Add comments where helpful
- Update documentation if needed
Test your changes:
# Run all tests
pytest tests/
# Run specific test
pytest tests/test_profiler.py
# Check code style
flake8 raptor/
# Check type hints
mypy raptor/
Commit your changes:
git add .
git commit -m "Add feature: clear description"
Good commit messages:
- Clear and descriptive
- Present tense ("Add feature" not "Added feature")
- Reference issue numbers when applicable
Examples:
- ✅ Add zero-inflation detection to profiler (#42)
- ✅ Fix memory leak in benchmark module
- ✅ Update documentation for profile command
- ❌ fixed stuff
- ❌ updates
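The commit message guidelines can be approximated with a few simple checks. The sketch below is a heuristic illustration only: `check_commit_message` and `references_issue` are hypothetical helpers, not RAPTOR tooling, and the past-tense test (a first word ending in "ed") will misfire on verbs such as "Embed".

```python
import re

# Heuristic patterns; assumptions for illustration, not a RAPTOR tool.
PAST_TENSE_VERB = re.compile(r"^[A-Za-z]+ed$")
ISSUE_REFERENCE = re.compile(r"#\d+")


def check_commit_message(message):
    """Return a list of problems found in a commit message subject line."""
    lines = message.strip().splitlines()
    subject = lines[0] if lines else ""
    words = subject.split()
    problems = []
    if len(words) < 3:
        problems.append("subject is too short to be descriptive")
    if words and PAST_TENSE_VERB.match(words[0]):
        problems.append("use present tense, e.g. 'Add' rather than 'Added'")
    return problems


def references_issue(message):
    """Return True if the message mentions an issue number such as #42."""
    return bool(ISSUE_REFERENCE.search(message))


# Try the checks on three of the examples listed above.
for msg in ["Add zero-inflation detection to profiler (#42)", "fixed stuff", "updates"]:
    print(msg, "->", check_commit_message(msg) or "ok")
```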
Push to your fork:
git push origin feature/your-feature-name
Submit Pull Request:
- Go to GitHub and create a Pull Request
- Fill in the PR template
- Link related issues
- Describe what changed and why
- ✅ Code follows project style
- ✅ Tests pass (`pytest tests/`)
- ✅ Documentation updated if needed
- ✅ No unnecessary files included
- ✅ Commits are clean and logical
- ✅ Branch is up to date with main
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Performance improvement
- [ ] Code refactoring
## Related Issues
Fixes #(issue number)
## Changes Made
- Change 1
- Change 2
- Change 3
## Testing
How did you test this?
## Screenshots (if applicable)
Before/after screenshots
## Checklist
- [ ] Tests pass
- [ ] Documentation updated
- [ ] Code follows style guidelines
- [ ] Commits are clean
- Maintainers will review your PR
- You may be asked to make changes
- Once approved, your PR will be merged
- You'll be added to contributors list! 🎉
Follow PEP 8:
- Use 4 spaces for indentation
- Max line length: 88 characters (Black formatter default)
- Use meaningful variable names
- Add docstrings to functions
Example:
import pandas as pd

def calculate_library_size_cv(counts: pd.DataFrame) -> float:
    """
    Calculate coefficient of variation for library sizes.

    Parameters
    ----------
    counts : pd.DataFrame
        Count matrix (genes × samples)

    Returns
    -------
    float
        Coefficient of variation (std/mean), using the sample standard deviation

    Examples
    --------
    >>> counts = pd.DataFrame({'S1': [100, 200], 'S2': [150, 250]})
    >>> cv = calculate_library_size_cv(counts)
    >>> print(f"{cv:.2f}")
    0.20
    """
    library_sizes = counts.sum(axis=0)  # total counts per sample
    return library_sizes.std() / library_sizes.mean()
Follow Bioconductor style:
- Use `<-` for assignment
- camelCase for function names
- Meaningful variable names
- Roxygen2 documentation
Example:
#' Calculate DEGs using DESeq2
#'
#' @param counts Count matrix
#' @param metadata Sample metadata
#' @return DESeq2 results object
#' @export
runDESeq2Analysis <- function(counts, metadata) {
  dds <- DESeqDataSetFromMatrix(
    countData = counts,
    colData = metadata,
    design = ~ condition
  )
  dds <- DESeq(dds)
  return(results(dds))
}
Best practices:
- Use `#!/bin/bash` shebang
- Quote variables: `"$variable"`
- Check exit codes
- Add comments
Good tests are:
- Independent (can run in any order)
- Repeatable (same result every time)
- Fast (avoid slow operations)
- Clear (easy to understand what's tested)
Example:
import pytest
import pandas as pd
from raptor.profiler import RNAseqDataProfiler

def test_library_size_calculation():
    """Test that library sizes are calculated correctly."""
    # Create test data
    counts = pd.DataFrame({
        'S1': [100, 200, 300],
        'S2': [150, 250, 350]
    })
    # Expected library sizes
    expected = pd.Series([600, 750], index=['S1', 'S2'])
    # Calculate
    profiler = RNAseqDataProfiler(counts)
    result = profiler.calculate_library_sizes()
    # Assert
    pd.testing.assert_series_equal(result, expected)

def test_handles_zero_inflation():
    """Test profiler handles highly zero-inflated data."""
    # Create zero-inflated data
    counts = pd.DataFrame({
        'S1': [0, 0, 0, 100],
        'S2': [0, 0, 0, 150]
    })
    profiler = RNAseqDataProfiler(counts)
    zero_pct = profiler.calculate_zero_percentage()
    assert zero_pct == 75.0  # 6 zeros out of 8 values
# All tests
pytest tests/
# With coverage
pytest --cov=raptor tests/
# Verbose output
pytest -v tests/
# Specific test
pytest tests/test_profiler.py::test_library_size_calculation
# Stop on first failure
pytest -x tests/
Use NumPy style docstrings:
def recommend_pipeline(profile: dict, priority: str = 'balanced') -> dict:
    """
    Recommend optimal pipeline based on data profile.

    This function analyzes data characteristics and matches them to
    pipeline strengths using a scoring system.

    Parameters
    ----------
    profile : dict
        Data profile containing statistical characteristics
    priority : str, optional
        Optimization priority: 'accuracy', 'speed', 'memory', or 'balanced'.
        Default is 'balanced'.

    Returns
    -------
    dict
        Recommendation with structure:
        {
            'pipeline_id': int,
            'pipeline_name': str,
            'score': float,
            'reasoning': list of str
        }

    Raises
    ------
    ValueError
        If priority is not one of the valid options

    Examples
    --------
    >>> profile = {'library_size_cv': 0.3, 'n_samples': 6}
    >>> rec = recommend_pipeline(profile, priority='accuracy')
    >>> print(rec['pipeline_name'])
    STAR-RSEM-DESeq2

    Notes
    -----
    The scoring system weighs different factors based on priority:
    - accuracy: Emphasizes sensitivity and precision
    - speed: Prioritizes fast methods
    - memory: Selects low-memory pipelines
    - balanced: Data-driven weighting

    See Also
    --------
    RNAseqDataProfiler : For generating profiles
    PipelineBenchmark : For validating recommendations
    """
    # Implementation here
    pass
When adding features:
- Update main README.md
- Add to appropriate section
- Update table of contents if needed
- Add example usage
- Update Quick Start if it changes workflow
All contributors will be:
- Listed in AUTHORS.md
- Mentioned in release notes
- Credited in documentation
- Added to Zenodo author list (for DOI)
Major contributions may result in:
- Co-authorship on future papers
- Maintainer status
- Your name in the tool itself
- 💬 GitHub Discussions: For general questions
- 🐛 GitHub Issues: For bugs and feature requests
- 📧 Email: ayehbolouki1988@gmail.com for private matters
- Be respectful and professional
- Stay on topic
- Search before asking (your question may already be answered)
- Provide context and details
- Be patient - maintainers are volunteers
RAPTOR is committed to providing a welcoming, inclusive environment for all contributors regardless of:
- Background or identity
- Experience level
- Geographic location
- Institutional affiliation
Expected behavior:
- Use welcoming and inclusive language
- Respect differing viewpoints
- Accept constructive criticism gracefully
- Focus on what's best for the community
- Show empathy toward others
Unacceptable behavior:
- Harassment or discrimination
- Trolling or inflammatory comments
- Personal or political attacks
- Publishing others' private information
- Unprofessional conduct
Violations may result in:
- Warning
- Temporary ban
- Permanent ban
Report issues to: ayehbolouki1988@gmail.com
- Adding new pipelines
- Improving recommendation algorithm
- Performance optimizations
- Bug fixes
- Documentation improvements
- Additional visualizations
- New metrics
- Extended format support
- Web interface
- Single-cell RNA-seq support
- Machine learning enhancements
- Cloud deployment options
- Interactive dashboard
Every contribution, no matter how small, helps make RAPTOR better for the entire research community. Thank you for being part of this open science initiative!
Let's make science free for everyone around the world! 🦖
By contributing to RAPTOR, you agree that your contributions will be licensed under the MIT License.