Contributing to WorldStrat Ensemble

Thank you for your interest in contributing to the WorldStrat Ensemble project! We are committed to building a robust, high-performance super-resolution pipeline for satellite imagery.



🤝 How to Contribute

Reporting Bugs

Found a bug? Help us fix it by creating a detailed GitHub Issue:

Required Information:

  1. Title: Clear, concise summary (e.g., "CUDA OOM error on batch_size=4")
  2. Description: What happened vs. what you expected
  3. Steps to Reproduce:
    1. Open ENSEMBLE_FINAL_ROBUST.ipynb
    2. Set batch_size = 4 in Cell 2
    3. Run Cell 6 (Model Loading)
    4. Error occurs
    
  4. Error Logs: Full traceback in a code block
  5. Environment:
    • OS: Ubuntu 20.04 / Windows 11 / macOS 13
    • Python: 3.9.7
    • PyTorch: 2.0.1+cu118
    • GPU: NVIDIA T4 (16GB)
    • CUDA: 11.8
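The environment details listed above can be gathered automatically rather than typed by hand. The following is a minimal sketch, not part of the project: `collect_environment` is a hypothetical helper name, and it assumes PyTorch may or may not be installed on the reporting machine.

```python
import platform
from typing import Dict


def collect_environment() -> Dict[str, str]:
    """Gather the environment details requested in a bug report."""
    info = {
        "OS": platform.platform(),
        "Python": platform.python_version(),
    }
    try:
        import torch  # optional: only reported when PyTorch is installed
    except ImportError:
        info["PyTorch"] = "not installed"
        return info

    info["PyTorch"] = str(torch.__version__)
    # torch.version.cuda is None on CPU-only builds.
    info["CUDA"] = torch.version.cuda or "CPU-only build"
    if torch.cuda.is_available():
        info["GPU"] = torch.cuda.get_device_name(0)
    else:
        info["GPU"] = "none detected"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the Environment section of the issue keeps reports consistent across contributors.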

Labels: bug, priority:high (if blocking), needs-investigation

Suggesting Enhancements

Have an idea to improve performance or usability?

Enhancement Proposal Template:

### Problem
Current ensemble uses equal/softmax/proportional weighting, but optimal weights may vary per image.

### Proposed Solution
Implement per-image adaptive weighting based on local image statistics (variance, edge density).

### Expected Benefit
- PSNR improvement: +0.1-0.2 dB
- Minimal performance overhead: <5% slower

### Implementation Plan
1. Add feature extraction module
2. Train lightweight MLP predictor
3. Integrate into ensemble pipeline

### Alternatives Considered
- Fixed learned weights (rejected: no adaptability)
- Attention-based fusion (rejected: too slow)

Labels: enhancement, discussion, and either performance or usability


🛠️ Development Setup

1. Fork and Clone

# Fork the repo on GitHub, then:
git clone https://github.com/YOUR_USERNAME/klymo.git
cd klymo
git remote add upstream https://github.com/Aditya26189/klymo.git

2. Create Virtual Environment

# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Or using conda
conda create -n klymo-dev python=3.9
conda activate klymo-dev

3. Install Development Dependencies

# Core dependencies
pip install -r requirements.txt

# Development tools
pip install flake8 black isort pytest pytest-cov mypy

# Pre-commit hooks (optional but recommended)
pip install pre-commit
pre-commit install

4. Download Test Data

# Small validation set for testing (10 samples, ~50MB)
wget https://example.com/worldstrat-dev-data.zip
unzip worldstrat-dev-data.zip -d test_data/

📝 Coding Standards

Style Guide

We follow PEP 8 with some project-specific conventions:

# ✅ GOOD: Clear variable names, type hints, docstring
from typing import Tuple


def compute_ensemble_weights(
    psnr_model1: float,
    psnr_model2: float,
    temperature: float = 2.0
) -> Tuple[float, float]:
    """
    Compute ensemble weights using softmax with temperature.
    
    Args:
        psnr_model1: Validation PSNR of first model (dB).
        psnr_model2: Validation PSNR of second model (dB).
        temperature: Softmax temperature for smoothing.
    
    Returns:
        Tuple of weights (w1, w2) summing to 1.0.
    """
    psnr_diff = abs(psnr_model1 - psnr_model2)
    # ... implementation

# ❌ BAD: No types, unclear names, missing docstring
def calc(p1, p2, t=2):
    d = abs(p1-p2)
    # ... implementation
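
The good example above elides its body. As an illustration only, here is one way a two-model softmax with temperature could look. This is a sketch under assumptions: the name `softmax_weights_sketch` is hypothetical, and the form `exp(psnr / temperature)` is assumed rather than taken from the project, whose real implementation may scale the inputs differently and so may not produce the same numbers as the project's unit tests expect.

```python
import math
from typing import Tuple


def softmax_weights_sketch(
    psnr_model1: float,
    psnr_model2: float,
    temperature: float = 2.0,
) -> Tuple[float, float]:
    """Two-model softmax over PSNR values (assumed form: exp(psnr / temperature))."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the larger PSNR before exponentiating for numerical stability;
    # this does not change the resulting weights.
    top = max(psnr_model1, psnr_model2)
    e1 = math.exp((psnr_model1 - top) / temperature)
    e2 = math.exp((psnr_model2 - top) / temperature)
    total = e1 + e2
    return e1 / total, e2 / total
```

In this form, a higher temperature pulls both weights toward 0.5, and equal PSNR values give exactly equal weights.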

Formatting

Run before committing:

# Auto-format with black (line length 100)
black --line-length 100 *.py

# Sort imports (black-compatible profile, same line length)
isort --profile black --line-length 100 *.py

# Check style
flake8 *.py --max-line-length=100 --ignore=E203,W503

Type Hints

  • Required for all new functions
  • Use typing module: List, Dict, Optional, Tuple
  • Run mypy to verify:
    mypy WORLDSTRAT_ENSEMBLE_CORRECTED.py --ignore-missing-imports
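
For reference, a small function annotated with the `typing` names listed above might look like the following. It is a sketch for illustration: `best_model` and its arguments are hypothetical and do not come from the project code.

```python
from typing import Dict, List, Optional, Tuple


def best_model(
    scores: Dict[str, List[float]],
    baseline: Optional[float] = None,
) -> Tuple[str, float]:
    """Return the model with the highest mean PSNR and that mean.

    If a baseline PSNR is given, the returned value is the gain over it.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    means = {name: sum(values) / len(values) for name, values in scores.items()}
    name = max(means, key=lambda n: means[n])
    gain = means[name] - (baseline if baseline is not None else 0.0)
    return name, gain
```

`Optional[float] = None` makes the missing-baseline case explicit to both readers and mypy.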

🧪 Testing Requirements

Unit Tests

All new features must include tests:

# tests/test_ensemble.py
import pytest
from WORLDSTRAT_ENSEMBLE_CORRECTED import compute_ensemble_weights

def test_equal_weights_small_diff():
    """Test that equal weights are used when PSNR diff < 0.3 dB."""
    w1, w2 = compute_ensemble_weights(29.5, 29.6, strategy='auto')
    assert abs(w1 - 0.5) < 0.01
    assert abs(w2 - 0.5) < 0.01

def test_softmax_weights_moderate_diff():
    """Test softmax weighting for moderate PSNR difference."""
    w1, w2 = compute_ensemble_weights(29.8, 29.2, strategy='auto')
    assert 0.6 < w1 < 0.7  # Stronger model gets more weight
    assert 0.3 < w2 < 0.4

Run tests:

pytest tests/ -v --cov=. --cov-report=html

Integration Tests

Test full pipeline on small dataset:

# Runs inference on 5 validation images
python tests/test_integration.py

Pass Criteria: PSNR > 28.0 dB, no exceptions, output files valid.
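
The PSNR criterion above can be made concrete with a short, dependency-free sketch. Several things here are assumptions rather than project facts: pixel values are taken to be normalized to [0, 1] (so `max_value` defaults to 1.0), images are passed as flattened sequences, the threshold is applied to the mean over the test images, and the names `psnr` and `passes_threshold` are hypothetical.

```python
import math
from typing import Sequence

PSNR_THRESHOLD_DB = 28.0  # pass criterion stated above


def psnr(
    prediction: Sequence[float],
    target: Sequence[float],
    max_value: float = 1.0,
) -> float:
    """Peak signal-to-noise ratio in dB for two equally sized, flattened images."""
    if len(prediction) != len(target) or not target:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def passes_threshold(psnr_values: Sequence[float]) -> bool:
    """True when the mean PSNR over the test images exceeds the threshold.

    Assumption: the criterion applies to the mean, not to each image.
    """
    return sum(psnr_values) / len(psnr_values) > PSNR_THRESHOLD_DB
```

A real integration test would more likely compute PSNR on tensors; the arithmetic is the same.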


🔄 Pull Request Process

1. Create Feature Branch

git checkout -b feature/add-tta-augmentation
# Or: fix/cuda-oom-batch4, docs/improve-readme, refactor/clean-dataset-class

2. Make Changes

  • Write code following Coding Standards
  • Add tests for new functionality
  • Update documentation (docstrings, README if needed)

3. Test Locally

# Run all checks (same options as in Formatting)
black --check --line-length 100 *.py
flake8 *.py --max-line-length=100 --ignore=E203,W503
pytest tests/
python tests/test_integration.py

4. Commit Changes

See Commit Message Conventions

5. Push and Create PR

git push origin feature/add-tta-augmentation

Then create a PR on GitHub using this template:

## Description
Adds test-time augmentation (8-fold: rotations + flips) to improve ensemble PSNR.

## Changes
- Added `apply_tta()` function in Cell 10
- Updated inference loop to aggregate TTA predictions
- Added `--enable-tta` CLI flag

## Performance Impact
- PSNR improvement: +0.3 dB on validation set
- Inference time: 8x slower (expected)

## Testing
- [x] Passes `pytest tests/`
- [x] Ran integration test on 10 images
- [x] Verified TTA disabled by default (backward compatible)

## Checklist
- [x] Code follows PEP 8
- [x] Added docstrings
- [x] Updated README with TTA usage
- [x] No large files committed

📝 Commit Message Conventions

Use Conventional Commits format:

<type>(<scope>): <subject>

<body>

<footer>

Types

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation only
  • style: Formatting, no code change
  • refactor: Code restructuring
  • perf: Performance improvement
  • test: Adding tests
  • chore: Build, CI, dependencies
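
The header format and type list above lend themselves to an automated check. The following is a minimal sketch, not an official validator: `is_valid_header` is a hypothetical name, only the first line of the message is checked, and the scope is treated as optional, as the Conventional Commits specification allows.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# <type>(<scope>): <subject>, with the scope optional.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[\w-]+)\))?"
    r": (?P<subject>\S.*)$"
)


def is_valid_header(message: str) -> bool:
    """Check only the first line of a commit message against the format above."""
    lines = message.splitlines()
    return bool(lines) and HEADER_PATTERN.match(lines[0]) is not None
```

A check like this could be wired into a Git `commit-msg` hook so that malformed headers are caught before they reach a PR.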

Examples

Good commits:

feat(ensemble): add test-time augmentation with 8-fold rotation/flip

Implements TTA to boost PSNR by averaging predictions over 8 augmented
versions of each input. Adds --enable-tta flag (default: False).

Closes #42

fix(dataset): handle NaN values in normalization

Clips normalized tensors to [0,1] and replaces NaN with 0.0 to prevent
PSNR computation errors.

Fixes #38

Bad commits:

update code          # Too vague
fixed bug            # What bug?
WIP                  # Never commit WIP to main
asdf                 # Meaningless

👀 Code Review Guidelines

As a Reviewer

Check for:

  • Code correctness (logic, edge cases)
  • Performance impact (added overhead?)
  • Readability (clear variable names, comments)
  • Test coverage (new code tested?)
  • Documentation (docstrings updated?)
  • Backward compatibility (breaks existing workflows?)

Review Tone:

  • ✅ "Consider using torch.einsum here for clarity"
  • ✅ "Great optimization! Could you add a comment explaining the magic number 2.0?"
  • ❌ "This is terrible code, rewrite it"
  • ❌ "Why didn't you use X?" (without explanation)

As an Author

Responding to feedback:

  • Address each comment (fix, explain, or discuss)
  • Mark resolved threads
  • Push fixup commits or amend (if small)
  • Don't take feedback personally—we all learn!

⚠️ Large Files & Weights

Warning

Do NOT commit large files directly to the repository.

What to avoid

  • *.pth (model weights, 50-500MB)
  • *.tif (satellite images, 3-10MB each)
  • *.onnx (exported models)
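
The patterns above can also be checked locally before committing. The sketch below is an illustration under assumptions: `find_blocked_files` and `staged_files` are hypothetical helpers, the check looks only at file extensions (not sizes), and it is meant for files committed directly rather than through Git LFS.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List

BLOCKED_SUFFIXES = {".pth", ".tif", ".onnx"}  # from the list above


def find_blocked_files(paths: Iterable[str]) -> List[str]:
    """Return the paths whose extension is on the blocked list (case-insensitive)."""
    return [p for p in paths if Path(p).suffix.lower() in BLOCKED_SUFFIXES]


def staged_files() -> List[str]:
    """List files currently staged for commit; must be run inside a Git repository."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Calling `find_blocked_files(staged_files())` from inside the repository returns any staged weight, image, or exported-model files, and an empty list means nothing on the blocked list is about to be committed.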

Solutions

Option 1: Add to .gitignore

echo "*.pth" >> .gitignore
echo "*.tif" >> .gitignore
echo "*.onnx" >> .gitignore
git add .gitignore
git commit -m "chore: ignore large weight and image files"

Option 2: Use Git LFS

# Install Git LFS
git lfs install

# Track large files
git lfs track "*.pth"
git lfs track "*.tif"

# Commit .gitattributes
git add .gitattributes
git commit -m "chore: add Git LFS for large files"

# Now normal git workflow
git add final-models/swin2sr_best.pth
git commit -m "feat: add trained Swin2SR weights"
git push

Option 3: External Hosting

Upload the files to Google Drive or the Hugging Face Hub and link to them in the README.

If you already committed large files

# Remove from last commit (if not pushed)
git reset --soft HEAD~1
git reset HEAD final-models/*.pth
git commit -m "feat: update code (weights excluded)"

# If already pushed (⚠️ rewrites history)
git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch final-models/*.pth" \
  --prune-empty --tag-name-filter cat -- --all
# Then force-push the rewritten branches
git push origin --force --all


⚖️ License

By contributing, you agree that your contributions will be licensed under the project's MIT License.


Thank you for making WorldStrat Ensemble better! 🌍