A deep learning-based Dual-Image Super-Resolution framework built using a modified EDSR (Enhanced Deep Super-Resolution) architecture. The model takes two temporally shifted low-resolution satellite images and fuses them to generate a high-resolution output. Fusion is achieved via simple channel-wise concatenation followed by deep residual refinement.
This project targets the problem of enhancing the spatial resolution of satellite images captured at different times. Since high-res satellites are costly and limited, we leverage multiple low-resolution inputs captured over time to reconstruct the corresponding high-resolution image.
This approach is especially useful for missions like Proba-V, where the trade-off between spatial and temporal resolution limits imagery quality.
I’ve written a series of Medium posts that explain this project step by step, from dataset preparation to building the model, training it, and visualizing results.
You can read the full journey here:
- 📄 Part 1: Dataset Creation & Preprocessing. Covers how the Proba-V satellite dataset was normalized, filtered, and restructured for dual-image SR training.
- 🧠 Part 2: Building the Model Architecture (EDSR + Fusion). Explains the PyTorch model architecture, including the use of EDSR blocks, concatenation fusion, and PixelShuffle.
- 📊 Part 3: Training, Evaluation (PSNR), and Inference. Discusses training strategy, metrics, loss functions, and how to visualize and interpret the results.
📖 The articles explain each concept with examples, visuals, and reasoning, in a beginner-friendly tone, so they should be easy to follow even if you’re just getting started with deep learning or super-resolution.
If you find the articles helpful, feel free to follow or clap on Medium. It helps spread the word and supports open-source learning!
This project uses a preprocessed version of the Proba-V Super-Resolution Dataset. The structure is as follows:
dual_sr_dataset/
├── train/
│   ├── low_res/
│   │   ├── imgsetXXXX_LR0.png
│   │   ├── imgsetXXXX_LR1.png
│   └── high_res/
│       ├── imgsetXXXX_HR.png
├── test/
│   ├── low_res/
│   └── high_res/
Each sample consists of:
- LR0: Low-res image at time t
- LR1: Low-res image at time t+1
- HR: Ground truth high-res image
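As a rough illustration of how this layout could be consumed, the sketch below pairs each `_LR0` file with its `_LR1` and `_HR` counterparts. The function name `list_samples` is hypothetical and not taken from the project code; only the directory and file naming comes from the structure above.

```python
from pathlib import Path
from typing import List, Tuple


def list_samples(split_dir: Path) -> List[Tuple[Path, Path, Path]]:
    """Return (LR0, LR1, HR) path triples for every complete sample in a split.

    Hypothetical helper: assumes the low_res/ and high_res/ layout and the
    imgsetXXXX_LR0.png / _LR1.png / _HR.png naming shown above.
    """
    lr_dir = split_dir / "low_res"
    hr_dir = split_dir / "high_res"
    suffix = "_LR0.png"
    samples = []
    for lr0 in sorted(lr_dir.glob(f"*{suffix}")):
        stem = lr0.name[: -len(suffix)]  # e.g. "imgset0001"
        lr1 = lr_dir / f"{stem}_LR1.png"
        hr = hr_dir / f"{stem}_HR.png"
        # Skip incomplete samples instead of failing later during training.
        if lr1.exists() and hr.exists():
            samples.append((lr0, lr1, hr))
    return samples
```

Calling `list_samples(Path("dual_sr_dataset/train"))` would return one triple per image set that has both low-res frames and a ground-truth image.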
The model combines dual inputs using channel-wise concatenation after feature extraction and processes the merged tensor through an EDSR-style backbone for refinement.
 LR0       LR1
  │         │
[Conv]    [Conv]
  │         │
  └─────────┴─> [Concat + Fusion Block] ──> [EDSR Backbone] ──> [Upsample] ──> SR Output
- InitialConvBlock: 3×3 conv + ReLU per LR input
- FusionBlock: 1×1 conv to reduce channel size
- EDSRBackbone: Stack of Residual Blocks (default: 8)
- UpsampleBlock: Conv → PixelShuffle to upscale
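The four components above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the project's actual code: the class names, the single-channel input, the 64 feature channels, and the global skip connection around the backbone are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """EDSR-style residual block: conv -> ReLU -> conv, without batch norm."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)


class DualImageSR(nn.Module):
    """Hypothetical dual-input SR model following the diagram above."""

    def __init__(self, in_channels: int = 1, features: int = 64,
                 num_blocks: int = 8, scale: int = 4):
        super().__init__()
        # One 3x3 conv + ReLU per low-res input (InitialConvBlock).
        self.head0 = nn.Sequential(
            nn.Conv2d(in_channels, features, 3, padding=1), nn.ReLU())
        self.head1 = nn.Sequential(
            nn.Conv2d(in_channels, features, 3, padding=1), nn.ReLU())
        # 1x1 conv reduces the concatenated 2*features channels (FusionBlock).
        self.fusion = nn.Conv2d(2 * features, features, kernel_size=1)
        # Stack of residual blocks (EDSRBackbone).
        self.backbone = nn.Sequential(
            *[ResidualBlock(features) for _ in range(num_blocks)])
        # Conv -> PixelShuffle rearranges channels into space (UpsampleBlock).
        self.upsample = nn.Sequential(
            nn.Conv2d(features, in_channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr0: torch.Tensor, lr1: torch.Tensor) -> torch.Tensor:
        fused = self.fusion(torch.cat([self.head0(lr0), self.head1(lr1)], dim=1))
        # Assumption: global skip around the backbone, as in the original EDSR.
        refined = fused + self.backbone(fused)
        return self.upsample(refined)
```

With the defaults, two `(N, 1, 64, 64)` inputs produce an `(N, 1, 256, 256)` output, since `PixelShuffle(4)` turns 16 channels into a 4× larger spatial grid.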
- Framework: PyTorch
- Input Size: LR = 64×64, HR = 256×256
- Scaling Factor: ×4
- Loss Function: L1, or a combined loss (L1 plus a PSNR-based term)
- Optimizer: Adam
- Learning Rate: 1e-4
- Epochs: 50
- Batch Size: 4
Model checkpoints are saved in the checkpoints/ directory.
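To make the settings above concrete, here is a hedged sketch of a single training step with Adam at a learning rate of 1e-4 and a batch of 4. `TinyDualSR` is a deliberately small stand-in model and `combined_loss` is one possible reading of "L1 + PSNR" (L1 minus a weighted PSNR, since PSNR should be maximized); the names, the `psnr_weight` value, and the [0, 1] pixel range are assumptions, not the project's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyDualSR(nn.Module):
    """Stand-in model: concatenate two LR inputs, conv, PixelShuffle x4."""

    def __init__(self, scale: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, scale ** 2, kernel_size=3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr0, lr1):
        return self.net(torch.cat([lr0, lr1], dim=1))


def psnr(pred, target, max_val: float = 1.0):
    """PSNR in dB, assuming pixel values in [0, max_val]."""
    mse = F.mse_loss(pred, target)
    return 10.0 * torch.log10(max_val ** 2 / (mse + 1e-8))


def combined_loss(pred, target, psnr_weight: float = 0.01):
    # Assumption: reward high PSNR by subtracting a small weighted PSNR term.
    return F.l1_loss(pred, target) - psnr_weight * psnr(pred, target)


def train_step(model, optimizer, lr0, lr1, hr, loss_fn=F.l1_loss) -> float:
    """Run one optimization step and return the scalar loss value."""
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(lr0, lr1), hr)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A usage example with random tensors in place of real data:

```python
model = TinyDualSR()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
lr0, lr1 = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
hr = torch.rand(4, 1, 256, 256)
loss = train_step(model, optimizer, lr0, lr1, hr)  # pass loss_fn=combined_loss to switch
```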
The performance is evaluated using:
- PSNR (Peak Signal-to-Noise Ratio)
Higher values indicate better reconstruction.
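For reference, the standard definition of PSNR between a super-resolved image $\hat{I}$ and the ground truth $I$, each with $N$ pixels, is given below. The symbol $\mathrm{MAX}$ denotes the largest possible pixel value (1 for images normalized to [0, 1], 255 for 8-bit images); which range the project uses is an assumption here.

$$
\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(I_i - \hat{I}_i\right)^2,
\qquad
\mathrm{PSNR} = 10\,\log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)\ \text{dB}
$$

As a worked example, with $\mathrm{MAX} = 1$ and $\mathrm{MSE} = 0.001$, $\mathrm{PSNR} = 10\,\log_{10}(1000) = 30$ dB. Halving the MSE raises PSNR by about 3 dB.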
Example: comparison of the dual inputs, the model output, and the ground truth.
Sample outputs and intermediate visualizations are stored in the sr_visualization/ directory.
