GitHub - Samay10/diffusion-v1.0: This project implements a conditional image-to-image diffusion model using a custom U-Net. It learns the image patterns using noise distributions over time. Features include cosine beta scheduling, data augmentation, and progressive denoising for high-quality image generation.

This project presents a conditional image-to-image diffusion model designed to transform sunrise images into their corresponding sunset variants. It is implemented in PyTorch and built upon a custom U-Net architecture, integrating time embeddings and skip connections for robust denoising. The model is trained on paired image data and leverages the Denoising Diffusion Probabilistic Model (DDPM) framework to learn the reverse process of image degradation.

🌐 Project Description

Diffusion models have emerged as a powerful family of generative models that iteratively refine noisy images to generate realistic samples. This implementation focuses on conditional generation, where the model uses a guiding input image (sunrise) to predict a desired target image (sunset).

The process involves gradually adding noise to a target image over a series of time steps and then training the model to predict and remove that noise, conditioned on the original guiding image. Over time, the model learns the reverse diffusion process—essentially, how to go from pure noise back to the original image, guided by context.

🧠 Technical Details

Model Architecture

Custom U-Net Backbone: Encoder-decoder with skip connections.
Conditioning: Input image and guiding condition are concatenated along the channel dimension.
Time Embeddings: Injected into the bottleneck via MLPs and projected into spatial dimensions.
Normalization: BatchNorm used throughout.
Activation: ReLU in encoder/decoder and SiLU for time embedding.

Diffusion Process

Timesteps: 1000 steps using cosine beta schedule for smoother transitions.
Noise Schedule: Cosine schedule improves stability over linear betas.
Forward Process (q_sample): Adds Gaussian noise to images at each timestep.
Reverse Process (denoising): Model predicts the added noise at each timestep.

Loss Function

Combines MSE (L2) and MAE (L1) losses for better convergence and finer details.
Loss = MSE(pred, noise) + 0.1 * MAE(pred, noise)

Training Strategy

Data Augmentation: Random rotations, perspective transforms, and color jitter applied equally to input-output pairs.
Optimizer: AdamW with lr=1e-4
Scheduler: StepLR reduces LR by 0.5 every 100 epochs.
Gradient Clipping: Max norm of 1.0 to stabilize training.
Epochs: Default 50 (configurable).
Checkpointing: Models saved periodically.

🔍 Dataset

The dataset consists of paired sunrise and sunset images. During training, both are transformed and augmented consistently. A custom ImagePairDataset class is used to handle loading and preprocessing.

🏁 Generation Process

To generate a sunset-style image:

Begin with random noise.
Iterate through reverse timesteps.
At each step:
- Predict noise using the model.
- Refine the image using the predicted noise.
- Optionally save intermediate images.

The final output is rescaled to [0, 1] and saved for visualization.

📂 Output Artifacts

diffusion_outputs/: Intermediate and final images from the generation process.
training_samples/: Model outputs after each epoch.
training_loss.png: Loss graph for training performance.
model_checkpoints/: Checkpointed .pth model files.

🔧 Requirements

Python 3.7+
PyTorch
torchvision
matplotlib
PIL
tqdm

✨ Potential Improvements

Use attention mechanisms (e.g., self/cross-attention).
Experiment with DDIM or Latent Diffusion.
Switch to Patch-based Conditioning for localized translation.
Add CLIP-based losses for semantic alignment.

📌 Use Case

This model is suitable for tasks like:

Artistic image translation (e.g., style or time-of-day transfer)
Data augmentation for vision models
Educational use in understanding generative diffusion models

🔍 Example Usage

python diffusionmodel.py

The script trains the model and periodically generates sample outputs.

Built with ❤️ using PyTorch. Contributions and forks welcome!

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
LICENSE		LICENSE
README.md		README.md
diffusionmodel.py		diffusionmodel.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🌐 Project Description

🧠 Technical Details

Model Architecture

Diffusion Process

Loss Function

Training Strategy

🔍 Dataset

🏁 Generation Process

📂 Output Artifacts

🔧 Requirements

✨ Potential Improvements

📌 Use Case

🔍 Example Usage

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🌐 Project Description

🧠 Technical Details

Model Architecture

Diffusion Process

Loss Function

Training Strategy

🔍 Dataset

🏁 Generation Process

📂 Output Artifacts

🔧 Requirements

✨ Potential Improvements

📌 Use Case

🔍 Example Usage

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages