In this work, we introduce a unified bilevel optimization framework for hyperparameter optimization in diffusion models. By formulating diffusion sampling as the lower-level problem and treating hyperparameters (e.g., entropy strength or noise schedule) as upper-level variables, our approach enables principled, gradient-based optimization of hyperparameters during both pre-training and fine-tuning. Specifically:
- Reward Fine-Tuning (Section 3.1): We propose a bilevel method that tunes the entropy-regularization strength using CLIP scores, preventing over-optimization of the reward (aesthetic score) during fine-tuning.
- Noise Scheduling in Pre-training (Section 3.2): We propose a bilevel method that automatically learns the noise schedule of a diffusion model to improve generated image quality.
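Schematically, the bilevel structure behind both tasks reads (notation ours, for illustration only):

```latex
\min_{\phi}\; F\bigl(\phi,\,\theta^{*}(\phi)\bigr)
\qquad \text{s.t.} \qquad
\theta^{*}(\phi) \in \arg\min_{\theta}\; G(\theta;\,\phi)
```

where φ collects the hyperparameters (entropy strength λ or noise-schedule parameters), G is the lower-level sampling/training objective, and F is the upper-level metric (e.g., CLIP score or image quality).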
This repository provides modular implementations for both tasks and includes the code to reproduce all experiments from our ICML 2025 paper:
A First-order Generative Bilevel Optimization Framework for Diffusion Models
Quan Xiao, Hui Yuan, A. F. M. Saif, Gaowen Liu, Ramana Kompella, Mengdi Wang, Tianyi Chen
```
├── noise_schedule/               # Section 3.2: Noise scheduling experiments on MNIST
│   ├── train_mnist.py
│   ├── train_mnist_bayesian.py
│   └── train_mnist_bilevel_ZO.py
├── entropy_fine-tuning/          # Section 3.1: Reward fine-tuning modules
│   ├── main.py                   # Training for bilevel reward fine-tuning
│   ├── reward_config.yaml        # Default hyperparameters for Section 3.1
│   ├── scorer_bi.py              # Reward functions
│   └── gradguided_sdpipeline.py  # Gradient-guided sampling pipeline
├── requirements.txt              # Python dependencies
└── README.md                     # This overview
```
Ensure you have Python 3.8+ and install the required packages:

```
pip install -r requirements.txt
```

Dependencies include:
- PyTorch (>=1.10)
- diffusers
- transformers
- torchvision
- tqdm
- pyyaml
- CLIP (for reward models)
Reproduce the bilevel reward fine-tuning from Section 3.1:
```
cd entropy_fine-tuning
python scripts/reward_finetune.py \
    --config configs/reward_config.yaml \
    --pretrained_model_path "runwayml/stable-diffusion-v1-5" \
    --lower_reward_model models/resnet18_reward.pth \
    --upper_reward clip \
    --num_samples 256 \
    --lambda_init 0.01 \
    --gamma 1e3 \
    --steps 500 \
    --output_dir outputs/reward_finetune
```

Outputs are stored under outputs/reward_finetune:
- `lambda_final.txt`: learned entropy strength λ
- `samples/`: generated images under the fine-tuned policy
- `logs/`: training curves and gradient trajectories
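As a toy illustration of the entropy-regularized objective tuned in Section 3.1 (the numbers and the function name are hypothetical; the repository's code operates on the diffusion policy's actual rewards and log-probabilities):

```python
def entropy_regularized_reward(rewards, log_probs, lam):
    """Mean reward plus lam times a Monte Carlo estimate of policy entropy.

    rewards:   per-sample reward values (e.g., aesthetic scores)
    log_probs: per-sample log-probabilities under the current policy
    lam:       entropy strength -- the upper-level variable lambda
    """
    mean_reward = sum(rewards) / len(rewards)
    entropy = -sum(log_probs) / len(log_probs)  # H = -E[log p]
    return mean_reward + lam * entropy

rewards = [0.8, 0.6, 0.9]        # toy reward values
log_probs = [-1.2, -0.7, -1.5]   # toy log-probabilities
obj = entropy_regularized_reward(rewards, log_probs, lam=0.01)
```

Raising λ weights the entropy term more heavily, which keeps the fine-tuned policy closer to its pre-trained diversity and counteracts reward over-optimization.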
Run the bilevel noise schedule optimization for Section 3.2 on the MNIST dataset:
```
cd noise_schedule
python train_mnist_bilevel_ZO.py \
    --inner_loop 10 \
    --inner_loop_z 1 \
    --gamma 1 \
    --gamma_end 1 \
    --lr_beta 0.05 0.05 1 0
```

Outputs:
- optimized noise schedule parameters
- sampled trajectories and denoised images
- training logs for upper- and lower-level objectives
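For intuition, a parameterized noise schedule of the kind the upper level can tune might look like the linear beta schedule below (a common DDPM baseline; the exact parameterization in `train_mnist_bilevel_ZO.py` may differ):

```python
def linear_beta_schedule(beta_start, beta_end, num_steps):
    """Linearly spaced per-step noise levels beta_t."""
    step = (beta_end - beta_start) / max(num_steps - 1, 1)
    return [beta_start + step * t for t in range(num_steps)]

def alpha_bar(betas):
    """Cumulative signal-retention product: prod over t of (1 - beta_t)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

betas = linear_beta_schedule(1e-4, 0.02, 1000)
abar = alpha_bar(betas)
# abar decays monotonically from near 1 toward 0 as noise accumulates
```

The upper level then treats the schedule's endpoints (or a richer parameterization) as decision variables and scores the resulting samples.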
Both modules implement our generative bilevel framework:
- Lower-level: SDE-based sampling adjusted by a surrogate loss (entropy-regularized reward or score-matching loss).
- Upper-level: first-order update of the regularization strength (λ) or schedule parameters via zeroth-order gradient estimates.

Refer to Algorithms 2 and 6 in the paper for detailed pseudo-code and mathematical derivations.
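A minimal sketch of a two-point zeroth-order gradient estimate of the kind used for the upper-level update (toy scalar objective; see Algorithm 6 in the paper for the actual estimator):

```python
import random

def zo_gradient(f, x, mu=1e-3, num_samples=1000, seed=0):
    """Two-point zeroth-order estimate of f'(x) for a scalar x.

    Perturbs x along random Gaussian directions and averages the
    symmetric finite-difference quotients weighted by the direction.
    """
    rng = random.Random(seed)
    est = 0.0
    for _ in range(num_samples):
        u = rng.gauss(0.0, 1.0)  # random perturbation direction
        est += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return est / num_samples

# Toy upper-level objective: f(x) = (x - 2)^2; the true gradient at x = 3 is 2.
g = zo_gradient(lambda x: (x - 2.0) ** 2, 3.0)
```

Because only function evaluations of the upper-level objective are needed, this avoids differentiating through the entire lower-level sampling chain.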
If you find our work interesting, please consider citing this paper:
```
@inproceedings{xiao2025first,
  title={A First-order Generative Bilevel Optimization Framework for Diffusion Models},
  author={Xiao, Quan and Yuan, Hui and Saif, AFM and Liu, Gaowen and Kompella, Ramana and Wang, Mengdi and Chen, Tianyi},
  booktitle={Proceedings of the 42nd International Conference on Machine Learning},
  year={2025}
}
```

