UCGM: Unified Continuous Generative Models

Peng Sun¹˒² · Yi Jiang² · Tao Lin¹

¹Westlake University  ²Zhejiang University

🤖 Models · 📄 Paper · 🏷️ BibTeX


Official PyTorch implementation of the UCGM Trainer and Sampler (UCGM-{T,S}): 🏆 A unified framework for training, sampling, and understanding continuous generative models (including diffusion, flow-matching, and consistency models).

Generated samples from two 675M diffusion transformers trained with UCGM on ImageNet-1K 512×512.
Left: A multi-step model (Steps=NFE=40, FID=1.48) | Right: A few-step model (Steps=NFE=2, FID=1.75)
Samples generated without classifier-free guidance or other guidance techniques.

✨ Features

🚀 Plug-and-Play Acceleration: UCGM-S boosts various pre-trained multi-step continuous models for free. For example, given a model from REPA-E (on ImageNet 256×256), it:

  • Cuts 84% of sampling steps (Steps=250 → Steps=40) while improving FID (1.26 → 1.06)
    Training-free, with no additional cost introduced

  • 📊 Extended results for more accelerated models are available here.

⚡ Lightning-Fast Model Tuning: UCGM-T transforms any pre-trained multi-step continuous model (e.g., REPA-E with FID=1.54 at NFE=80) into a high-performance, few-step generator with record efficiency:

  • FID=1.39 @ Steps=NFE=2 (ImageNet-1K 256×256)
    Tuned in just 8 minutes on 8 GPUs

  • 📊 Extended results for additional tuned models are available here.

🔥 Efficient Unified Framework: Train/sample diffusion, flow-matching, and consistency models in one system, outperforming peers at low steps:

  • FID=1.21 @ Steps=NFE=30 (ImageNet 256×256); FID=1.48 @ Steps=NFE=40 (ImageNet 512×512)
    ✅ Just 2 steps? Still strong (FID=1.42 on 256×256, FID=1.75 on 512×512)
    ✅ No classifier-free guidance or other techniques—simpler and faster
    ✅ Compatible with diverse datasets (ImageNet, CIFAR, etc.) and architectures (CNNs, Transformers)—high flexibility

  • 📊 Extended results for additional trained models are available here.

📖 Check out more detailed features in our paper!

⚙️ Preparation

  1. Download the necessary files from Hugging Face (one possible way is sketched after these steps), including:

    • Checkpoints of various VAEs
    • Statistics files for datasets
    • Reference files for FID calculation
  2. Place the downloaded outputs and buffers folders at the same directory level as this README.md

  3. For dataset preparation (skip if not training models), run:

bash scripts/data/in1k256.sh
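
As a minimal sketch of steps 1–2 (the repository id and folder names below are placeholders, not taken from this README; check the project's Hugging Face page for the actual ones), the folders could be fetched with huggingface_hub:

from huggingface_hub import snapshot_download

# Placeholder repository id: replace with the actual Hugging Face repo
# that hosts the UCGM checkpoints, statistics, and FID reference files.
REPO_ID = "ORG/UCGM-assets"

# Download into the current directory so the outputs/ and buffers/
# folders end up next to README.md (assumed folder layout).
snapshot_download(
    repo_id=REPO_ID,
    local_dir=".",
    allow_patterns=["outputs/*", "buffers/*"],
)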

🚀 UCGM-S: Plug-and-Play Acceleration

Accelerate any continuous generative model (diffusion, flow-matching, etc.) with UCGM-S. Results marked with 🚀 denote UCGM-S acceleration.
NFE = Number of Function Evaluations (sampling computation cost). An entry such as 250×2 means 250 sampling steps with two model evaluations per step (e.g., when classifier-free guidance is used); a minimal sampler sketch follows the table below.

| Method | Model Size | Dataset | Resolution | NFE | FID | NFE (🚀) | FID (🚀) | Model |
|---|---|---|---|---|---|---|---|---|
| REPA-E | 675M | ImageNet | 256×256 | 250×2 | 1.26 | 40×2 | 1.06 | Link |
| Lightning-DiT | 675M | ImageNet | 256×256 | 250×2 | 1.35 | 50×2 | 1.21 | Link |
| DDT | 675M | ImageNet | 256×256 | 250×2 | 1.26 | 50×2 | 1.27 | Link |
| EDM2-S | 280M | ImageNet | 512×512 | 63 | 2.56 | 40 | 2.53 | Link |
| EDM2-L | 778M | ImageNet | 512×512 | 63 | 2.06 | 50 | 2.04 | Link |
| EDM2-XXL | 1.5B | ImageNet | 512×512 | 63 | 1.91 | 40 | 1.88 | Link |
| DDT | 675M | ImageNet | 512×512 | 250×2 | 1.28 | 150×2 | 1.18 | Link |
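
For intuition about what NFE counts, here is a minimal, generic Euler-style sampling loop in the flow-matching style. This is an illustration only, not the UCGM-S sampler; the model interface (a network predicting a velocity field) is an assumption:

import torch

@torch.no_grad()
def euler_sample(model, noise, num_steps=40):
    # Generic flow-matching style sampler: integrate from t=0 (noise)
    # to t=1 (data) with `num_steps` Euler steps.
    # Each step makes one network call, so NFE == num_steps here;
    # with classifier-free guidance it would be num_steps × 2.
    x = noise
    ts = torch.linspace(0.0, 1.0, num_steps + 1, device=x.device)
    for i in range(num_steps):
        t, t_next = ts[i], ts[i + 1]
        v = model(x, t)             # one function evaluation (1 NFE)
        x = x + (t_next - t) * v    # Euler update
    return x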

💻 Usage Examples: Generate images and evaluate FID using a REPA-E trained model:

# Generate samples using public pretrained multi-step model
bash scripts/run_eval.sh ./configs/sampling_multi_steps/in1k256_sit_xl_repae_linear.yaml
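
The evaluation script reports FID against the reference statistics downloaded in the Preparation step. As a reminder of what that metric computes (this is a generic sketch, not the repository's evaluation code, and the statistics format is an assumption), FID is the Fréchet distance between Gaussians fitted to Inception features:

import numpy as np
from scipy import linalg

def fid_from_stats(mu1, sigma1, mu2, sigma2):
    # Frechet distance between N(mu1, sigma1) and N(mu2, sigma2),
    # the Gaussians fitted to Inception features of generated and
    # reference images:
    #   ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2})
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from numerics
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))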

⚡ UCGM-T: Ultra-Efficient Tuning System

UCGM-T converts pre-trained multi-step generative models (including diffusion and flow-matching models) into high-performance few-step versions with minimal tuning cost. Results marked with ⚡ indicate UCGM-T-tuned models.

| Pre-trained Model | Model Size | Dataset | Resolution | Tuning Efficiency | NFE (⚡) | FID (⚡) | Tuned Model |
|---|---|---|---|---|---|---|---|
| Lightning-DiT | 675M | ImageNet | 256×256 | 0.64 epoch (10 mins) | 2 | 2.06 | Link |
| REPA | 675M | ImageNet | 256×256 | 0.64 epoch (13 mins) | 2 | 1.95 | Link |
| REPA-E | 675M | ImageNet | 256×256 | 0.40 epoch (8 mins) | 2 | 1.39 | Link |
| DDT | 675M | ImageNet | 256×256 | 0.32 epoch (11 mins) | 2 | 1.90 | Link |

(Note: the tuning times above were measured on 8 H800 GPUs.)

💻 Usage Examples

Generate Images:

# Generate samples using our tuned few-step model
bash scripts/run_eval.sh ./configs/tuning_few_steps/in1k256_sit_xl_repae_linear.yaml

Tune Models:

# Tune a multi-step model into few-step version
bash scripts/run_train.sh ./configs/tuning_few_steps/in1k256_sit_xl_repae_linear.yaml

🔥 UCGM-{T,S}: Efficient Unified Framework

Train multi-step and few-step models (diffusion, flow-matching, consistency) with UCGM-T. All models sample efficiently using UCGM-S without guidance.

| Encoder | Model Size | Resolution | Dataset | NFE | FID | Model |
|---|---|---|---|---|---|---|
| VA-VAE | 675M | 256×256 | ImageNet | 30 | 1.21 | Link |
| VA-VAE | 675M | 256×256 | ImageNet | 2 | 1.42 | Link |
| DC-AE | 675M | 512×512 | ImageNet | 40 | 1.48 | Link |
| DC-AE | 675M | 512×512 | ImageNet | 2 | 1.75 | Link |

💻 Usage Examples

Generate Images:

# Generate samples using our pretrained few-step model
bash scripts/run_eval.sh ./configs/training_few_steps/in1k256_tit_xl_vavae.yaml

Train Models:

# Train a new multi-step model (full training)
bash scripts/run_train.sh ./configs/training_multi_steps/in1k512_tit_xl_dcae.yaml

# Convert to few-step model (requires pretrained multi-step checkpoint)
bash scripts/run_train.sh ./configs/training_few_steps/in1k512_tit_xl_dcae.yaml

Note for few-step training:

  1. Requires initialization from a multi-step checkpoint
  2. Prepare your checkpoint file with both model and ema keys:
    {
      "model": multi_step_ckpt["ema"], 
      "ema": multi_step_ckpt["ema"]
    }
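
A minimal sketch of step 2 using PyTorch (the file paths are placeholders, and a real checkpoint may carry additional keys such as optimizer state):

import torch

# Placeholder paths: point these at your trained multi-step checkpoint
# and at the file the few-step config expects to initialize from.
multi_step_ckpt = torch.load("multi_step.pt", map_location="cpu")

# Initialize both the trainable weights and the EMA weights of the
# few-step model from the multi-step EMA weights, as described above.
few_step_init = {
    "model": multi_step_ckpt["ema"],
    "ema": multi_step_ckpt["ema"],
}
torch.save(few_step_init, "few_step_init.pt")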

🏷️ Bibliography

If you find this repository helpful for your project, please consider citing our work:

@article{sun2025unified,
  title = {Unified continuous generative models},
  author = {Sun, Peng and Jiang, Yi and Lin, Tao},
  journal = {arXiv preprint arXiv:2505.07447},
  year = {2025},
  url = {https://arxiv.org/abs/2505.07447},
  archiveprefix = {arXiv},
  eprint = {2505.07447},
  primaryclass = {cs.LG}
}

📄 License

Apache License 2.0 - See LICENSE for details.
