Beyond and Free from Diffusion: Invertible Guided Consistency Training

PyTorch implementation of Invertible Guided Consistency Training (iGCT) (https://arxiv.org/abs/2502.05391).

We propose Invertible Guided Consistency Training (iGCT), a diffusion-independent approach that integrates guidance and inversion into consistency models. iGCT is a fully data-driven algorithm that enables fast guided generation, fast inversion, and fast editing.

Teaser Image

Environment Setup

To set up the environment, follow these steps:

Using Python Virtual Environment

python -m venv igct  
source igct/bin/activate

Using Conda

conda create -n igct python=3.9 -y
conda activate igct

Install Dependencies

pip install click requests pillow numpy scipy psutil tqdm imageio scikit-image imageio-ffmpeg pyspng
pip install torch==2.3.0+cu121 -f https://download.pytorch.org/whl/torch_stable.html
pip install lpips transformers
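
Before moving on, it can help to confirm that PyTorch sees a CUDA device (the dataset preparation below assumes a GPU is available). A minimal sanity check, not specific to this repository:

import torch

# Report the installed version and whether a CUDA device is visible.
print(f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"device: {torch.cuda.get_device_name(0)}")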

Datasets

Ensure a GPU is available for dataset preparation. Follow the instructions below to download and prepare the datasets.

CIFAR-10

  1. Download the CIFAR-10 python version:
    mkdir -p downloads/cifar10 && cd downloads/cifar10
    curl -O https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
    cd ../..
  2. Convert the dataset to ZIP archives (a quick check of the results is sketched after this list):
    python dataset_tool.py --source=downloads/cifar10/cifar-10-python.tar.gz --dest=datasets/cifar10-32x32.zip
    python dataset_tool.py --source=downloads/cifar10/cifar-10-python.tar.gz --dest=datasets/cifar10-32x32-test.zip --testset=true
  3. Compute FID statistics:
    python fid.py ref --data=datasets/cifar10-32x32.zip --dest=fid-refs/cifar10-32x32.npz
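
If you want to double-check the archives produced above, the following is a minimal sketch assuming dataset_tool.py writes an EDM-style archive (one PNG per image plus a dataset.json holding the labels); adjust the path if you changed --dest.

import json, zipfile

# Quick look inside the generated archive; assumes PNG images plus a
# dataset.json label file, as written by dataset_tool.py.
with zipfile.ZipFile("datasets/cifar10-32x32.zip") as zf:
    names = zf.namelist()
    pngs = [n for n in names if n.endswith(".png")]
    print(f"{len(pngs)} images")  # expect 50000 for the CIFAR-10 training split
    if "dataset.json" in names:
        labels = json.loads(zf.read("dataset.json"))["labels"]
        print(f"{len(labels)} labels")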

ImageNet

  1. Download the ImageNet Object Localization Challenge.
  2. Convert the dataset to a ZIP archive at 64x64 resolution:
    python dataset_tool.py --source=downloads/imagenet/ILSVRC/Data/CLS-LOC/train \
        --dest=datasets/imagenet-64x64.zip --resolution=64x64 --transform=center-crop
  3. Organize the ImageNet validation set directory:
    python organize_imagenet_dataset.py --annote_dir downloads/imagenet/ILSVRC/Annotations/CLS-LOC/val --images_dir downloads/imagenet/ILSVRC/Data/CLS-LOC/val
  4. Convert the validation set to a ZIP archive (a quick check of the result is sketched after this list):
    python dataset_tool.py --source=downloads/imagenet/ILSVRC/Data/CLS-LOC/val --dest=datasets/imagenet-64x64-val.zip --resolution=64x64 --transform=center-crop
  5. Compute FID statistics:
    python fid.py ref --data=datasets/imagenet-64x64.zip --dest=fid-refs/imagenet-64x64.npz
  6. Create ImageNet subgroups for image editing:
    python create_imagenet_editing_subgroups.py --dataset_path datasets/imagenet-64x64-val.zip --save_dir datasets/imagenet-64x64-editing-subgroups
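
As with CIFAR-10, a short sketch to spot-check the validation archive; it assumes the same dataset_tool.py output layout, and the expected count refers to the standard ILSVRC validation split.

import io, zipfile
from PIL import Image

# Count the stored images and confirm the 64x64 resolution requested via
# --resolution=64x64 (assumes PNG images inside the ZIP).
with zipfile.ZipFile("datasets/imagenet-64x64-val.zip") as zf:
    pngs = [n for n in zf.namelist() if n.endswith(".png")]
    print(f"{len(pngs)} validation images")  # expect 50000 for ILSVRC-2012 val
    img = Image.open(io.BytesIO(zf.read(pngs[0])))
    print(f"sample resolution: {img.size}")  # expect (64, 64)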

Training and Evaluation Scripts

Example scripts for training and evaluation can be found in the ./scripts directory.

Checkpoints

Pre-trained model checkpoints are available for download. The checkpoints are organized as follows:

model_weights/
├── cifar10/
│   ├── baselines/
│   │   ├── cfg-edm/                # Checkpoints for CFG-EDM baseline on CIFAR-10
│   │   └── guided-cd/              # Checkpoints for Guided-CD baseline on CIFAR-10
│   └── igct/
│       └── igct/                   # Checkpoints for iGCT on CIFAR-10
│
└── imagenet/
    ├── baselines/
    │   └── cfg-edm/                # Checkpoints for CFG-EDM baseline on ImageNet
    └── igct/
        └── igct/                   # Checkpoints for iGCT on ImageNet

Download the checkpoints from the model_weights Google Drive link.
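
The snapshot file format is not documented here. If the checkpoints follow the EDM-style pickled snapshots that scripts such as dataset_tool.py and fid.py in this codebase are built around, loading would look roughly like the sketch below; the file name and the 'ema' key are assumptions, and the repository needs to be on PYTHONPATH so the pickled classes can be resolved. Check the evaluation scripts in ./scripts for the exact loading code.

import pickle
import torch

# Hypothetical loading sketch: the path and the 'ema' key are assumptions
# based on EDM-style snapshots, not a documented iGCT interface.
ckpt_path = "model_weights/cifar10/igct/igct/snapshot.pkl"  # hypothetical file name
with open(ckpt_path, "rb") as f:
    data = pickle.load(f)
net = data["ema"].eval()
net = net.to("cuda" if torch.cuda.is_available() else "cpu")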

Testing Environment

CUDA Information

  • CUDA Release Version: 12.1
  • NVCC Compiler Version: V12.1.105

Operating System

  • Distribution: Red Hat Enterprise Linux (RHEL)
  • Release: 9.2
  • Codename: Plow

GPU Requirements

  • CIFAR-10: cluster of A100 (40GB) GPUs, ~84 GPU-days.
  • ImageNet-64: cluster of A100 (80GB) GPUs, ~190 GPU-days.

Contact

For questions, feedback, or collaboration opportunities, feel free to reach out via email at [email protected] or [email protected].

Citation

If you find this work useful, please consider citing our paper:

@misc{hsu2025freediffusioninvertibleguided,
      title={Beyond and Free from Diffusion: Invertible Guided Consistency Training}, 
      author={Chia-Hong Hsu and Shiu-hong Kao and Randall Balestriero},
      year={2025},
      eprint={2502.05391},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.05391}, 
}
