PyTorch implementation for Invertible Guided Consistency Training (iGCT) (https://arxiv.org/abs/2502.05391).
We propose invertible Guided Consistency Training (iGCT), a diffusion-independent approach that integrates guidance and inversion into Consistency Models. iGCT is a fully data-driven algorithm that enables fast guided generation, fast inversion, and fast editing.
To set up the environment, follow these steps:
Using venv:

```shell
python -m venv igct
source igct/bin/activate
```

Or using conda:

```shell
conda create -n igct python=3.9 -y
conda activate igct
```

Install the dependencies:

```shell
pip install click requests pillow numpy scipy psutil tqdm imageio scikit-image imageio-ffmpeg pyspng
pip install torch==2.3.0+cu121 -f https://download.pytorch.org/whl/torch_stable.html
pip install lpips transformers
```

Ensure a GPU is available for dataset preparation. Follow the instructions below to download and prepare the datasets.
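Before moving on, it can help to confirm that the installed packages resolve. The snippet below is a small, optional sanity check (not part of the repository); the package list mirrors the pip installs above, and the pillow/PIL name mapping is the only pip-name vs. import-name mismatch handled.

```python
# Optional sanity check: verify the core dependencies installed above resolve.
import importlib.util

# pip package name -> import name (they differ only for pillow/PIL)
packages = {
    "torch": "torch",
    "lpips": "lpips",
    "transformers": "transformers",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
    "tqdm": "tqdm",
}

results = []
for pip_name, import_name in packages.items():
    # find_spec returns None when the module cannot be located
    status = "ok" if importlib.util.find_spec(import_name) else "MISSING"
    results.append((pip_name, status))
    print(f"{pip_name}: {status}")
```

Any `MISSING` entry means the corresponding `pip install` step above did not complete.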
- Download the CIFAR-10 python version.
- Convert the dataset to a ZIP archive:
```shell
mkdir downloads && cd downloads
mkdir cifar10 && cd cifar10
curl -O https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
cd ../..
python dataset_tool.py --source=downloads/cifar10/cifar-10-python.tar.gz --dest=datasets/cifar10-32x32.zip
python dataset_tool.py --source=downloads/cifar10/cifar-10-python.tar.gz --dest=datasets/cifar10-32x32-test.zip --testset=true
```
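After conversion, it can be useful to confirm the archive looks sane before training. The sketch below assumes an EDM-style layout (image files plus a `dataset.json` holding labels), which is an assumption to verify against your actual archive; the demo runs on a tiny in-memory ZIP standing in for `datasets/cifar10-32x32.zip`.

```python
# Sketch of a sanity check for a dataset ZIP. Assumes an EDM-style layout:
# image files plus a dataset.json with a "labels" list (verify against the
# actual output of dataset_tool.py).
import io
import json
import zipfile

def summarize_dataset_zip(path_or_file):
    """Return (num_images, num_labels) for an EDM-style dataset ZIP."""
    with zipfile.ZipFile(path_or_file) as zf:
        names = zf.namelist()
        images = [n for n in names if n.endswith(".png")]
        labels = []
        if "dataset.json" in names:
            labels = json.loads(zf.read("dataset.json")).get("labels") or []
        return len(images), len(labels)

# Demo: build a tiny in-memory archive with the assumed layout.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("00000.png", b"fake-png-bytes")
    zf.writestr("00001.png", b"fake-png-bytes")
    zf.writestr("dataset.json", json.dumps({"labels": [["00000.png", 3], ["00001.png", 7]]}))
buf.seek(0)
counts = summarize_dataset_zip(buf)
print(counts)  # (2, 2)
```

On the real archive, the image count should match the CIFAR-10 split size (50,000 train / 10,000 test).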
- Compute FID statistics:
```shell
python fid.py ref --data=datasets/cifar10-32x32.zip --dest=fid-refs/cifar10-32x32.npz
```
- Download the ImageNet Object Localization Challenge.
- Convert the dataset to a ZIP archive at 64x64 resolution:
```shell
python dataset_tool.py --source=downloads/imagenet/ILSVRC/Data/CLS-LOC/train \
    --dest=datasets/imagenet-64x64.zip --resolution=64x64 --transform=center-crop
```
- Organize the ImageNet validation set directory:
```shell
python organize_imagenet_dataset.py --annote_dir downloads/imagenet/ILSVRC/Annotations/CLS-LOC/val --images_dir downloads/imagenet/ILSVRC/Data/CLS-LOC/val
```
- Convert the validation set to a ZIP archive:
```shell
python dataset_tool.py --source=downloads/imagenet/ILSVRC/Data/CLS-LOC/val --dest=datasets/imagenet-64x64-val.zip --resolution=64x64 --transform=center-crop
```
- Compute FID statistics:
```shell
python fid.py ref --data=datasets/imagenet-64x64.zip --dest=fid-refs/imagenet-64x64.npz
```
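The reference file can be inspected before launching evaluation. The sketch below assumes `fid.py` stores Inception statistics under the keys `mu` (2048-d mean) and `sigma` (2048x2048 covariance), as in EDM-style codebases; the key names are an assumption to verify. The demo runs on a synthetic stand-in for `fid-refs/imagenet-64x64.npz`.

```python
# Sketch: inspect an FID reference .npz. Key names ("mu", "sigma") follow
# EDM-style fid.py and are an assumption to check against this repository.
import numpy as np

def check_fid_ref(path):
    """Return (mu.shape, sigma.shape) after basic consistency checks."""
    with np.load(path) as ref:
        mu, sigma = ref["mu"], ref["sigma"]
    # Covariance must be square and match the mean's dimensionality.
    assert mu.ndim == 1 and sigma.shape == (mu.size, mu.size)
    return mu.shape, sigma.shape

# Demo on a synthetic stand-in file with Inception-v3 feature dimensions.
demo_path = "demo-fid-ref.npz"
np.savez(demo_path, mu=np.zeros(2048), sigma=np.eye(2048))
shapes = check_fid_ref(demo_path)
print(shapes)  # ((2048,), (2048, 2048))
```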
- Create ImageNet subgroups for image editing:
```shell
python create_imagenet_editing_subgroups.py --dataset_path datasets/imagenet-64x64-val.zip --save_dir datasets/imagenet-64x64-editing-subgroups
```
Example scripts for training and evaluation can be found in the ./scripts directory.
Pre-trained model checkpoints are available for download. The checkpoints are organized as follows:
```
model_weights/
├── cifar10/
│   ├── baselines/
│   │   ├── cfg-edm/    # Checkpoints for CFG-EDM baseline on CIFAR-10
│   │   └── guided-cd/  # Checkpoints for Guided-CD baseline on CIFAR-10
│   └── igct/
│       └── igct/       # Checkpoints for iGCT on CIFAR-10
│
└── imagenet/
    ├── baselines/
    │   └── cfg-edm/    # Checkpoints for CFG-EDM baseline on ImageNet
    └── igct/
        └── igct/       # Checkpoints for iGCT on ImageNet
```
Download the checkpoints from the model_weights Google Drive link.
- CUDA Release Version: 12.1
- NVCC Compiler Version: V12.1.105
- Distribution: Red Hat Enterprise Linux (RHEL)
- Release: 9.2
- Codename: Plow
- CIFAR-10: cluster of A100 (40GB) GPUs, ~84 GPU-days.
- ImageNet-64: cluster of A100 (80GB) GPUs, ~190 GPU-days.
For questions, feedback, or collaboration opportunities, feel free to reach out via email at [email protected] or [email protected].
If you find this work useful, please consider citing our paper:
```bibtex
@misc{hsu2025freediffusioninvertibleguided,
  title={Beyond and Free from Diffusion: Invertible Guided Consistency Training},
  author={Chia-Hong Hsu and Shiu-hong Kao and Randall Balestriero},
  year={2025},
  eprint={2502.05391},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.05391},
}
```