
Alexander Swerdlow¹*
Mihir Prabhudesai¹*
Siddharth Gandhi¹
Deepak Pathak¹
Katerina Fragkiadaki¹
¹ Carnegie Mellon University
The UniDisc checkpoints are available on Hugging Face:
To install the dependencies, run:
git submodule update --init --recursive
uv sync --no-group dev
uv sync
For a more detailed installation guide, please refer to INSTALL.md.
See DATA.md for details on how to download and preprocess the datasets. We provide processing scripts and instructions for every dataset we use. We also release a synthetic dataset, available here, together with its generation scripts and the raw data.
See TRAIN.md for training commands.
Interactive demo:
mkdir -p ./ckpts/unidisc_interleaved
huggingface-cli download aswerdlow/unidisc_interleaved --local-dir ./ckpts/unidisc_interleaved
uv run demo/server.py experiments='[large_scale_train,large_scale_train_high_res_interleaved,eval_unified,large_scale_high_res_interleaved_inference]' trainer.load_from_state_dict="./ckpts/unidisc_interleaved/unidisc_interleaved.pt"
uv run demo/client.py
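The server command loads the checkpoint from the path passed to trainer.load_from_state_dict, so it can help to confirm the download finished before launching it. The following is a minimal sketch, not part of the repository: check_ckpt is a hypothetical helper name, and it only tests that the file exists and is non-empty, not that its contents are valid.

```shell
# Hypothetical helper (an assumption, not provided by the UniDisc repo):
# fail early if the checkpoint file is missing or empty.
check_ckpt() {
    ckpt_path="$1"
    # -s is true only when the file exists and has a size greater than zero.
    if [ ! -s "$ckpt_path" ]; then
        echo "Checkpoint not found or empty: $ckpt_path" >&2
        return 1
    fi
    echo "Found checkpoint: $ckpt_path"
}
```

For example, run check_ckpt ./ckpts/unidisc_interleaved/unidisc_interleaved.pt after the huggingface-cli download step and start demo/server.py only if it succeeds.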
See TRAINING.md for details.
See EVAL.md for details.
To cite our work, please use the following:
@article{swerdlow2025unidisc,
  title   = {Unified Multimodal Discrete Diffusion},
  author  = {Swerdlow, Alexander and Prabhudesai, Mihir and Gandhi, Siddharth and Pathak, Deepak and Fragkiadaki, Katerina},
  journal = {arXiv preprint arXiv:2503.20853},
  year    = {2025},
  doi     = {10.48550/arXiv.2503.20853},
}
This repository is built on top of the following repositories: