(ICCV 2025) FlowTok: Flowing Seamlessly Across Text and Image Tokens

This repository provides a PyTorch re-implementation of FlowTok for the text-to-image generation task. Compared to the original paper, this implementation extends the generation capability to 512×512 resolution.

FlowTok: Flowing Seamlessly Across Text and Image Tokens

ICCV 2025

Ju He | Qihang Yu | Qihao Liu | Liang-Chieh Chen

[project page] | [paper] | [arxiv]


Setup

  • Environment

    The code has been tested with PyTorch 2.1.2 and CUDA 12.1.

    An example of installation commands is provided as follows:

    git clone git@github.com:tacju/FlowTok.git
    cd FlowTok
    
    pip3 install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
    pip3 install -U --pre triton
    pip3 install -r requirements.txt
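
    After installation, it can help to confirm that the pinned versions were actually picked up. The snippet below is a minimal sketch and not part of this repository: the helper names (`installed_version`, `base_version`, `check_environment`) are hypothetical, and the expected versions are simply copied from the install command above. It reads package metadata only, so it does not verify that a GPU is visible.

```python
from importlib import metadata

# Versions pinned by the install command above (copied from it, not from the repo).
EXPECTED = {"torch": "2.1.2", "torchvision": "0.16.2", "torchaudio": "2.1.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def base_version(version):
    """Strip a local build tag such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Map each package name to 'ok', 'missing', or a mismatch message."""
    report = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found is None:
            report[package] = "missing"
        elif base_version(found) == wanted:
            report[package] = "ok"
        else:
            report[package] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    # Print one status line per pinned package.
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

    The `lookup` argument exists only so the check can be exercised without the packages installed; in normal use, calling `check_environment()` with no arguments is enough.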
    

Training FlowTok for T2I

We provide a training script for text-to-image (T2I) generation in train_flowtok.sh.


Terms of use

The project is created for research purposes.


Acknowledgements

This codebase is built upon the following repository:

Much appreciation for their outstanding efforts.


BibTeX

If you use our work in your research, please cite it using the following BibTeX entries.

@article{he2025flowtok,
  author    = {Ju He and Qihang Yu and Qihao Liu and Liang-Chieh Chen},
  title     = {FlowTok: Flowing Seamlessly Across Text and Image Tokens},
  journal   = {ICCV},
  year      = {2025}
}
@article{liu2025crossflow,
  author    = {Qihao Liu and Xi Yin and Alan Yuille and Andrew Brown and Mannat Singh},
  title     = {Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution},
  journal   = {CVPR},
  year      = {2025}
}
@article{kim2025democratizing,
  author    = {Dongwon Kim and Ju He and Qihang Yu and Chenglin Yang and Xiaohui Shen and Suha Kwak and Liang-Chieh Chen},
  title     = {Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens},
  journal   = {ICCV},
  year      = {2025}
}
@article{yu2024an,
  author    = {Qihang Yu and Mark Weber and Xueqing Deng and Xiaohui Shen and Daniel Cremers and Liang-Chieh Chen},
  title     = {An Image is Worth 32 Tokens for Reconstruction and Generation},
  journal   = {NeurIPS},
  year      = {2024}
}
