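Keming Wu<sup>1*</sup>,
Junwen Chen<sup>3*</sup>,
Zhanhao Liang<sup>2*</sup>,
Yinuo Wang<sup>1*</sup>,
Ji Li<sup>5</sup>,
Chao Zhang<sup>4</sup>,
Bin Wang<sup>1</sup>,
Yuhui Yuan<sup>6*</sup>

<sup>1</sup>Tsinghua University,
<sup>2</sup>The Australian National University,
<sup>3</sup>The University of Electro-Communications Tokyo,
<sup>4</sup>Peking University,
<sup>5</sup>Microsoft,
<sup>6</sup>Canva

<sup>*</sup>Work done at Microsoft Research Asia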
- [2025/7/20] Repository is initialized.
- [2025/6/26] 🎉🎉🎉 HybridLayout is accepted by ICCV 2025! 🎉🎉🎉
- Release inference code and pretrained model
- Release training code
conda create -n hybrid_layout python=3.10 -y
conda activate hybrid_layout
git clone https://github.com/KemingWu/HybridLayout.git
cd HybridLayout
pip install uv
uv pip install --pre -U xformers
uv pip install diffusers==0.31.0 transformers==4.44.0
uv pip install mmengine
huggingface-cli login

To try the model, open and run our inference.ipynb notebook.
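Before opening the notebook, it can help to confirm that the environment matches the pins in the install commands. The following is a minimal sketch, not part of the repository: the `EXPECTED` table and the `check_environment` helper are hypothetical names introduced here, and the check relies only on Python's standard `importlib.metadata`.

```python
from importlib import metadata

# Pins copied from the install commands; None means "any installed version".
# EXPECTED and check_environment are illustrative names, not part of HybridLayout.
EXPECTED = {
    "diffusers": "0.31.0",
    "transformers": "4.44.0",
    "xformers": None,
    "mmengine": None,
}


def check_environment(expected=EXPECTED, get_version=metadata.version):
    """Return {package: status} for every package listed in `expected`."""
    report = {}
    for name, pinned in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if pinned is not None and found != pinned:
            report[name] = f"mismatch (found {found}, expected {pinned})"
        else:
            report[name] = f"ok ({found})"
    return report


if __name__ == "__main__":
    # Print one status line per package for the active environment.
    for name, status in check_environment().items():
        print(f"{name:>12}: {status}")
```

Run it inside the activated `hybrid_layout` environment. A `missing` or `mismatch` entry suggests re-running the corresponding `uv pip install` command.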
If you have any questions, please feel free to contact Keming Wu and Yuhui Yuan.
If you find this code useful in your research, please consider citing:
@inproceedings{wu2025hybrid,
title={Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics},
author={Wu, Keming and Chen, Junwen and Liang, Zhanhao and Wang, Yinuo and Li, Ji and Zhang, Chao and Wang, Bin and Yuan, Yuhui},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={17930--17940},
year={2025}
}









