
SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows

Qinyu Zhao¹,² · Guangting Zheng² · Tao Yang² · Rui Zhu²† · Xingjian Leng¹ · Stephen Gould¹ · Liang Zheng¹

¹ Australian National University   ² ByteDance Seed   † Project Lead

🌐 Project Page · 📃 Paper

Overview

We train a normalizing flow (NF) and a VAE end-to-end from scratch. There is no stop-gradient operator, which significantly simplifies prior frameworks. In the architecture figure, gray modules with a snowflake icon are frozen during training, while colored modules are trained; solid arrows indicate the forward pass, and dashed arrows denote gradient flow.

On ImageNet 256x256, our end-to-end training framework achieves significantly better generation quality than the previous state-of-the-art NF model (STARFlow), with far fewer training epochs.
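The end-to-end idea above can be illustrated with a toy 1-D analogue (a hedged sketch, not the SimFlow model): a linear "encoder" feeds a scalar affine "flow", and the flow negative log-likelihood plus a reconstruction term are minimized jointly, so gradients reach the encoder through both terms with no stop-gradient anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data standing in for images: x ~ N(2, 0.5^2).
x = rng.normal(2.0, 0.5, size=1024)

# Illustrative trainable parameters:
w = 0.1          # "encoder":  z = w * x
v = 0.1          # "decoder":  x_hat = v * z
a, b = 1.0, 0.0  # affine "flow": u = a * z + b, with log|det| = log|a|

def loss(w, v, a, b):
    z = w * x
    u = a * z + b
    nll = 0.5 * u**2 - np.log(np.abs(a))  # flow NLL under N(0, 1)
    rec = (x - v * z) ** 2                # reconstruction term
    return np.mean(nll + rec)

loss0 = loss(w, v, a, b)
lr = 1e-2
for _ in range(500):
    z = w * x
    u = a * z + b
    r = x - v * z
    # Analytic gradients of the joint loss. The encoder weight w receives
    # gradient from BOTH the flow NLL and the reconstruction term --
    # this is the "no stop-gradient" point of the end-to-end setup.
    gw = np.mean(u * a * x - 2 * r * v * x)
    gv = np.mean(-2 * r * z)
    ga = np.mean(u * z) - 1.0 / a
    gb = np.mean(u)
    w -= lr * gw
    v -= lr * gv
    a -= lr * ga
    b -= lr * gb

final = loss(w, v, a, b)
```

Joint training drives both terms down together; in the real model the encoder is a convolutional VAE and the flow a deep latent NF, but the gradient path is analogous.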

News and Updates

[2025-12-15] Initial release of the codebase.

Getting Started

Environment Setup

To set up our environment, please run:

git clone https://github.com/ByteDance-Seed/SimFlow.git
cd SimFlow
# If you use conda, please uncomment the following lines.
# conda create -n simflow python=3.11.2 -y
# conda activate simflow
pip install -r requirements.txt

Train SimFlow

Please download and extract the training split of the ImageNet-1K dataset.

Sample commands for training SimFlow+REPA-E (two nodes, eight GPUs per node) are shown below.

torchrun --nproc_per_node=8 --nnodes=2 --node_rank=${NODE_RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} \
  train_vae_w_nf.py \
  --seed=0 \
  --data_path="./imagenet" \
  --output_dir="./output/vae_f16d64_std0_5_simflow_adaln_2222246_repaAlign3/" \
  --resume="./output/vae_f16d64_std0_5_simflow_adaln_2222246_repaAlign3/" \
  --batch_size=16 \
  --checkpointing-steps=100000 --sampling-steps 5000 \
  --loss-cfg-path="configs/vae_loss/l1_lpips_kl_gan_joint_training.yaml" \
  --vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
  --channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
  --lr_schedule='const_then_cosine' --warmup_epochs=0 --hold_epochs=80 --min_lr=1e-6 --epochs=160 --max-train-steps=800000 \
  --enc_type="dinov2-vit-b" --repa_align_depth='-1,-1,1,-1,-1,-1' \
  --disturb_latents='none' \
  --online_eval --eval_steps=100000 --cfg=0.0
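For reference, `--blocks=6` and `--layers_per_block=2,2,2,2,2,46` together describe a six-block model with a deep final block. The parsing below is a hypothetical sketch of how such a flag is typically read, not the repository's actual argument handling:

```python
def parse_layers_per_block(flag: str, blocks: int) -> list[int]:
    """Split the comma-separated depth list and check it against --blocks."""
    depths = [int(d) for d in flag.split(",")]
    if len(depths) != blocks:
        raise ValueError("expected one depth per block")
    return depths

depths = parse_layers_per_block("2,2,2,2,2,46", blocks=6)
total = sum(depths)  # 56 layers in total, most of them in the last block
```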

Evaluation

ImageNet 256x256 | FID = 2.15

torchrun --nproc_per_node=8 eval_vae_w_nf.py \
  --seed=0 --output_dir="output/simflow_imagenet256x256" \
  --resume="output/simflow_imagenet256x256" \
  --vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
  --channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
  --evaluate --cfg=1.1 --temperature=0.95 --denoising_lr=0.25

ImageNet 256x256 with REPA-E | FID = 1.91

torchrun --nproc_per_node=8 eval_vae_w_nf.py \
  --seed=0 --output_dir="output/simflow_imagenet256x256_repae" \
  --resume="output/simflow_imagenet256x256_repae" \
  --vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
  --channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
  --evaluate --cfg=1.1 --temperature=0.975 --denoising_lr=0.25

ImageNet 512x512 with REPA-E | FID = 2.74

torchrun --nproc_per_node=8 eval_vae_w_nf.py \
  --seed=0 --output_dir="output/simflow_imagenet512x512_repae" \
  --resume="output/simflow_imagenet512x512_repae" \
  --resolution=512 \
  --vae="vae_f16d64" --use_variational=False --fixed_std=0.5 \
  --channels=1152 --blocks=6 --layers_per_block=2,2,2,2,2,46 --num_heads=16 \
  --evaluate --cfg=1.0 --temperature=1.0 --eval_bsz=64
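The `--cfg` and `--temperature` flags in the evaluation commands follow common sampling conventions. The snippet below sketches the usual classifier-free guidance combination and temperature-scaled base noise as generic formulas (an assumption about how these flags behave, not verified against this codebase):

```python
import numpy as np

rng = np.random.default_rng(0)

def guided(cond: np.ndarray, uncond: np.ndarray, cfg: float) -> np.ndarray:
    # Classifier-free guidance: extrapolate from the unconditional
    # prediction toward the conditional one by a factor of cfg.
    return uncond + cfg * (cond - uncond)

cond, uncond = rng.normal(size=4), rng.normal(size=4)
assert np.allclose(guided(cond, uncond, 1.0), cond)  # cfg=1.0: purely conditional

# A temperature below 1 shrinks the base noise the flow inverts,
# typically trading sample diversity for fidelity.
temperature = 0.95
z = temperature * rng.normal(size=(4, 64))
```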

Pretrained Models

We also provide pretrained models, which can be downloaded from HuggingFace.

Acknowledgement

This codebase builds upon several excellent open-source projects. We sincerely thank the authors for making their work and models publicly available.

Citation

If you find our work useful, please consider citing:

@article{zhao2025simflow,
  title={SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows},
  author={Zhao, Qinyu and Zheng, Guangting and Yang, Tao and Zhu, Rui and Leng, Xingjian and Gould, Stephen and Zheng, Liang},
  journal={arXiv preprint arXiv:2512.04084},
  year={2025}
}
