Please implement a ControlNet pipeline and training script for Würstchen.
If Würstchen really needs only a 12x12 latent space instead of the 64x64 used by Stable Diffusion, and that translates into a roughly 28x (4096 / 144) speed-up in training, this would be awesome!
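
For reference, here is a quick sanity check of that ratio. It is only a back-of-the-envelope sketch: it assumes training cost scales linearly with the number of spatial latent positions, which is a simplification (channel counts, model size and attention cost are ignored), and the helper name `latent_positions` is made up for illustration.

```python
# Back-of-the-envelope check of the latent-size ratio quoted above.
# Assumption (not verified): training cost scales linearly with the
# number of spatial latent positions; everything else is ignored.


def latent_positions(side: int) -> int:
    """Number of spatial positions in a square side x side latent."""
    return side * side


sd_positions = latent_positions(64)          # Stable Diffusion latent, 64x64
wuerstchen_positions = latent_positions(12)  # Würstchen latent, 12x12
ratio = sd_positions / wuerstchen_positions

print(f"{sd_positions} / {wuerstchen_positions} = {ratio:.1f}x fewer latent positions")
```

Whether the actual wall-clock speed-up for ControlNet training gets anywhere near that number would have to be measured.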
Less thinking, more tinkering!
crosspost: huggingface/diffusers#5071