v4.3.5 - boogu-image, prompt2effect, ideogram4 bugfixes, qwen_image compile improvement

@bghira released this 18 Jun 23:37
200af5c

Features

  • Boogu Image v0.1 - a 10B transformer model that trains very quickly (1 it/sec), especially on H100 with FA3 and torch regional compile
    • Currently supports basic training only; CREPA, LayerSync, TwinFlow, etc. are not yet supported
    • Comes with high-throughput training when torch regional compile is enabled
  • Ideogram4 - validated Apple Silicon training support
  • SDNQ quant levels enabled for training
  • Qwen Image - removed use of complex tensors for improved torch compile support, especially on H100 with FA3
  • Prompt2Effect - an implementation of the paper from Snapchat that likely explains how they make video I2V effect LoRAs so readily
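
To illustrate the Qwen Image change above: rotary position embeddings are often written as a complex multiplication, but the same rotation can be expressed with real cos/sin arithmetic only, which tends to be friendlier to torch compile. The sketch below is not SimpleTuner's code; it is a pure-Python stand-in (no torch) and the function names `rotate_complex`, `rotate_real`, and `rotary_angles` are hypothetical, chosen only to show that the two formulations agree.

```python
import cmath
import math


def rotary_angles(position, num_pairs, base=10000.0):
    """Common RoPE angle schedule: theta_k = position * base^(-k / num_pairs)."""
    return [position * base ** (-k / num_pairs) for k in range(num_pairs)]


def rotate_complex(pairs, angles):
    """Rotate each (a, b) pair by treating it as a + ib and multiplying by exp(i*theta)."""
    out = []
    for (a, b), theta in zip(pairs, angles):
        z = complex(a, b) * cmath.exp(1j * theta)
        out.append((z.real, z.imag))
    return out


def rotate_real(pairs, angles):
    """Same rotation using only real arithmetic (no complex dtype involved)."""
    out = []
    for (a, b), theta in zip(pairs, angles):
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        out.append((a * cos_t - b * sin_t, a * sin_t + b * cos_t))
    return out


# Illustrative feature pairs for a single token at position 7.
pairs = [(1.0, 2.0), (-0.5, 0.25), (3.0, -1.0)]
angles = rotary_angles(position=7, num_pairs=len(pairs))

complex_result = rotate_complex(pairs, angles)
real_result = rotate_real(pairs, angles)
```

Both paths produce the same rotated pairs up to floating-point error, so swapping one for the other should not change model outputs; only the tensor dtypes the compiler has to handle differ.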

Bugfixes

  • Single-file export for SD/SDXL no longer crashes with Transformers v5.6 or greater
  • Diffusers' double-shift in the Euler flow matching scheduler is fixed (mostly impacted "unpopular" models)
  • WebUI training launch failure resolved (#2772)
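
For context on the double-shift fix above, here is a minimal sketch of why applying the flow-matching time shift twice matters. It assumes the static shift formula sigma' = s * sigma / (1 + (s - 1) * sigma) used by Euler flow matching schedulers; the helper names and the shift value of 3.0 are hypothetical, and the exact code path that triggered the bug in SimpleTuner is not shown here.

```python
import math


def shift_sigma(sigma, shift):
    """Static flow-matching time shift: s * sigma / (1 + (s - 1) * sigma)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def linear_sigmas(num_steps):
    """Unshifted sigmas stepping down from 1.0 (hypothetical helper for illustration)."""
    return [1.0 - i / num_steps for i in range(num_steps)]


shift = 3.0  # illustrative value, not taken from any model config
base = linear_sigmas(4)

once = [shift_sigma(s, shift) for s in base]          # intended schedule
twice = [shift_sigma(s, shift) for s in once]         # schedule if the shift is applied again
squared = [shift_sigma(s, shift ** 2) for s in base]  # a single shift by shift**2
```

Composing the shift with itself is algebraically the same as a single shift by `shift ** 2`, so a double-shifted schedule concentrates far more steps at high noise than intended. Models configured with a shift of 1.0 would be unaffected, since shifting by 1.0 is the identity.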

What's Changed

  • ideogram fp8 fix for flowmap tensor on meta device by @bghira in #2757
  • Fix single-file SD/SDXL export with flattened CLIPTextModel (transformers >=5.6) by @ArthurZucker in #2759
  • diffusers: fix sigma bounds after initialising scheduler to prevent double-shift by @bghira in #2760
  • prompt2effect: train a hypernetwork from a collection of effect LoRAs by @bghira in #2761
  • merge by @bghira in #2762
  • boogu-image v0.1 by @bghira in #2763
  • ideogram4: support MacOS training by @bghira in #2764
  • ideogram4: document apple-specific quant via int8-sdnq by @bghira in #2766
  • sdnq: enable int8 training by @bghira in #2765
  • boogu-image: remove broken fp8 paths and use torchao on-the-fly quant instead by @bghira in #2767
  • Avoid complex Boogu rotary ops by @bghira in #2769
  • qwen_image: improved torch compile performance by @bghira in #2770
  • qwen_image: remove deprecation notice by @bghira in #2771
  • Fix training launch sanitization import and add regression coverage by @bghira with @Copilot in #2772
  • merge by @bghira in #2773

New Contributors

Full Changelog: v4.3.4...v4.3.5