Dfunk55 commented on Jan 12, 2026

Add support for the Chroma1-HD model (lodestones/Chroma1-HD), a modified FLUX.1-schnell with DistilledGuidanceLayer for efficient inference.

Key features:

  • DistilledGuidanceLayer: Pre-computes 344 modulations upfront
  • T5-only text encoding (no CLIP required)
  • Support for negative prompts
  • 4-bit and 8-bit quantization
  • Save/load quantized models with mflux-save

New CLI command: mflux-generate-chroma

Usage:
  mflux-generate-chroma --prompt "a cat" --steps 40 --output cat.png
  mflux-generate-chroma -q 4 --prompt "a dog" --output dog.png

Note: LoRA support not yet implemented for Chroma.
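For library use rather than the CLI, here is a minimal Python sketch of the same workflow. The Chroma class is mentioned in the commits below, but the import path, constructor, and method signatures shown here mirror the existing mflux Flux1 API and are assumptions, not the final interface:

  from mflux.chroma.chroma import Chroma   # import path is an assumption
  from mflux.config.config import Config   # mflux runtime config

  chroma = Chroma(quantize=4)              # 4-bit quantization, as in the -q 4 CLI example
  image = chroma.generate_image(
      seed=42,
      prompt="a cat",
      negative_prompt="blurry, low quality",   # negative prompts are listed as supported
      config=Config(num_inference_steps=40),
  )
  image.save(path="cat.png")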

Dfunk55 and others added 8 commits on January 12, 2026 at 18:20
Add support for the Chroma1-HD model (lodestones/Chroma1-HD), a modified
FLUX.1-schnell with DistilledGuidanceLayer for efficient inference.

Key features:
- DistilledGuidanceLayer: Pre-computes 344 modulations upfront
- T5-only text encoding (no CLIP required)
- Support for negative prompts
- 4-bit and 8-bit quantization
- Save/load quantized models with mflux-save

New CLI command: mflux-generate-chroma

Usage:
  mflux-generate-chroma --prompt "a cat" --steps 40 --output cat.png
  mflux-generate-chroma -q 4 --prompt "a dog" --output dog.png

Note: LoRA support not yet implemented for Chroma.
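For the save/load path mentioned in this commit, an illustrative two-step workflow; the --model value and local path layout are assumptions, and mflux-save's exact flags for Chroma may differ:

  # Quantize once and write the weights to disk (flags are illustrative)
  mflux-save --model chroma --path ~/models/chroma-4bit -q 4

  # Generate from the saved quantized weights
  mflux-generate-chroma --path ~/models/chroma-4bit --prompt "a cat" --steps 40 --output cat.png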
- Create ChromaLoRAMapping with targets for joint and single transformer blocks
- Support BFL/Kohya format LoRA weights with QKV split transforms
- Exclude norm layers (norm1.linear, norm1_context.linear, norm.linear)
  that don't exist in Chroma's DistilledGuidanceLayer architecture
- Add lora_paths and lora_scales parameters to Chroma class
- Enable --lora-paths and --lora-scales CLI arguments
- Add 16 unit tests for mapping coverage and exclusions

Tested with semiosphere/the_artist_for_chromaHD (684/684 keys matched)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
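A hedged usage sketch for the --lora-paths / --lora-scales flags introduced in this commit, using the LoRA that was tested above; the weight filename is illustrative:

  mflux-generate-chroma \
    --prompt "a portrait in the artist's style" \
    --steps 40 \
    --lora-paths the_artist_for_chromaHD.safetensors \
    --lora-scales 1.0 \
    --output portrait.png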
Add support for Meituan's LongCat-Image model (meituan-longcat/LongCat-Image):

- Implement LongCat transformer architecture with 24 joint blocks and
  12 single blocks using hidden_size=3072 and num_attention_heads=24
- Add Qwen-based text encoder integration via qwen2_vl tokenizer
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-longcat
- Add comprehensive tests for transformer, weight loading, LoRA,
  and initializer validation

Model specifications:
- Uses Flow Matching scheduler (no sigma shift)
- 16-channel VAE
- Supports guidance with distilled guidance embedding
- 512 max sequence length

Co-Authored-By: Claude Opus 4.5 <[email protected]>
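An illustrative invocation of the new command; only the command name comes from this commit, and the flags are assumed to match the other mflux generators:

  mflux-generate-longcat --prompt "a bowl of noodles, studio lighting" --steps 30 --output noodles.png
  mflux-generate-longcat -q 8 --prompt "a city street at night, neon signs" --output street.png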
Add support for Black Forest Labs' FLUX.2-schnell model:

- Implement FLUX.2 transformer with 38 double blocks and 58 single blocks
- Add 32-channel VAE with modified scaling factors
- Integrate Mistral3-based text encoder with sliding window attention
  and 32K max position embeddings
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-flux2
- Add comprehensive tests for VAE, encoder, weight mapping,
  quantization, and LoRA

Model specifications:
- Uses rectified flow matching scheduler (no sigma shift)
- 32-channel latent space (vs 16 in FLUX.1)
- Mistral3 encoder (vs CLIP + T5 in FLUX.1)
- 256 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
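Illustrative usage of the FLUX.2 command; the command name is from this commit, and the quantization flag mirrors the Chroma examples (other flags are assumptions):

  mflux-generate-flux2 --prompt "a red fox in fresh snow" --steps 25 --output fox.png
  mflux-generate-flux2 -q 4 --prompt "a lighthouse at dawn" --output lighthouse.png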
Add support for Tencent's Hunyuan-DiT v1.2 model:

- Implement Hunyuan-DiT transformer architecture with 28 DiT blocks
  using hidden_size=1408 and num_attention_heads=16
- Add dual text encoder system (Chinese BERT + T5-XXL) via
  HunyuanPromptEncoder
- Implement DDPM scheduler for diffusion process
- Add num_dit_blocks() method to LoadedWeights for counting
  Hunyuan-style transformer blocks
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-hunyuan
- Add comprehensive tests for DiT blocks, DDPM scheduler,
  text encoding, weight loading, and LoRA

Model specifications:
- Uses DDPM scheduler (1000 training steps)
- Supports CFG with Chinese/English prompts
- 256 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
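Since the model supports CFG with Chinese or English prompts, an illustrative invocation; the --guidance flag is an assumption patterned on the other mflux commands:

  mflux-generate-hunyuan --prompt "一只在竹林里的熊猫，水墨画风格" --steps 50 --output panda.png
  mflux-generate-hunyuan --prompt "a panda in a bamboo forest, ink wash style" --guidance 5.0 --output panda_en.png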
Add support for NewBie-AI's NewBie-image model (NewBie-AI/NewBie-image-Exp0.1):

- Implement NextDiT transformer architecture with 36 blocks using
  hidden_size=2560 and Grouped Query Attention (24 query heads, 8 KV heads)
- Add dual text encoder system:
  - Gemma3-4B-it for semantic understanding (2560 dim)
  - Jina CLIP v2 for image-text alignment (1024 dim)
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-newbie
- Add comprehensive tests for configuration, generation, and LoRA

Model specifications:
- 3.5B parameter model optimized for anime/illustration generation
- Uses Flow Matching scheduler (no sigma shift)
- 16-channel VAE (FLUX.1-dev compatible)
- 512 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
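Illustrative usage of the NewBie command for its anime/illustration focus; only the command name comes from this commit, and the other flags are assumptions:

  mflux-generate-newbie --prompt "an anime girl with silver hair, detailed illustration" --steps 30 --output newbie.png
  mflux-generate-newbie -q 4 --prompt "a watercolor fox spirit in a forest shrine" --output fox_spirit.png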
- Fix save.py: Import Hunyuan (the main model class) instead of HunyuanDiT
  (the transformer class), which was causing a TypeError
- Fix model_config.py: Use FLUX.2-dev (which exists) instead of
  FLUX.2-schnell (which does not exist on HuggingFace)
- Update FLUX.2 aliases and enable guidance support

Co-Authored-By: Claude Opus 4.5 <[email protected]>

The NewBie-image HuggingFace repo only contains text_encoder (Gemma3),
not text_encoder_2 (Jina CLIP). The Jina CLIP projection layers exist
in the transformer weights, but the encoder itself is loaded separately
from jinaai/jina-clip-v2 if needed.

Changes:
- Remove jina_clip_encoder from weight definition components
- Remove jina_clip from tokenizer definitions
- Update download patterns to exclude text_encoder_2
- Make jina_clip_encoder optional in initializer (set to None)
- Skip jina_clip_encoder in weight application if None

This fixes FileNotFoundError when loading NewBie-AI/NewBie-image-Exp0.1.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
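A minimal Python sketch of the None-handling this commit describes; the class, attribute, and helper names are illustrative, not the actual mflux code:

  class NewBie:
      def __init__(self, jina_clip_encoder=None):
          # The HF repo ships no text_encoder_2, so the Jina CLIP encoder may be absent.
          self.jina_clip_encoder = jina_clip_encoder

      def apply_weights(self, weights):
          for name, tensors in weights.items():
              if name == "jina_clip_encoder" and self.jina_clip_encoder is None:
                  continue  # encoder was not loaded: skip its weights instead of failing
              self._load_component(name, tensors)  # hypothetical helper for the other components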