feat: Add Chroma1-HD model support #319
Open

Dfunk55 wants to merge 8 commits into filipstrand:main from Dfunk55:feat/add-chroma-support
Conversation
Add support for the Chroma1-HD model (lodestones/Chroma1-HD), a modified FLUX.1-schnell with a DistilledGuidanceLayer for efficient inference.

Key features:
- DistilledGuidanceLayer: pre-computes 344 modulations upfront
- T5-only text encoding (no CLIP required)
- Support for negative prompts
- 4-bit and 8-bit quantization
- Save/load quantized models with mflux-save

New CLI command: mflux-generate-chroma

Usage:
  mflux-generate-chroma --prompt "a cat" --steps 40 --output cat.png
  mflux-generate-chroma -q 4 --prompt "a dog" --output dog.png

Note: LoRA support not yet implemented for Chroma.
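The pre-computation idea behind the DistilledGuidanceLayer can be sketched in plain NumPy. This is an illustrative stand-in, not mflux's implementation: the `precompute_modulations` name, the sinusoidal embedding, and the random projection weights are all assumptions; only the 344-modulation count comes from the commit. The point is that the (timestep, guidance) conditioning is projected once, up front, so the transformer blocks just index into a table instead of recomputing modulations per block.

```python
import numpy as np

def precompute_modulations(timestep: float, guidance: float,
                           hidden: int = 64, n_mods: int = 344,
                           seed: int = 0) -> np.ndarray:
    """Illustrative stand-in for a distilled guidance layer: embed the
    (timestep, guidance) pair once, then project it into one modulation
    value per site, so blocks index into the table during inference."""
    rng = np.random.default_rng(seed)
    # Sinusoidal embedding of the two conditioning scalars.
    freqs = np.exp(-np.arange(hidden // 4) / (hidden // 4))
    emb = np.concatenate([
        np.sin(timestep * freqs), np.cos(timestep * freqs),
        np.sin(guidance * freqs), np.cos(guidance * freqs),
    ])
    # Random weights stand in for the distilled MLP; computed once.
    w = rng.standard_normal((n_mods, emb.size)) / np.sqrt(emb.size)
    return w @ emb  # shape: (n_mods,)

mods = precompute_modulations(timestep=0.5, guidance=4.0)
assert mods.shape == (344,)
```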
- Create ChromaLoRAMapping with targets for joint and single transformer blocks
- Support BFL/Kohya format LoRA weights with QKV split transforms
- Exclude norm layers (norm1.linear, norm1_context.linear, norm.linear) that don't exist in Chroma's DistilledGuidanceLayer architecture
- Add lora_paths and lora_scales parameters to Chroma class
- Enable --lora-paths and --lora-scales CLI arguments
- Add 16 unit tests for mapping coverage and exclusions

Tested with semiosphere/the_artist_for_chromaHD (684/684 keys matched)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
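The QKV split transform can be illustrated with a minimal NumPy sketch. The `split_qkv_lora_b` helper and its shapes are hypothetical; only the fused-QKV convention of BFL/Kohya-format LoRA weights is from the commit. Since low-rank updates factor as B @ A and the A matrix is shared across the fused projection, only the B matrix needs to be split row-wise into per-projection pieces.

```python
import numpy as np

def split_qkv_lora_b(lora_b: np.ndarray, proj_dim: int) -> dict:
    """Split a fused attention-QKV LoRA 'B' matrix (rows stacked as
    Q | K | V) into separate q/k/v matrices for mapping onto models
    with per-projection to_q/to_k/to_v layers. The shared LoRA 'A'
    matrix needs no splitting."""
    assert lora_b.shape[0] == 3 * proj_dim
    q, k, v = np.split(lora_b, 3, axis=0)
    return {"to_q": q, "to_k": k, "to_v": v}

rank, dim = 16, 3072  # example rank and projection width
fused_b = np.arange(3 * dim * rank, dtype=np.float32).reshape(3 * dim, rank)
parts = split_qkv_lora_b(fused_b, proj_dim=dim)
assert parts["to_q"].shape == (dim, rank)
# Stacking the pieces back reproduces the fused matrix exactly.
assert np.array_equal(
    np.vstack([parts["to_q"], parts["to_k"], parts["to_v"]]), fused_b)
```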
Add support for Meituan's LongCat-Image model (meituan-longcat/LongCat-Image):

- Implement LongCat transformer architecture with 24 joint blocks and 12 single blocks using hidden_size=3072 and num_attention_heads=24
- Add Qwen-based text encoder integration via qwen2_vl tokenizer
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-longcat
- Add comprehensive tests for transformer, weight loading, LoRA, and initializer validation

Model specifications:
- Uses Flow Matching scheduler (no sigma shift)
- 16-channel VAE
- Supports guidance with distilled guidance embedding
- 512 max sequence length

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add support for Black Forest Labs' FLUX.2-schnell model:

- Implement FLUX.2 transformer with 38 double blocks and 58 single blocks
- Add 32-channel VAE with modified scaling factors
- Integrate Mistral3-based text encoder with sliding window attention and 32K max position embeddings
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-flux2
- Add comprehensive tests for VAE, encoder, weight mapping, quantization, and LoRA

Model specifications:
- Uses rectified flow matching scheduler (no sigma shift)
- 32-channel latent space (vs 16 in FLUX.1)
- Mistral3 encoder (vs CLIP + T5 in FLUX.1)
- 256 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
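The rectified flow matching forward process mentioned above is a straight-line interpolation between data and noise; "no sigma shift" means the timestep is used as-is rather than remapped by a resolution-dependent shift. A minimal sketch (the function name is an assumption; the formula is the standard rectified-flow interpolant, not code from this PR):

```python
import numpy as np

def rectified_flow_interpolate(x0: np.ndarray, noise: np.ndarray,
                               t: float) -> np.ndarray:
    """Rectified-flow forward process: x_t = (1 - t) * x0 + t * noise.
    The training target velocity is simply (noise - x0), constant
    along each straight path."""
    return (1.0 - t) * x0 + t * noise

x0 = np.ones((4, 4))      # stand-in for a clean latent
noise = np.zeros((4, 4))  # stand-in for Gaussian noise
xt = rectified_flow_interpolate(x0, noise, t=0.25)
assert np.allclose(xt, 0.75)  # a quarter of the way toward the noise
```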
Add support for Tencent's Hunyuan-DiT v1.2 model:

- Implement Hunyuan-DiT transformer architecture with 28 DiT blocks using hidden_size=1408 and num_attention_heads=16
- Add dual text encoder system (Chinese BERT + T5-XXL) via HunyuanPromptEncoder
- Implement DDPM scheduler for diffusion process
- Add num_dit_blocks() method to LoadedWeights for counting Hunyuan-style transformer blocks
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-hunyuan
- Add comprehensive tests for DiT blocks, DDPM scheduler, text encoding, weight loading, and LoRA

Model specifications:
- Uses DDPM scheduler (1000 training steps)
- Supports CFG with Chinese/English prompts
- 256 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
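For reference, a textbook linear-beta DDPM schedule looks like the following. The 1000-step count matches the commit; the beta endpoints are the common DDPM defaults and may well differ from Hunyuan-DiT's actual configuration, so treat this as a sketch of the scheduler family, not this PR's code.

```python
import numpy as np

def ddpm_schedule(num_steps: int = 1000,
                  beta_start: float = 1e-4, beta_end: float = 0.02):
    """Linear-beta DDPM noise schedule. alpha_bars[t] is the cumulative
    product used to noise a sample: x_t = sqrt(ab) * x0 + sqrt(1-ab) * eps."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alpha_bars

betas, alpha_bars = ddpm_schedule()
assert betas.shape == (1000,)
assert alpha_bars[-1] < alpha_bars[0]  # signal fraction decays over time
```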
Add support for NewBie-AI's NewBie-image model (NewBie-AI/NewBie-image-Exp0.1):

- Implement NextDiT transformer architecture with 36 blocks using hidden_size=2560 and Grouped Query Attention (24 query heads, 8 KV heads)
- Add dual text encoder system:
  - Gemma3-4B-it for semantic understanding (2560 dim)
  - Jina CLIP v2 for image-text alignment (1024 dim)
- Create weight mapping for HuggingFace model conversion
- Add LoRA support for fine-tuning
- Include CLI tool: mflux-generate-newbie
- Add comprehensive tests for configuration, generation, and LoRA

Model specifications:
- 3.5B parameter model optimized for anime/illustration generation
- Uses Flow Matching scheduler (no sigma shift)
- 16-channel VAE (FLUX.1-dev compatible)
- 512 max sequence length
- Supports 4/8-bit quantization

Co-Authored-By: Claude Opus 4.5 <[email protected]>
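The Grouped Query Attention layout can be sketched as follows. Only the 24-query/8-KV head split is from the commit; the helper name, sequence length, and head dimension are assumptions. Each KV head serves a group of 3 query heads, shrinking KV projections and cache 3x, and expanding KV by repetition makes the attention math identical to standard multi-head attention.

```python
import numpy as np

def gqa_expand_kv(kv: np.ndarray, n_q_heads: int) -> np.ndarray:
    """Repeat each KV head so every query head in a group attends to
    its shared KV head: (n_kv_heads, seq, head_dim) -> (n_q_heads, ...)."""
    n_kv_heads = kv.shape[0]
    assert n_q_heads % n_kv_heads == 0  # query heads split evenly
    group = n_q_heads // n_kv_heads     # 24 // 8 == 3 here
    return np.repeat(kv, group, axis=0)

k = np.zeros((8, 512, 128))  # 8 KV heads, seq len 512, head dim 128
k_expanded = gqa_expand_kv(k, n_q_heads=24)
assert k_expanded.shape == (24, 512, 128)
```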
- Fix save.py: import Hunyuan (main model class) instead of HunyuanDiT (transformer class), which was causing a TypeError
- Fix model_config.py: use FLUX.2-dev (which exists) instead of FLUX.2-schnell (which doesn't exist on HuggingFace)
- Update FLUX.2 aliases and enable guidance support

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The NewBie-image HuggingFace repo only contains text_encoder (Gemma3), not text_encoder_2 (Jina CLIP). The Jina CLIP projection layers exist in the transformer weights, but the encoder itself is loaded separately from jinaai/jina-clip-v2 if needed.

Changes:
- Remove jina_clip_encoder from weight definition components
- Remove jina_clip from tokenizer definitions
- Update download patterns to exclude text_encoder_2
- Make jina_clip_encoder optional in initializer (set to None)
- Skip jina_clip_encoder in weight application if None

This fixes FileNotFoundError when loading NewBie-AI/NewBie-image-Exp0.1.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add support for the Chroma1-HD model (lodestones/Chroma1-HD), a modified FLUX.1-schnell with a DistilledGuidanceLayer for efficient inference.

Key features:
- DistilledGuidanceLayer: pre-computes 344 modulations upfront
- T5-only text encoding (no CLIP required)
- Support for negative prompts
- 4-bit and 8-bit quantization
- Save/load quantized models with mflux-save

New CLI command: mflux-generate-chroma

Usage:
  mflux-generate-chroma --prompt "a cat" --steps 40 --output cat.png
  mflux-generate-chroma -q 4 --prompt "a dog" --output dog.png

Note: LoRA support not yet implemented for Chroma.