Can't use quantized models of schnell, dev, qwen or fibo. Only z-image-turbo works

I'm on a MacBook Pro M1 with macOS 15.
I installed pyenv, then Python 3.11, made it the global default, and created a venv.
Finally I installed `mflux` (0.13.3).
BTW: uv did not work for me; at the very least, the mflux README.md needs more instructions for it.

I downloaded the quantized models from Hugging Face beforehand via their Python module.
I downloaded all of the following models:

- [dhairyashil/FLUX.1-schnell-mflux-v0.6.2-4bit](https://huggingface.co/dhairyashil/FLUX.1-schnell-mflux-v0.6.2-4bit)
- [dhairyashil/FLUX.1-dev-mflux-4bit](https://huggingface.co/dhairyashil/FLUX.1-dev-mflux-4bit)
- [akx/FLUX.1-Kontext-dev-mflux-4bit](https://huggingface.co/akx/FLUX.1-Kontext-dev-mflux-4bit)
- [filipstrand/FLUX.1-Krea-dev-mflux-4bit](https://huggingface.co/filipstrand/FLUX.1-Krea-dev-mflux-4bit)
- [filipstrand/Qwen-Image-mflux-6bit](https://huggingface.co/filipstrand/Qwen-Image-mflux-6bit)
- [filipstrand/Z-Image-Turbo-mflux-4bit](https://huggingface.co/filipstrand/Z-Image-Turbo-mflux-4bit)
- [briaai/Fibo-mlx-4bit](https://huggingface.co/briaai/Fibo-mlx-4bit)
- [briaai/Fibo-mlx-8bit](https://huggingface.co/briaai/Fibo-mlx-8bit)

Then I linked each model into my project directory via `ln` for easier reference, for example `ln -s ~/.cache/huggingface/hub/models--filipstrand--Z-Image-Turbo-mflux-4bit/snapshots/b3a8f31115a11f2f9e2fa0bfbc8d78dcc3e6568b model-links/z-image-turbo-4bit`.
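Since the snapshot hash in the cache path changes with each model revision, the linking step could also be scripted. A minimal sketch using only the standard library, assuming the cache layout visible in the `ln` example (`models--<org>--<name>/snapshots/<revision>`); `link_snapshot` is a hypothetical helper, not part of mflux or huggingface_hub:

```python
from pathlib import Path


def link_snapshot(repo_id: str, link_path: Path, hub_dir: Path) -> Path:
    """Symlink link_path to the most recently modified cached snapshot of repo_id."""
    # The hub cache stores "org/name" as "models--org--name/snapshots/<revision>".
    snap_root = hub_dir / ("models--" + repo_id.replace("/", "--")) / "snapshots"
    snapshots = []
    if snap_root.is_dir():
        snapshots = sorted(
            (p for p in snap_root.iterdir() if p.is_dir()),
            key=lambda p: p.stat().st_mtime,
        )
    if not snapshots:
        raise FileNotFoundError(f"No snapshots found under {snap_root}")
    link_path.parent.mkdir(parents=True, exist_ok=True)
    if link_path.is_symlink():
        link_path.unlink()  # replace a stale link from an older revision
    link_path.symlink_to(snapshots[-1].resolve(), target_is_directory=True)
    return link_path
```

For example, `link_snapshot("filipstrand/Z-Image-Turbo-mflux-4bit", Path("model-links/z-image-turbo-4bit"), Path.home() / ".cache/huggingface/hub")` would recreate the link from the `ln` example, provided the snapshot is already in the cache.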

I tried two ways of generating an image:

### z-image-turbo ✅
```
mflux-generate --model model-links/z-image-turbo-4bit --steps 9  --prompt "Luxury food photograph"
```
That didn't work. I also tried setting `--base-model z-image-turbo`, but got the same error:

```
Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/bin/mflux-generate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/cli/flux_generate.py", line 28, in main
    flux = Flux1(
           ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/variants/txt2img/flux.py", line 38, in __init__
    FluxInitializer.init(
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 37, in init
    weights = FluxInitializer._load_weights(path)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 164, in _load_weights
    return WeightLoader.load(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 57, in load
    weights, q_level, version = WeightLoader._load_component(root_path, component, raw_weights_cache)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 99, in _load_component
    raw_weights = WeightLoader._load_safetensors(component_path, component.loading_mode)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 203, in _load_safetensors
    return WeightLoader._load_mlx_native(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 221, in _load_mlx_native
    raise FileNotFoundError(f"No safetensors files found in {path}")
FileNotFoundError: No safetensors files found in model-links/z-image-turbo-4bit/text_encoder_2
```

Then I tried this syntax:

```
mflux-generate-z-image-turbo --model filipstrand/Z-Image-Turbo-mflux-4bit --steps 9 --prompt "Luxury food photograph"
```

and it worked, yeah!

### flux1.schnell ❌

Then I tried flux.1-schnell with this command:

```
mflux-generate --model model-links/flux.1-schnell-4bit --base-model schnell --steps 2 --seed 2 --prompt "Luxury food photograph"
```

but I got this error:

```
Could not extract SentencePiece model from model-links/flux.1-schnell-4bit/tokenizer_2/spiece.model using sentencepiece library due to
SentencePieceExtractor requires the protobuf library but it was not found in your environment. Check out the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
. Falling back to TikToken extractor.
Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_tokenizers.py", line 161, in convert_to_native_format
    local_kwargs = SentencePieceExtractor(vocab_file).extract(cls.model, **local_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 154, in __init__
    requires_backends(self, "protobuf")
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1844, in requires_backends
    raise ImportError("".join(failed))
ImportError:
SentencePieceExtractor requires the protobuf library but it was not found in your environment. Check out the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 1854, in extract_vocab_merges_from_model
    from tiktoken.load import load_tiktoken_bpe
ModuleNotFoundError: No module named 'tiktoken'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/bin/mflux-generate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/cli/flux_generate.py", line 28, in main
    flux = Flux1(
           ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/variants/txt2img/flux.py", line 38, in __init__
    FluxInitializer.init(
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 38, in init
    FluxInitializer._init_tokenizers(model, path, model_config)
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 171, in _init_tokenizers
    model.tokenizers = TokenizerLoader.load_all(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 50, in load_all
    tokenizer = TokenizerLoader.load(d, model_path)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 30, in load
    raw_tokenizer = TokenizerLoader._load_raw_tokenizer(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 118, in _load_raw_tokenizer
    tokenizer = cls.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1757, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2006, in _from_pretrained
    init_kwargs = cls.convert_to_native_format(**init_kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_tokenizers.py", line 184, in convert_to_native_format
    ).extract_vocab_merges_from_model(vocab_file)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 1856, in extract_vocab_merges_from_model
    raise ValueError(
ValueError: `tiktoken` is required to read a `tiktoken` file. Install it with `pip install tiktoken`.
```

I installed tiktoken and ran the command again, but then I got another error:

```
Could not extract SentencePiece model from model-links/flux.1-schnell-4bit/tokenizer_2/spiece.model using sentencepiece library due to
SentencePieceExtractor requires the protobuf library but it was not found in your environment. Check out the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
. Falling back to TikToken extractor.
Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_tokenizers.py", line 161, in convert_to_native_format
    local_kwargs = SentencePieceExtractor(vocab_file).extract(cls.model, **local_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 154, in __init__
    requires_backends(self, "protobuf")
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1844, in requires_backends
    raise ImportError("".join(failed))
ImportError:
SentencePieceExtractor requires the protobuf library but it was not found in your environment. Check out the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 168, in load_tiktoken_bpe
    token, rank = line.split()
    ^^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/bin/mflux-generate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/cli/flux_generate.py", line 28, in main
    flux = Flux1(
           ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/variants/txt2img/flux.py", line 38, in __init__
    FluxInitializer.init(
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 38, in init
    FluxInitializer._init_tokenizers(model, path, model_config)
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 171, in _init_tokenizers
    model.tokenizers = TokenizerLoader.load_all(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 50, in load_all
    tokenizer = TokenizerLoader.load(d, model_path)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 30, in load
    raw_tokenizer = TokenizerLoader._load_raw_tokenizer(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/tokenizer/tokenizer_loader.py", line 118, in _load_raw_tokenizer
    tokenizer = cls.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1757, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2006, in _from_pretrained
    init_kwargs = cls.convert_to_native_format(**init_kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/tokenization_utils_tokenizers.py", line 184, in convert_to_native_format
    ).extract_vocab_merges_from_model(vocab_file)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 1860, in extract_vocab_merges_from_model
    bpe_ranks = load_tiktoken_bpe(tiktoken_url)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/tiktoken/load.py", line 171, in load_tiktoken_bpe
    raise ValueError(f"Error parsing line {line!r} in {tiktoken_bpe_file}") from e
ValueError: Error parsing line b'\x0e' in model-links/flux.1-schnell-4bit/tokenizer_2/spiece.model
```
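Both tracebacks above start from the same missing-`protobuf` message, so it may help to check which of the optional modules they name are importable inside the venv. A minimal sketch using only the standard library; `missing_modules` and `TOKENIZER_DEPS` are hypothetical names, and the list of modules is an assumption taken from the error text rather than from mflux's documented requirements:

```python
import importlib.util


def missing_modules(names):
    """Return the import names from `names` that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names can differ from pip package names:
# the protobuf package is imported as "google.protobuf".
TOKENIZER_DEPS = ["sentencepiece", "google.protobuf", "tiktoken"]

if __name__ == "__main__":
    print("missing:", missing_modules(TOKENIZER_DEPS) or "none")
```

Run with the venv's interpreter, this lists the modules from the error messages that are still absent.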

I get the same error with flux.1-dev.

### qwen ❌

This is my command:

```
mflux-generate-qwen --model model-links/qwen-6bit --base-model qwen --prompt "Luxury food photograph" --steps 20
```

and here I get this error:

```
Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/bin/mflux-generate-qwen", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/cli/qwen_image_generate.py", line 51, in main
    image = qwen.generate_image(
            ^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/variants/txt2img/qwen_image.py", line 85, in generate_image
    prompt_embeds, prompt_mask, negative_prompt_embeds, negative_prompt_mask = QwenPromptEncoder.encode_prompt(
                                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/model/qwen_text_encoder/qwen_prompt_encoder.py", line 29, in encode_prompt
    prompt_embeds, prompt_mask = qwen_text_encoder(
                                 ^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/model/qwen_text_encoder/qwen_text_encoder.py", line 17, in __call__
    hidden_states = self.encoder(input_ids, attention_mask)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/model/qwen_text_encoder/qwen_encoder.py", line 110, in __call__
    hidden_states = layer(hidden_states, attention_mask_4d, position_embeddings)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/model/qwen_text_encoder/qwen_encoder_layer.py", line 43, in __call__
    hidden_states = self.input_layernorm(hidden_states)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/qwen/model/qwen_text_encoder/qwen_rms_norm.py", line 16, in __call__
    result = self.weight.astype(mx.float32) * hidden_states
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
ValueError: [broadcast_shapes] Shapes (3584) and (1,1024,672) cannot be broadcast.
```
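One arithmetic observation about the two shapes in this error, offered as an unconfirmed guess rather than a diagnosis: 672 equals 3584 × 6 / 32, which is the width a 3584-wide tensor would have if its 6-bit values were still packed into 32-bit words. The sketch below only reproduces that arithmetic and the right-aligned broadcasting rule; `packed_width` and `can_broadcast` are hypothetical helpers, not mflux or MLX functions:

```python
def packed_width(width: int, bits: int, word_bits: int = 32) -> int:
    """Number of word_bits-wide words needed to hold `width` values of `bits` bits each."""
    total = width * bits
    if total % word_bits:
        raise ValueError("width * bits must be a multiple of word_bits")
    return total // word_bits


def can_broadcast(a: tuple, b: tuple) -> bool:
    """Right-aligned broadcasting rule: paired dims must be equal or one of them must be 1."""
    return all(x == y or x == 1 or y == 1 for x, y in zip(reversed(a), reversed(b)))


if __name__ == "__main__":
    print(packed_width(3584, 6))                    # 672
    print(can_broadcast((3584,), (1, 1024, 672)))   # False, as in the error
    print(can_broadcast((3584,), (1, 1024, 3584)))  # True
```

If that reading is right, the layer norm weight has the expected size of 3584 and it is the hidden states that arrive with a packed last dimension.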

### fibo ❌

I used this command:

```
mflux-generate-fibo --model model-links/fibo-4bit --steps 20 --guidance 4.0 --seed 42 --output animal_bakers.png   --prompt "Three cartoon animal chefs in a colorful bakery kitchen"
```

And then it starts to download another model:

```
Downloading model from HuggingFace: briaai/FIBO-vlm...
```

When I use this syntax:

```
mflux-generate --model model-links/fibo-4bit --base-model fibo --steps 20 --guidance 4.0 --seed 42 --output animal_bakers.png   --prompt "Three cartoon animal chefs in a colorful bakery kitchen"
```

Then I get this error:

```
Traceback (most recent call last):
  File "/Users/johndoe/dev/mflux/.venv/bin/mflux-generate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/cli/flux_generate.py", line 28, in main
    flux = Flux1(
           ^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/variants/txt2img/flux.py", line 38, in __init__
    FluxInitializer.init(
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 37, in init
    weights = FluxInitializer._load_weights(path)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/flux/flux_initializer.py", line 164, in _load_weights
    return WeightLoader.load(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 57, in load
    weights, q_level, version = WeightLoader._load_component(root_path, component, raw_weights_cache)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 99, in _load_component
    raw_weights = WeightLoader._load_safetensors(component_path, component.loading_mode)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 203, in _load_safetensors
    return WeightLoader._load_mlx_native(path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/johndoe/dev/mflux/.venv/lib/python3.11/site-packages/mflux/models/common/weights/loading/weight_loader.py", line 221, in _load_mlx_native
    raise FileNotFoundError(f"No safetensors files found in {path}")
FileNotFoundError: No safetensors files found in model-links/fibo-4bit/text_encoder_2
```

It's true that the fibo-4bit model contains only these directories:

```
model-links/fibo-4bit
├── README.md -> ../../blobs/cd1bb9bb5dc69e7285cf0f57dc975fbdd5178a05
├── text_encoder
│   └── 0.safetensors -> ../../../blobs/97e7ec568880484c86d206af82b9b5be8c1af4928e37891af8cf9d5d56152022
├── tokenizer
│   ├── chat_template.jinja -> ../../../blobs/e01e3a1bca00ae47bca8326b38cc397729f87481
│   ├── special_tokens_map.json -> ../../../blobs/190d5624dbbc1ad56f2f34c9d58e03fef7e5328b
│   ├── tokenizer_config.json -> ../../../blobs/61910c2db5cbdc9e6a6f37e14aaf00584cc6ad47
│   └── tokenizer.json -> ../../../blobs/7b6a500b662a34eb3f0374db856ba4ad7de4c81040571d78dc0d357238930005
├── transformer
│   ├── 0.safetensors -> ../../../blobs/50ee8eaa604d4299cc1126d2ac0e54f426775cfb02fc0849d1700a63b1cc17ae
│   ├── 1.safetensors -> ../../../blobs/a1b206f77d5d566bed9692bf1df0af154eeea84f3159282e484b465e8c3aee1b
│   └── 2.safetensors -> ../../../blobs/1c51d429fec6c11233296e242c71f43fe286e8c06d50e05ed3bf631bccc97ecb
└── vae
    ├── 0.safetensors -> ../../../blobs/1307796b5605b8d2cf674f0ec2f404f740e58a9f67243b163cbe2c507afc4e66
    └── 1.safetensors -> ../../../blobs/e66dbac001481641a326bfe3f1ed8dffb1e4299b5f3b32b4bfccdf999fbbb01a
```
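The same check can be run against any of the linked model directories to see which component folders actually hold weights. A minimal sketch using only the standard library; the helper names are hypothetical, and which components a given command expects is an assumption (only `text_encoder_2` is confirmed by the tracebacks above):

```python
from pathlib import Path


def components_with_weights(model_dir: Path) -> dict:
    """Map each immediate subdirectory of model_dir to its count of *.safetensors files."""
    return {
        sub.name: len(list(sub.glob("*.safetensors")))
        for sub in sorted(model_dir.iterdir())
        if sub.is_dir()
    }


def missing_components(model_dir: Path, expected: list) -> list:
    """Return the expected component names that are absent or hold no safetensors files."""
    found = components_with_weights(model_dir)
    return [name for name in expected if found.get(name, 0) == 0]
```

For the tree above, `missing_components(Path("model-links/fibo-4bit"), ["text_encoder", "text_encoder_2"])` would report `text_encoder_2`, matching the `FileNotFoundError` from `mflux-generate`.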

