
[quant] about float8_e4m3_tensor #11468

Closed
@1145284121

Description


Describe the bug

```python
import torch
from diffusers import AutoModel, TorchAoConfig

quantization_config = TorchAoConfig("float8wo_e4m3")

transformer = AutoModel.from_pretrained(
    "models/community_hunyuanvideo",
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,
)
```

The following error was raised:

```
quantization_config = TorchAoConfig("float8_e4m3_row")
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/diffusers/quantizers/quantization_config.py", line 502, in __init__
    raise ValueError(
ValueError: Requested quantization type: float8_e4m3_row is not supported or is an incorrect `quant_type` name. If you think the provided quantization type should be supported, please open an issue at https://github.com/huggingface/diffusers/issues.
```

However, the diffusers TorchAO documentation appears to list this quantization type as supported:
https://huggingface.co/docs/diffusers/main/en/quantization/torchao

(Screenshot: table of supported quantization types from the TorchAO documentation page)
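To narrow down which shorthands the installed diffusers build actually accepts, a probe like the one below can be used; it simply catches the `ValueError` shown in the traceback. Only `float8wo_e4m3` and `float8_e4m3_row` appear in this report, the other candidate names are assumptions taken from the linked docs table and may differ by diffusers/torchao version:

```python
# Illustrative probe: check which quant_type shorthands this diffusers install accepts.
# Only "float8wo_e4m3" and "float8_e4m3_row" come from this report; the rest are
# assumed from the docs table and may not apply to every diffusers/torchao version.
from diffusers import TorchAoConfig

candidates = ["float8wo_e4m3", "float8_e4m3_row", "float8wo", "float8dq", "int8wo"]
for name in candidates:
    try:
        TorchAoConfig(name)
        print(f"{name}: accepted")
    except ValueError as exc:
        print(f"{name}: rejected ({exc})")
```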

Reproduction

```python
import torch
from diffusers import AutoModel, TorchAoConfig

quantization_config = TorchAoConfig("float8wo_e4m3")

transformer = AutoModel.from_pretrained(
    "models/community_hunyuanvideo",
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,
)
```
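As a stopgap while this is sorted out, loading with a different quantization recipe may work. The sketch below is purely illustrative: `int8wo` is an assumption on my part (it appears in the same docs table) and is not the float8 recipe originally requested.

```python
import torch
from diffusers import AutoModel, TorchAoConfig

# Assumed fallback: "int8wo" (int8 weight-only) from the docs table; swap back to a
# float8 variant once the accepted shorthand for this diffusers version is known.
quantization_config = TorchAoConfig("int8wo")

transformer = AutoModel.from_pretrained(
    "models/community_hunyuanvideo",
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,
)
```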

Logs

System Info

diffusers-cli env

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

  • 🤗 Diffusers version: 0.33.1
  • Platform: Linux-5.4.250-4-velinux1u1-amd64-x86_64-with-glibc2.39
  • Running on Google Colab?: No
  • Python version: 3.12.3
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.29.2
  • Transformers version: 4.49.0
  • Accelerate version: 1.1.1
  • PEFT version: not installed
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.5
  • xFormers version: 0.0.29.post3
  • Accelerator: NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
    NVIDIA H20, 97871 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response
