Commit e01a919: Add Qwen-Image (#2348)
Parent: 840e497

File tree: 12 files changed, +1236 −1 lines

README.md

Lines changed: 2 additions & 0 deletions
@@ -303,11 +303,13 @@ The following model architectures, tasks and device distributions have been validated
 | Stable Diffusion 3 | :heavy_check_mark: | :heavy_check_mark: | <ul><li>[text-to-image generation](/examples/stable-diffusion#stable-diffusion-3-and-35-sd3)</li></ul> |
 | LDM3D | | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](/examples/stable-diffusion#text-to-image-generation)</li></ul> |
 | FLUX.1 | <ul><li>LoRA</li></ul> | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](/examples/stable-diffusion#flux1)</li><li>[image-to-image generation](/examples/stable-diffusion#flux1-image-to-image)</li></ul> |
+| Qwen Image | | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](/examples/stable-diffusion#qwen-image)</li></ul> |
 | Text to Video | | <ul><li>Single card</li></ul> | <ul><li>[text-to-video generation](/examples/stable-diffusion#text-to-video-generation)</li></ul> |
 | Image to Video | | <ul><li>Single card</li></ul> | <ul><li>[image-to-video generation](/examples/stable-diffusion#image-to-video-generation)</li></ul> |
 | i2vgen-xl | | <ul><li>Single card</li></ul> | <ul><li>[image-to-video generation](/examples/stable-diffusion#I2vgen-xl)</li></ul> |
 | Wan | | :heavy_check_mark: | <ul><li>[text-to-video generation](/examples/stable-diffusion#text-to-video-with-wan-22)</li><li>[image-to-video generation](/examples/stable-diffusion#image-to-video-with-wan-22)</li></ul> |
 
+
 ### PyTorch Image Models/TIMM:
 
 | Architecture | Training | Inference | Tasks |

docs/source/index.mdx

Lines changed: 3 additions & 0 deletions
@@ -120,6 +120,7 @@ In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated
 
 - Diffusers
 
+
 | Architecture | Training. | Inference | Tasks |
 |----------------------------|:----------------------:|:-----------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | Stable Diffusion ||| <ul><li>[text-to-image generation](/examples/stable-diffusion)</li></ul> |
@@ -128,11 +129,13 @@ In the tables below, ✅ means single-card, multi-card and DeepSpeed have all been validated
 | Stable Diffusion 3 ||| <ul><li>[text-to-image generation](/examples/stable-diffusion#stable-diffusion-3-and-35-sd3)</li></ul> |
 | LDM3D | | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](/examples/stable-diffusion)</li></ul> |
 | FLUX.1 | <ul><li>LoRA</li></ul> | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](/examples/stable-diffusion)</li></ul> |
+| Qwen Image | | <ul><li>Single card</li></ul> | <ul><li>[text-to-image generation](https://github.com/huggingface/optimum-habana/tree/main/examples/stable-diffusion)</li></ul> |
 | Text to Video | | <ul><li>Single card</li></ul> | <ul><li>[text-to-video generation](/examples/stable-diffusion#text-to-video-generation)</li></ul> |
 | Image to Video | | <ul><li>Single card</li></ul> | <ul><li>[image-to-video generation](/examples/stable-diffusion#image-to-video-generation)</li></ul> |
 | i2vgen-xl | | <ul><li>Single card</li></ul> | <ul><li>[image-to-video generation](/examples/stable-diffusion#I2vgen-xl)</li></ul> |
 | Wan | || <ul><li>[text-to-video generation](/examples/stable-diffusion#text-to-video-with-wan-22)</li><li>[image-to-video generation](/examples/stable-diffusion#image-to-video-with-wan-22)</li></ul> |
 
+
 - PyTorch Image Models/TIMM:
 
 | Architecture | Training | Inference | Tasks |

examples/stable-diffusion/README.md

Lines changed: 25 additions & 0 deletions
@@ -178,6 +178,31 @@ FLUX in quantization mode by setting runtime variable `QUANT_CONFIG=quantization
 
 To run with FLUX.1-schnell model, a distilled version of FLUX.1 (which is not gated), use `--model_name_or_path black-forest-labs/FLUX.1-schnell`.
 
+### Qwen-Image
+
+Qwen-Image was introduced by Alibaba Cloud [here](https://www.alibabacloud.com/blog/introducing-qwen-image-novel-model-in-image-generation-and-editing_602447).
+
+Here is how to run the Qwen-Image model:
+
+```bash
+PT_HPU_LAZY_MODE=1 python text_to_image_generation.py \
+    --model_name_or_path Qwen/Qwen-Image \
+    --prompts "A cat holding a sign that says hello world" \
+    --negative_prompts " " \
+    --num_images_per_prompt 10 \
+    --batch_size 1 \
+    --num_inference_steps 10 \
+    --image_save_dir /tmp/qwen-image \
+    --scheduler flow_match_euler_discrete \
+    --use_habana \
+    --use_hpu_graphs \
+    --gaudi_config Habana/stable-diffusion \
+    --sdp_on_bf16 \
+    --bf16
+```
+
+> [!NOTE]
+> If you do not pass `--negative_prompts`, an empty string is used as the default negative prompt.
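The note above corresponds to a small fallback the example script applies before calling the pipeline. A minimal sketch of that behavior (`resolve_negative_prompt` is a hypothetical helper name; in the real script the logic lives inline in `text_to_image_generation.py`):

```python
def resolve_negative_prompt(negative_prompts):
    # Mirrors the fallback added in this commit: the Qwen-Image pipeline is
    # always called with a negative prompt, so a blank one stands in when the
    # user did not supply --negative_prompts.
    if negative_prompts is None:
        return " "
    return negative_prompts

print(repr(resolve_negative_prompt(None)))      # ' '
print(repr(resolve_negative_prompt("blurry")))  # 'blurry'
```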
 ## ControlNet
 
 ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543)

examples/stable-diffusion/text_to_image_generation.py

Lines changed: 20 additions & 0 deletions
@@ -325,9 +325,11 @@ def main():
     sdxl_models = ["stable-diffusion-xl", "sdxl"]
     sd3_models = ["stable-diffusion-3", "sd3"]
     flux_models = ["FLUX.1", "flux"]
+    qwen_models = ["Qwen-Image", "qwen"]
     sdxl = True if any(model in args.model_name_or_path for model in sdxl_models) else False
     sd3 = True if any(model in args.model_name_or_path for model in sd3_models) else False
     flux = True if any(model in args.model_name_or_path for model in flux_models) else False
+    qwen = True if any(model in args.model_name_or_path for model in qwen_models) else False
     controlnet = True if args.control_image is not None else False
     inpainting = True if (args.base_image is not None) and (args.mask_image is not None) else False
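The hunk above extends the script's substring-based model-family detection to Qwen-Image. A minimal standalone sketch of that pattern (`is_qwen` is a hypothetical helper name; the script computes the flag inline):

```python
# Family markers checked against --model_name_or_path, as in the commit.
qwen_models = ["Qwen-Image", "qwen"]

def is_qwen(model_name_or_path: str) -> bool:
    # Substring match, same shape as the existing sdxl/sd3/flux checks.
    return any(marker in model_name_or_path for marker in qwen_models)

print(is_qwen("Qwen/Qwen-Image"))                   # True
print(is_qwen("black-forest-labs/FLUX.1-schnell"))  # False
```

Note the match is case-sensitive, so a path containing only `QWEN` in a different casing would not be detected by this sketch.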

@@ -549,6 +551,24 @@ def main():
             **kwargs,
         )
 
+    elif qwen:
+        # QwenImage pipelines
+        if controlnet:
+            raise ValueError("QwenImage+ControlNet pipeline is not currently supported")
+        elif inpainting:
+            raise ValueError("QwenImage Inpainting pipeline is not currently supported")
+        else:
+            if negative_prompts is None:
+                logger.warning("Adding an empty string as the negative prompt, since none was specified.")
+                kwargs_call["negative_prompt"] = " "
+
+            from optimum.habana.diffusers import GaudiQwenImagePipeline
+
+            pipeline = GaudiQwenImagePipeline.from_pretrained(
+                args.model_name_or_path,
+                **kwargs,
+            )
+
     else:
         # SD pipelines (SD1.x, SD2.x)
         if controlnet:

optimum/habana/diffusers/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@
 from .pipelines.flux.pipeline_flux_img2img import GaudiFluxImg2ImgPipeline
 from .pipelines.i2vgen_xl.pipeline_i2vgen_xl import GaudiI2VGenXLPipeline
 from .pipelines.pipeline_utils import GaudiDiffusionPipeline
+from .pipelines.qwenimage.pipeline_qwenimage import GaudiQwenImagePipeline
 from .pipelines.stable_diffusion.pipeline_stable_diffusion import GaudiStableDiffusionPipeline
 from .pipelines.stable_diffusion.pipeline_stable_diffusion_depth2img import GaudiStableDiffusionDepth2ImgPipeline
 from .pipelines.stable_diffusion.pipeline_stable_diffusion_image_variation import (
Lines changed: 5 additions & 0 deletions (new file)
@@ -0,0 +1,5 @@
+from .transformer_qwenimage import (
+    GaudiQwenDoubleStreamAttnProcessor2_0,
+    GaudiQwenEmbedRope,
+    GaudiQwenTimestepProjEmbeddings,
+)
