Artifacts (possibly related to MPS) #24

@jwooldridge234

Hey! Your model looks really cool. I'm hoping you can point me in the right direction on how to solve this issue. I'm using this code:

import torch
from diffusers import AutoPipelineForText2Image
from diffusers.pipelines.wuerstchen import DEFAULT_STAGE_C_TIMESTEPS
from pathlib import Path
import time

DIR_NAME="./images/"
dirpath = Path(DIR_NAME)
# create parent dir if doesn't exist
dirpath.mkdir(parents=True, exist_ok=True)

pipe = AutoPipelineForText2Image.from_pretrained("warp-ai/wuerstchen", torch_dtype=torch.float16).to("mps")

caption = "A grim woman wearing rusty atompunk power-armor, holding a massive gauss rifle, standing on a cliff overlooking a vast desert, 70mm film still"
negative = "3d, cartoon, doll, lowres"
images = pipe(
    prompt=caption, 
    negative_prompt=negative,
    width=1280,
    height=1024,
    prior_timesteps=DEFAULT_STAGE_C_TIMESTEPS,
    prior_guidance_scale=4.0,
    num_images_per_prompt=1,
).images

for idx, image in enumerate(images):
    image_name = f'{time.time()}.png'
    image_path = dirpath / image_name
    image.save(image_path)

This is the console output I get:

Loading pipeline components...: 100%|█████████████| 5/5 [00:07<00:00,  1.45s/it]
Loading pipeline components...: 100%|█████████████| 4/4 [00:11<00:00,  2.75s/it]
100%|███████████████████████████████████████████| 29/29 [00:40<00:00,  1.41s/it]
  0%|                                                    | 0/12 [00:00<?, ?it/s]/Users/jackwooldridge/StableDiffusion/diffusers/venv/lib/python3.9/site-packages/torch/nn/functional.py:4027: UserWarning: The operator 'aten::_upsample_bicubic2d_aa.out' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
  return torch._C._nn._upsample_bicubic2d_aa(input, output_size, align_corners, scale_factors)
100%|███████████████████████████████████████████| 12/12 [00:30<00:00,  2.54s/it]

And here's the image that gets output at the end:

[Image attachment: 1699027303 844525]

I've tried with no negative prompts and different output sizes. The output always seems to have this distortion.
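
To narrow this down, one comparison I could run is the same prompt and seed across device and precision combinations. This is only a sketch: the helper names `comparison_configs` and `run_config` are hypothetical, the idea that float16 on MPS or the CPU fallback is involved is an unconfirmed assumption, and float16 on CPU is skipped on the assumption that half precision is slow or unsupported there.

```python
from itertools import product
from pathlib import Path


def comparison_configs(devices=("mps", "cpu"), dtypes=("float16", "float32")):
    """Return (device, dtype_name) pairs to try, skipping float16 on CPU."""
    return [
        (device, dtype_name)
        for device, dtype_name in product(devices, dtypes)
        if not (device == "cpu" and dtype_name == "float16")
    ]


def run_config(device, dtype_name, prompt, out_dir="./images/", seed=0):
    """Generate one image for a given device/dtype and save it (hypothetical helper)."""
    # Imported lazily so the config helper above works without torch installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "warp-ai/wuerstchen", torch_dtype=getattr(torch, dtype_name)
    ).to(device)

    # A CPU generator with a fixed seed keeps the starting noise comparable
    # across devices, so differences come from the backend or the precision.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    image = pipe(prompt=prompt, width=1280, height=1024, generator=generator).images[0]

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    image_path = out_path / f"{device}_{dtype_name}.png"
    image.save(image_path)
    return image_path
```

Looping over `comparison_configs()` and calling `run_config(device, dtype_name, caption)` for each pair would produce one file per combination. If the distortion only shows up in the `mps_float16.png` output, that would point at half precision on MPS rather than at the model itself.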
