Question about Table 1 #3

@weleen

Thank you for your excellent work.

I have a question about the reported latency.

I ran StableVideoDiffusionPipeline from diffusers directly on an A100 80GB GPU to reproduce the latency results in Table 1. However, the latency I measure is higher than what you reported. Could you help check whether something is wrong with my implementation?

[image]
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image
import packaging.version  # import the submodule explicitly; a bare `import packaging` does not guarantee it is loaded

# import torch._dynamo
# torch._dynamo.reset()

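# Enable TF32 matmuls (disabled by default since PyTorch 1.12).
# Note: TF32 only affects float32 matmuls, not the float16 weights used below.
if packaging.version.parse(torch.__version__) >= packaging.version.parse('1.12.0'):
    torch.backends.cuda.matmul.allow_tf32 = True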

# weight_dtype = torch.bfloat16
weight_dtype = torch.float16

pipeline = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid", variant="fp16"
).to("cuda", dtype=weight_dtype)

# test 1-step inference speed
img = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
generator = torch.manual_seed(42)
for _ in range(10):
    with torch.no_grad(), torch.autocast("cuda", dtype=weight_dtype):
        # set max_guidance_scale=1.0 to disable cfg
        frames = pipeline(
            img,
            decode_chunk_size=7,
            generator=generator,
            motion_bucket_id=127,
            fps=7,
            max_guidance_scale=1.0,
            num_inference_steps=1,
        ).frames[0]

The measured inference speed is shown in the attached screenshot:
[image]
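
The loop above does not time the calls explicitly, so here is a minimal sketch of how the per-call latency could be measured. It is an illustration, not the authors' benchmarking setup. The helper names `measure_latency` and `summarize` are hypothetical and not part of diffusers. The optional `sync` argument is an assumption: on a GPU one would typically pass `torch.cuda.synchronize` so that asynchronously launched kernels are included in the measurement. Warm-up calls are discarded because the first iterations usually include one-off initialization cost.

```python
import statistics
import time
from typing import Callable, Dict, List, Optional


def measure_latency(
    fn: Callable[[], object],
    warmup: int = 3,
    iters: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> List[float]:
    """Return per-call wall-clock latencies (seconds) for `fn`.

    `sync` is an optional barrier (assumption: e.g. torch.cuda.synchronize)
    called before starting and after stopping the timer.
    """
    def _sync() -> None:
        if sync is not None:
            sync()

    # Warm-up runs are executed but not recorded.
    for _ in range(warmup):
        fn()
    _sync()

    latencies: List[float] = []
    for _ in range(iters):
        _sync()  # make sure no earlier work is still pending
        start = time.perf_counter()
        fn()
        _sync()  # wait for the work launched by fn() to finish
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Aggregate recorded latencies into mean, median, and minimum."""
    return {
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "min_s": min(latencies),
    }
```

With the script above, this could be used as `summarize(measure_latency(lambda: pipeline(img, num_inference_steps=1, max_guidance_scale=1.0), sync=torch.cuda.synchronize))`. Note that a full pipeline call also covers image encoding and VAE decoding in addition to the UNet denoising step, which may or may not match what Table 1 measures.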
