
[Z-Image] ONNX export fails due to pad_sequence / unbind / dynamic-shape ops in ZImageTransformer2DModel (request for export-friendly path) #12843

@er6y

Description


Describe the bug

Problem: ONNX export succeeds, but exported model is not truly dynamic (size-switching fails)

The ONNX export itself does NOT fail. The exported ONNX model looks valid and can run if the inference image size matches the export-time size.

However, Z-Image in PyTorch supports running at multiple image sizes (e.g. 512 / 768 / 1024). After exporting to ONNX, this dynamic behavior is effectively lost: some shape-related values become baked into the ONNX graph as constants derived from the export-time input dimensions (e.g. token count, reshape sizes in patchify/unpatchify, padded lengths, etc.).

As a result, when the downstream inference runtime tries to run the exported ONNX with a different image_size than the one used during export (e.g. export with 1024 but infer with 512 or 768), the model fails at runtime with shape mismatch / reshape errors (typically around view/reshape-like logic and token reshaping).

In short:

  • PyTorch ZImageTransformer2DModel: supports dynamic image sizes.
  • Exported ONNX graph: contains hardcoded shapes → cannot switch sizes at inference time.
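To make the failure mode concrete, here is a minimal sketch of how a size-dependent value gets baked in. The patch size and the exact arithmetic are hypothetical illustrations, not the model's actual patchify code:

```python
# Illustration only: a patchify-style token count with a hypothetical patch size.
# An H x W latent grid split into p x p patches yields (H/p) * (W/p) tokens,
# so the value depends on image size. If the exporter evaluates this Python
# arithmetic on the example input's concrete H and W, the result is frozen
# into the ONNX graph as a constant instead of staying a symbolic dimension.

def token_count(height: int, width: int, patch: int = 2) -> int:
    """Tokens produced by patchifying a height x width grid."""
    return (height // patch) * (width // patch)

# An export traced at 1024 bakes in token_count(1024, 1024) == 262144;
# inference at 512 needs token_count(512, 512) == 65536, so any reshape
# built from the baked constant fails with a shape mismatch.
```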

This suggests the current export path traces some dynamic-shape logic into static constants. A dedicated ONNX-friendly forward path (e.g. a `forward_single` variant that avoids pad_sequence / unbind / data-dependent reshapes) may be needed.

Reference implementation (for discussion):
https://github.com/er6y/diffusers/tree/fix/zimage-transformer-onnx-friendly

This approach only lets me export a correct ONNX model; it is not a good fit for the project itself, so I will keep it in my fork. I hope a full fix can land in this project.

Reproduction

import torch
from diffusers import ZImagePipeline

pipe = ZImagePipeline.from_pretrained("...") # Z-Image checkpoint
model = pipe.transformer.eval()

# create dummy inputs (batch=1): latents, timestep, caption embeddings, etc.

torch.onnx.export(model, ...)
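With the TorchScript-based exporter, `dynamic_axes` can at least keep the tensor dimensions symbolic. A hedged sketch, where the input/output names (`latent`, `cap_embed`, `out`) are placeholders and must be replaced with the model's real signature:

```python
# Hypothetical tensor names -- check ZImageTransformer2DModel's actual forward
# signature. dynamic_axes tells the TorchScript-based exporter which dimensions
# should stay symbolic instead of being frozen to the example input's sizes.

dynamic_axes = {
    "latent": {0: "batch", 2: "height", 3: "width"},
    "cap_embed": {0: "batch", 1: "text_len"},
    "out": {0: "batch", 2: "height", 3: "width"},
}

# Passed through to the exporter, e.g.:
# torch.onnx.export(model, dummy_inputs, "zimage.onnx",
#                   input_names=["latent", "timestep", "cap_embed"],
#                   output_names=["out"],
#                   dynamic_axes=dynamic_axes, opset_version=17)
```

Note that `dynamic_axes` only relabels graph inputs/outputs; shape values computed as Python ints inside `forward` (the behavior described above) are still traced to constants, which is why an export-friendly forward path is requested.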

Reference implementation (for discussion only)

I have a working downstream patch in my fork/branch (mainly adds an ONNX-friendly forward_single for ZImage transformer and an export wrapper):

https://github.com/er6y/diffusers/tree/fix/zimage-transformer-onnx-friendly

I rewrote the forward pass, but it is only meant for my own ONNX export and is not suitable as a PR. I hope this issue gets a proper fix.

Logs

System Info

diffusers: <commit hash / version>
torch:
onnx/opset: <e.g. opset 17>
exporter: torch.onnx.export / dynamo_export
runtime target: MNN / other ONNX runtime

Who can help?

No response

Metadata


Assignees

No one assigned

    Labels

    bug (Something isn't working)
