Description
It would be good to have the possibility to run LTX-Video in BF16 format with OpenVINO and optimum-intel. I tried to convert the LTX-Video model to BF16 in several ways, but it looks like I didn't get fully correct results with either.
The first way: I loaded Lightricks/LTX-Video, saved it in torch.bfloat16 format, and then converted the model with the OVLTXPipeline API:
import torch
from diffusers import LTXPipeline
from optimum.intel import OVLTXPipeline

# Save the pipeline in bfloat16, then convert it to OpenVINO
pipeline = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
pipeline.save_pretrained("./models/LTX-Video_hf_bf16/")
ov_model = OVLTXPipeline.from_pretrained("./models/LTX-Video_hf_bf16/", device="CPU")
ov_model.save_pretrained("./models/LTX-Video_ov_bf16/")
The transformer model of LTXPipeline (diffusion_pytorch_model.safetensors) is ~3.7 GB in bfloat16 and ~7.5 GB in FP32. But the .bin file of the converted transformer model is ~7.5 GB, as if it were an FP32 model, and there is no mention of the BF16 format in the IR XML. It looks like optimum-intel incorrectly identifies the model dtype (probably here: openvino/utils.py).
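For reference, BF16 stores each element in 2 bytes versus 4 bytes for FP32, which matches the ~3.7 GB vs ~7.5 GB checkpoint sizes above; a quick sanity check:

```python
import torch

# BF16 uses 2 bytes per element, FP32 uses 4, so a correctly
# converted BF16 checkpoint should be roughly half the FP32 size.
t_fp32 = torch.zeros(1024, dtype=torch.float32)
t_bf16 = t_fp32.to(torch.bfloat16)
print(t_fp32.element_size())  # 4 bytes per FP32 element
print(t_bf16.element_size())  # 2 bytes per BF16 element
```

So a ~7.5 GB output for a 3.7 GB BF16 checkpoint strongly suggests the weights were upcast to FP32 during conversion.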
The second way: I tried to convert an FP16 version of the LTX-Video model; there are, for example, Lightricks/LTX-Video-0.9.7-dev, Lightricks/LTX-Video-0.9.8-13B-distilled, and Lightricks/LTX-Video-0.9.5.
optimum-cli export openvino --model Lightricks/LTX-Video-0.9.7-dev --task text-to-video ./models/Lightricks/LTX-Video-0.9.7-dev
The conversion fails with the following error:
Traceback (most recent call last):
File "./env/lib/python3.10/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 72, in __init__
pt_module = self._get_scripted_model(
File "./env/lib/python3.10/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 178, in _get_scripted_model
scripted = torch.jit.trace(
File "./env/lib/python3.10/site-packages/torch/jit/_trace.py", line 1002, in trace
traced_func = _trace_impl(
File "./env/lib/python3.10/site-packages/torch/jit/_trace.py", line 696, in _trace_impl
return trace_module(
File "./env/lib/python3.10/site-packages/torch/jit/_trace.py", line 1282, in trace_module
module._c._create_method_from_trace(
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1763, in _slow_forward
result = self.forward(*input, **kwargs)
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 398, in ts_patched_forward
outputs = patched_forward(**kwargs)
File "./env/lib/python3.10/site-packages/optimum/exporters/onnx/model_patcher.py", line 596, in patched_forward
outputs = self.orig_forward(*args, **kwargs)
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 1035, in <lambda>
vae_encoder.forward = lambda sample: {"latent_parameters": vae_encoder.encode(x=sample)["latent_dist"].parameters}
File "./env/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "./env/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl_ltx.py", line 1276, in encode
h = self._encode(x)
File "./env/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl_ltx.py", line 1252, in _encode
enc = self.encoder(x)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1763, in _slow_forward
result = self.forward(*input, **kwargs)
File "./env/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl_ltx.py", line 866, in forward
hidden_states = down_block(hidden_states)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1763, in _slow_forward
result = self.forward(*input, **kwargs)
File "./env/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl_ltx.py", line 513, in forward
hidden_states = downsampler(hidden_states)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "./env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1763, in _slow_forward
result = self.forward(*input, **kwargs)
File "./env/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl_ltx.py", line 230, in forward
.unflatten(2, (-1, self.stride[0]))
File "./env/lib/python3.10/site-packages/torch/_tensor.py", line 1433, in unflatten
return super().unflatten(dim, sizes)
RuntimeError: unflatten: Provided sizes [-1, 2] don't multiply up to the size of dim 2 (3) in the input tensor
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./env/bin/optimum-cli", line 8, in <module>
sys.exit(main())
File "./env/lib/python3.10/site-packages/optimum/commands/optimum_cli.py", line 219, in main
service.run()
File "./env/lib/python3.10/site-packages/optimum/commands/export/openvino.py", line 469, in run
main_export(
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/__main__.py", line 524, in main_export
submodel_paths = export_from_model(
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 740, in export_from_model
export_models(
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 509, in export_models
export(
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 211, in export
return export_pytorch(
File "./env/lib/python3.10/site-packages/optimum/exporters/openvino/convert.py", line 416, in export_pytorch
ts_decoder = TorchScriptPythonDecoder(model, example_input=dummy_inputs, **ts_decoder_kwargs)
File "./env/lib/python3.10/site-packages/openvino/frontend/pytorch/ts_decoder.py", line 84, in __init__
raise RuntimeError(
RuntimeError: Couldn't get TorchScript module by tracing.
Exception:
unflatten: Provided sizes [-1, 2] don't multiply up to the size of dim 2 (3) in the input tensor
Please check correctness of provided 'example_input'. Sometimes models can be converted in scripted mode, please try running conversion without 'example_input'.
You can also provide TorchScript module that you obtained yourself, please refer to PyTorch documentation: https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html.
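The root cause appears to be that the dummy VAE-encoder input used for tracing has a temporal size that is not divisible by the downsampler stride. A minimal repro of just the failing `unflatten` call (the exact shape is my guess, not necessarily the dummy input optimum generates; the key point is 3 frames on dim 2 with stride 2):

```python
import torch

# The traced dummy input apparently has 3 frames on dim 2, but the
# downsampler tries to unflatten that dim into (-1, stride) with
# stride 2, and 3 is not divisible by 2 -> RuntimeError.
x = torch.zeros(1, 128, 3, 8, 8)  # (batch, channels, frames, height, width)
try:
    x.unflatten(2, (-1, 2))
except RuntimeError as e:
    print(e)  # unflatten: Provided sizes [-1, 2] don't multiply up ...
```

If that is the case, generating the dummy video input with a frame count compatible with the VAE's temporal strides would likely avoid the tracing failure.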