Conversation
Implements LoRA (Low-Rank Adaptation) adapter support for the LTX-Video transformer model, following the same pattern as image generation.

Changes:
- Remove blocking assertion in `ltx_video_transformer_3d_model.cpp`
- Add adapter initialization in the `compile()` method
- Implement `set_adapters()` method for runtime adapter control
- Add `AdapterController` member to manage the adapter lifecycle
- Create `lora_text2video.cpp` sample demonstrating usage
- Add comprehensive tests in `test_video_generation.py`

The implementation allows users to:
- Load and apply one or more LoRA adapters
- Mix multiple adapters with different alpha values
- Enable/disable adapters per generation call
- Use community LoRA adapters from HuggingFace

Fixes openvinotoolkit#3306
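As background on what applying an adapter means: LoRA folds a low-rank product into a weight, so the effective weight becomes `W' = W + alpha * (B @ A)`. A minimal pure-Python sketch of that update — the matrix sizes and values below are illustrative, not taken from the PR's code:

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply_lora(W, A, B, alpha):
    """Return W + alpha * (B @ A); the adapter rank is len(A)."""
    BA = matmul(B, A)
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# Rank-1 adapter on a 2x2 weight; alpha = 1.5 matches the PR's test run.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]            # r x in  -> 1 x 2
B = [[0.1], [0.2]]          # out x r -> 2 x 1
result = apply_lora(W, A, B, 1.5)
print([[round(v, 4) for v in row] for row in result])  # → [[1.15, 0.3], [0.3, 1.6]]
```

Mixing multiple adapters with different alphas, as the description mentions, is just a sum of such terms over the loaded adapters.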
- Added a Python binding for `LTXVideoTransformer3DModel.set_adapters()` to expose the method to Python, fixing `test_transformer_has_set_adapters_method`
- Added a safety check in `set_adapters()` to prevent crashes when the `AdapterController` is not initialized
- Both fixes follow the same patterns used in image generation models

Fixes the CI failures in the `TestLoRAVideoGeneration` test suite.
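The described safety check can be illustrated with a small Python mock. The class and error message below are hypothetical and only mirror the behavior the fix describes: a safe no-op for `None`, and a controlled error instead of a crash when the controller is missing:

```python
class Transformer:
    """Hypothetical stand-in for LTXVideoTransformer3DModel."""

    def __init__(self, adapter_controller=None):
        # The controller only exists when adapters were supplied up front.
        self._adapter_controller = adapter_controller

    def set_adapters(self, adapters):
        if adapters is None:
            return  # nothing to apply; a no-op is safe
        if self._adapter_controller is None:
            # Raise a controlled error rather than dereferencing a
            # missing controller (the segfault-prevention path).
            raise RuntimeError("Adapter controller is not initialized; "
                               "pass adapters when compiling the model")
        self._adapter_controller.apply(adapters)

model = Transformer()        # "compiled" without adapters
model.set_adapters(None)     # safe no-op
try:
    model.set_adapters(object())  # non-empty config -> controlled error
except RuntimeError as e:
    print("rejected:", e)
```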
…stubgen output (24 spaces)
…erministic pyi stub
📝 Walkthrough

This pull request adds LoRA (Low-Rank Adaptation) adapter support to the video generation pipeline in OpenVINO GenAI. Changes include new C++ and Python sample implementations demonstrating LoRA usage, configuration structure updates to accept adapters, adapter-aware modifications to the LTX video transformer model initialization and inference, Python bindings exposure, test coverage, and documentation updates.
Sequence Diagram

```mermaid
sequenceDiagram
    actor User
    participant Pipeline as Text2VideoPipeline
    participant T5Encoder as T5EncoderModel
    participant Transformer as LTXVideoTransformer3DModel
    participant Controller as AdapterController
    participant VAE as AutoencoderKLLTXVideo
    User->>Pipeline: Create pipeline with adapters config
    Pipeline->>T5Encoder: Initialize with device + properties
    Pipeline->>Transformer: Initialize with device + properties
    Pipeline->>VAE: Initialize with device + properties
    User->>Pipeline: Call generate(prompt, adapters=AdapterConfig)
    Pipeline->>Transformer: set_adapters(adapters)
    Transformer->>Controller: Create AdapterController(model, adapters, device)
    Transformer->>Transformer: Store adapter controller
    Pipeline->>T5Encoder: Encode text prompt
    T5Encoder-->>Pipeline: Text embeddings
    Pipeline->>Transformer: Infer with embeddings
    Transformer->>Transformer: Apply adapters via controller
    Transformer-->>Pipeline: Generated latent frames
    Pipeline->>VAE: Decode latents
    VAE-->>Pipeline: Video frames
    Pipeline-->>User: Save video (lora_video.avi or baseline_video.avi)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 6
🧹 Nitpick comments (2)
tests/python_tests/test_video_generation.py (1)
299-308: Add a regression assertion for the safety-check path in `set_adapters`.

Right now this only verifies `None` input. Please also assert that passing a non-empty adapter config after compile (without adapter initialization) raises a controlled exception, so the segfault-prevention path is explicitly covered.

🧪 Suggested test extension

```diff
     def test_transformer_has_set_adapters_method(self, video_generation_model):
         """Test that the LTXVideoTransformer3DModel has the set_adapters method"""
         model_path = Path(video_generation_model) / "transformer"
         if model_path.exists():
             model = ov_genai.LTXVideoTransformer3DModel(str(model_path))
             model.compile("CPU")
             assert hasattr(model, "set_adapters")
             model.set_adapters(None)
+            with pytest.raises(RuntimeError, match="Adapter controller is not initialized"):
+                model.set_adapters(ov_genai.AdapterConfig())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/python_tests/test_video_generation.py` around lines 299 - 308, Extend the test_transformer_has_set_adapters_method to cover the safety-check path in LTXVideoTransformer3DModel.set_adapters by asserting that after model.compile("CPU") calling model.set_adapters with a non-empty adapter config (e.g., a dict or object representing adapters) raises a controlled exception; wrap the call in pytest.raises(Exception) (or the specific exception class your implementation raises, e.g., RuntimeError/ValueError) to ensure the post-compile-but-uninitialized-adapters path is exercised and prevents a segfault.

samples/cpp/video_generation/CMakeLists.txt (1)
51-76: LGTM! The LoRA sample executable is correctly configured.

The CMake configuration for `lora_text2video` correctly mirrors the existing `text2video` target with proper include directories, library linking, RPATH handling, and installation rules.

For maintainability, consider extracting the common configuration into a CMake function to reduce duplication:
♻️ Optional refactor using a helper function
```cmake
function(add_video_generation_sample TARGET_NAME SOURCE_FILE)
    add_executable(${TARGET_NAME} ${SOURCE_FILE} imwrite_video.cpp)
    target_include_directories(${TARGET_NAME} PRIVATE
        ${CMAKE_BINARY_DIR}
        "${CMAKE_CURRENT_SOURCE_DIR}/../image_generation/"
        "${CMAKE_CURRENT_SOURCE_DIR}/../../../src/cpp/src/")
    ov_genai_link_opencv(${TARGET_NAME} core imgproc videoio imgcodecs)
    target_link_libraries(${TARGET_NAME} PRIVATE openvino::genai indicators::indicators)
    if(UNIX AND NOT APPLE)
        set_target_properties(${TARGET_NAME} PROPERTIES INSTALL_RPATH "$ORIGIN/../lib")
    elseif(APPLE)
        set_target_properties(${TARGET_NAME} PROPERTIES INSTALL_RPATH "@loader_path/../lib")
    endif()
    set_target_properties(${TARGET_NAME} PROPERTIES INSTALL_RPATH_USE_LINK_PATH ON)
    install(TARGETS ${TARGET_NAME}
            RUNTIME DESTINATION samples_bin/
            COMPONENT samples_bin
            EXCLUDE_FROM_ALL)
endfunction()

# Usage:
add_video_generation_sample(text2video text2video.cpp)
add_video_generation_sample(lora_text2video lora_text2video.cpp)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@samples/cpp/video_generation/CMakeLists.txt` around lines 51 - 76, The CMake block duplicates target setup for text2video and lora_text2video; refactor by adding a helper function (e.g., add_video_generation_sample) that encapsulates add_executable, target_include_directories, ov_genai_link_opencv, target_link_libraries, the INSTALL_RPATH/INSTALL_RPATH_USE_LINK_PATH logic, and the install() call, then replace the existing explicit target blocks with calls like add_video_generation_sample(text2video text2video.cpp) and add_video_generation_sample(lora_text2video lora_text2video.cpp) so both targets use the same centralized configuration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@samples/cpp/video_generation/lora_text2video.cpp`:
- Line 15: The usage string passed to OPENVINO_ASSERT in the program's argument
check contains a duplicated closing bracket at the end; update the
OPENVINO_ASSERT call (the one that asserts argc >= 3 && (argc - 3) % 2 == 0) to
remove the extra trailing ']' so the usage message ends with "...
[<LORA_SAFETENSORS> <ALPHA> ...]" (i.e., change the final "...]]" to "...]").
- Around line 27-28: Replace the silent atof parsing for LoRA alpha with
exception-safe std::stof parsing: in the code that reads argv for alpha (used
when calling adapter_config.add(adapter, alpha)), wrap std::stof(argv[...]) in a
try/catch (catch std::invalid_argument and std::out_of_range) and report a clear
CLI error (and exit) when parsing fails; optionally validate the parsed float
range before calling adapter_config.add. Also fix the usage string (the printed
help/usage message) to remove the extra trailing bracket by changing '...]]' to
'...]'.
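The defensive parse suggested above (wrapping `std::stof` in a try/catch in the C++ sample) looks roughly like this in Python terms; `parse_alpha` and the `[0, 10]` range check are illustrative assumptions, not the sample's actual code:

```python
import sys

def parse_alpha(text):
    """Parse a LoRA alpha from a CLI argument, failing with a clear message."""
    try:
        alpha = float(text)
    except ValueError:
        sys.exit(f"error: alpha must be a number, got {text!r}")
    # Optional range validation before handing the value to the adapter config.
    if not 0.0 <= alpha <= 10.0:
        sys.exit(f"error: alpha {alpha} outside expected range [0, 10]")
    return alpha

print(parse_alpha("1.5"))  # → 1.5
```

The same shape carries over to C++: catch `std::invalid_argument` and `std::out_of_range` from `std::stof`, print the usage string, and exit instead of silently reading `0.0` from `atof`.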
In `@samples/python/video_generation/lora_text2video.py`:
- Line 69: The print call uses an unnecessary f-string in print(f"\nPerformance
metrics:") which triggers Ruff F541; change it to a normal string by removing
the f prefix (update the print statement in
samples/python/video_generation/lora_text2video.py so the call is
print("\nPerformance metrics:") wherever that print is defined).
- Around line 22-23: The VideoWriter may fail to initialize silently; after
creating writer = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
check writer.isOpened() and handle the failure (raise an exception or log an
error and exit) using output_path and any relevant error context; update the
code around the writer creation (the cv2.VideoWriter call and subsequent frame
writing loop) to bail out early if isOpened() is False so you don't attempt to
write frames or print a misleading success message.
In `@src/cpp/src/video_generation/ltx_pipeline.hpp`:
- Around line 425-428: The first constructor leaves m_text_encode_device,
m_denoise_device, m_vae_device and m_compile_properties uninitialized which can
cause rebuild_models() (called from generate()) to recompile models with garbage
device/compiler settings; fix by initializing those members to sensible defaults
(e.g., empty strings for m_text_encode_device/m_denoise_device/m_vae_device and
a default-constructed m_compile_properties) in the first constructor, or add a
defensive guard at the start of rebuild_models() that validates/assigns defaults
to m_text_encode_device, m_denoise_device, m_vae_device and m_compile_properties
(and returns/throws if they remain invalid) before constructing T5EncoderModel,
LTXVideoTransformer3DModel or AutoencoderKLLTXVideo.
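The defensive-guard option can be sketched as follows; the Python class, field names, and the `"CPU"` fallback are assumptions standing in for the C++ members listed above, not the pipeline's actual defaults:

```python
class Pipeline:
    """Hypothetical mirror of the ltx_pipeline device/property members."""

    def __init__(self, text_device="", denoise_device="", vae_device=""):
        # Initialize every member in every constructor so rebuild never
        # sees garbage values.
        self.text_device = text_device
        self.denoise_device = denoise_device
        self.vae_device = vae_device

    def rebuild_models(self):
        # Defensive guard: assign a safe default to any field that was
        # never set before reconstructing the models.
        for name in ("text_device", "denoise_device", "vae_device"):
            if not getattr(self, name):
                setattr(self, name, "CPU")
        return (self.text_device, self.denoise_device, self.vae_device)

print(Pipeline().rebuild_models())              # → ('CPU', 'CPU', 'CPU')
print(Pipeline("GPU", "GPU", "CPU").rebuild_models())
```

The alternative the comment mentions — throwing when the fields remain invalid — would replace the `setattr` fallback with a raise.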
In `@tests/python_tests/requirements.txt`:
- Line 29: The requirements line 'sentence-transformers>=2.2.2,<=5.2.2' should
be removed if unused, or changed to an exact pinned version to match the repo
pattern; either delete that dependency entry from the requirements file (if no
imports of sentence-transformers are present) or replace the ranged specifier
with an exact pin such as 'sentence-transformers==5.2.2' (or the specific
approved version) to ensure reproducible tests.
---
Nitpick comments:
In `@samples/cpp/video_generation/CMakeLists.txt`:
- Around line 51-76: The CMake block duplicates target setup for text2video and
lora_text2video; refactor by adding a helper function (e.g.,
add_video_generation_sample) that encapsulates add_executable,
target_include_directories, ov_genai_link_opencv, target_link_libraries, the
INSTALL_RPATH/INSTALL_RPATH_USE_LINK_PATH logic, and the install() call, then
replace the existing explicit target blocks with calls like
add_video_generation_sample(text2video text2video.cpp) and
add_video_generation_sample(lora_text2video lora_text2video.cpp) so both targets
use the same centralized configuration.
In `@tests/python_tests/test_video_generation.py`:
- Around line 299-308: Extend the test_transformer_has_set_adapters_method to
cover the safety-check path in LTXVideoTransformer3DModel.set_adapters by
asserting that after model.compile("CPU") calling model.set_adapters with a
non-empty adapter config (e.g., a dict or object representing adapters) raises a
controlled exception; wrap the call in pytest.raises(Exception) (or the specific
exception class your implementation raises, e.g., RuntimeError/ValueError) to
ensure the post-compile-but-uninitialized-adapters path is exercised and
prevents a segfault.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- samples/cpp/video_generation/CMakeLists.txt
- samples/cpp/video_generation/README.md
- samples/cpp/video_generation/lora_text2video.cpp
- samples/python/video_generation/README.md
- samples/python/video_generation/lora_text2video.py
- src/cpp/include/openvino/genai/video_generation/generation_config.hpp
- src/cpp/include/openvino/genai/video_generation/ltx_video_transformer_3d_model.hpp
- src/cpp/src/video_generation/generation_config_utils.cpp
- src/cpp/src/video_generation/ltx_pipeline.hpp
- src/cpp/src/video_generation/models/ltx_video_transformer_3d_model.cpp
- src/python/openvino_genai/py_openvino_genai.pyi
- src/python/py_video_generation_models.cpp
- tests/python_tests/requirements.txt
- tests/python_tests/test_video_generation.py
- tests/python_tests/test_vlm_pipeline.py
@codex review
This PR builds upon the work started by @Eshaan-byte in openvinotoolkit#3309. I've integrated his logic and added the fixes for the set_adapters bindings and AdapterController initialization to resolve the CI issues.
Changes:
- Added a Python binding for `LTXVideoTransformer3DModel.set_adapters()` to resolve `AttributeError` in the `TestLoRAVideoGeneration` test suite.
- Added a safety check in `set_adapters()` to prevent segmentation faults when the `AdapterController` is not initialized (e.g., when no adapters are provided at construction).

Closes openvinotoolkit#3306
Supersedes openvinotoolkit#3309
Verification Results:
without LoRA

with LoRA (Adapter: ltxv_095_cakeify_lora.safetensors)

These samples demonstrate that the LoRA adapter is successfully loaded and applied during the denoising process in the Text2VideoPipeline.
Environment:
Model: LTX-Video
Alpha: 1.5
Prompt: "A cute golden retriever puppy running in a green grassy field on a sunny day, high quality, photorealistic"
Checklist:
- `tests/python_tests/test_video_generation.py` passes locally.

Summary by CodeRabbit
Release Notes
New Features
Documentation