[Chore]: refactor out unused/redundant params in diffusion pipelines #1235
fhfuih wants to merge 4 commits into vllm-project:main
…ne.forward Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
💡 Codex Review
vllm-omni/vllm_omni/diffusion/models/flux/pipeline_flux.py
Lines 632 to 633 in 80edb73
When req.prompts contains normal strings and no embeddings (the typical case), prompt_embeds and negative_prompt_embeds are only assigned inside the if any(...) blocks above, so they remain unbound and the subsequent check_inputs/encode_prompt usage raises UnboundLocalError before any generation. Previously these were defaulted to None via the function signature, so this is a regression. Initialize both variables to None before the conditional (the same pattern appears in longcat_image, ovis_image, qwen_image, sd3, stable_audio, and z_image pipelines).
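The regression described above can be reproduced in isolation. The following is a minimal sketch, not the pipeline's actual code: the function names are hypothetical, and representing an embedding-carrying prompt as a dict with a `"prompt_embeds"` key is an assumption made only for illustration.

```python
from typing import Any, Optional


def extract_prompt_embeds_buggy(prompts: list[Any]) -> Optional[list[Any]]:
    # Hypothetical stand-in for the reported pattern: the name is bound
    # only inside the `if`, so plain-string prompts leave it unbound.
    if any(isinstance(p, dict) and "prompt_embeds" in p for p in prompts):
        prompt_embeds = [p["prompt_embeds"] for p in prompts if isinstance(p, dict)]
    return prompt_embeds  # raises UnboundLocalError when the branch is skipped


def extract_prompt_embeds_fixed(prompts: list[Any]) -> Optional[list[Any]]:
    # Suggested fix: bind the name to None before the conditional.
    prompt_embeds = None
    if any(isinstance(p, dict) and "prompt_embeds" in p for p in prompts):
        prompt_embeds = [p["prompt_embeds"] for p in prompts if isinstance(p, dict)]
    return prompt_embeds
```

With only string prompts, the first function fails before any generation could start, while the second returns `None` and lets the caller fall back to encoding the prompt text.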
Pull request overview
This PR refactors the forward methods in diffusion pipeline implementations to remove unused and redundant parameters. As discussed in PR #797 and #1196, these parameters were never used in practice since the forward function is always called with only an OmniDiffusionRequest object. This cleanup makes the API clearer and teaches developers the correct paradigm for adding new models.
Changes:
- Removed unused function parameters from forward methods (prompt, height, width, num_inference_steps, guidance_scale, generator, latents, prompt_embeds, negative_prompt_embeds, etc.)
- Consolidated parameter extraction to use only `req.sampling_params` and `req.prompts`
- Added explicit extraction logic for prompt_embeds and negative_prompt_embeds from request prompts
- Standardized default value fallback patterns across pipelines
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| vllm_omni/diffusion/models/z_image/pipeline_z_image.py | Removed 11 unused parameters; consolidated to extract all values from req object |
| vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_ti2v.py | Removed 9 unused parameters; added explicit prompt_embeds extraction |
| vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2_i2v.py | Removed 9 unused parameters; added explicit prompt_embeds extraction |
| vllm_omni/diffusion/models/wan2_2/pipeline_wan2_2.py | Removed 9 unused parameters; added explicit prompt_embeds extraction |
| vllm_omni/diffusion/models/stable_audio/pipeline_stable_audio.py | Removed 10 unused parameters; added prompt_embeds extraction logic |
| vllm_omni/diffusion/models/sd3/pipeline_sd3.py | Removed 10 unused parameters; consolidated parameter extraction from req |
| vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_layered.py | Removed 13 unused parameters; added image extraction from multi_modal_data |
| vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit_plus.py | Removed 13 unused parameters; added image extraction from multi_modal_data |
| vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image_edit.py | Removed 13 unused parameters; added image extraction from multi_modal_data |
| vllm_omni/diffusion/models/qwen_image/pipeline_qwen_image.py | Removed 12 unused parameters; consolidated all extraction from req |
| vllm_omni/diffusion/models/ovis_image/pipeline_ovis_image.py | Removed 12 unused parameters; added prompt_embeds extraction logic |
| vllm_omni/diffusion/models/longcat_image/pipeline_longcat_image_edit.py | Removed 9 unused parameters; added image extraction from multi_modal_data |
| vllm_omni/diffusion/models/longcat_image/pipeline_longcat_image.py | Removed 11 unused parameters; reorganized parameter extraction |
| vllm_omni/diffusion/models/flux2_klein/pipeline_flux2_klein.py | Removed 11 unused parameters; simplified prompt and image extraction |
| vllm_omni/diffusion/models/flux/pipeline_flux.py | Removed 11 unused parameters; added detailed prompt_embeds extraction |
        image = [PIL.Image.open(im) if isinstance(im, str) else cast(PIL.Image.Image, im) for im in raw_image]
    else:
        image = PIL.Image.open(raw_image) if isinstance(raw_image, str) else cast(PIL.Image.Image, raw_image)
If raw_image is None (line 637), then image will be set to None (line 638). However, on line 644, the code attempts to access image[0].size or image.size, which will raise an AttributeError if image is None. This code path should either handle the None case or ensure that image is always set to a valid value before reaching line 644.
    if image is None:
        raise ValueError(
            "No image was provided in 'multi_modal_data' for fallback preprocessing; "
            "an image is required to compute target dimensions."
        )
Yeah, many pipelines have strange type annotations that mismatch later type checks. They are confusing and were already conflicting before this change. I am not going to fix everything in this PR.
    )
    sigmas = req.sampling_params.sigmas
    max_sequence_length = req.sampling_params.max_sequence_length or 512
    guidance_scale = req.sampling_params.guidance_scale if req.sampling_params.guidance_rescale is not None else 5.0
The condition should check guidance_scale_provided instead of guidance_rescale. This is inconsistent with all other pipelines which use guidance_scale_provided to determine if the user explicitly provided a guidance scale. The current condition checks guidance_rescale (a different parameter), which will likely always evaluate to not None since it has a default value of 0.0, causing the guidance_scale logic to behave incorrectly.
    - guidance_scale = req.sampling_params.guidance_scale if req.sampling_params.guidance_rescale is not None else 5.0
    + guidance_scale = (
    +     req.sampling_params.guidance_scale
    +     if req.sampling_params.guidance_scale_provided
    +     else 5.0
    + )
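To make the reviewer's point concrete, here is a small sketch of why the two conditions differ. `SamplingParamsSketch` is a hypothetical stand-in, not the real sampling-params class; the `guidance_rescale` default of 0.0 follows the review comment, while the `guidance_scale` default of 0.0 is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class SamplingParamsSketch:
    # Hypothetical stand-in for the request's sampling params.
    guidance_scale: float = 0.0            # assumed default, for illustration only
    guidance_scale_provided: bool = False  # set when the user passes a value
    guidance_rescale: float = 0.0          # default per the review comment


def pick_guidance_scale_current(sp: SamplingParamsSketch, default: float = 5.0) -> float:
    # Existing condition: guidance_rescale is never None, so `default` is unreachable.
    return sp.guidance_scale if sp.guidance_rescale is not None else default


def pick_guidance_scale_suggested(sp: SamplingParamsSketch, default: float = 5.0) -> float:
    # Suggested condition: fall back unless the user explicitly provided a scale.
    return sp.guidance_scale if sp.guidance_scale_provided else default
```

Under these assumptions, a request that never sets a guidance scale gets the field's own default from the current condition and 5.0 from the suggested one; an explicitly provided value is returned unchanged by both.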
This logic is already there. I don't want to break things.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Signed-off-by: Huang, Zeyu <11222265+fhfuih@users.noreply.github.com>
Purpose
As discussed before in #797, many diffusion pipelines define several extra parameters in the `forward` function. They have never been used: the forward function has always been called with only one `OmniDiffusionRequest` object (even before PR 797).

This PR does this refactor. In particular, it also aims to address this discussion: #1196, and ships as a companion to #1196, so that the "how to add a new model" documentation teaches developers to follow the correct paradigm.
Test Plan
No new features are added, no logic is changed. Will just run existing tests
Test Result
To be updated
Additional notes
In this refactor
- … `forward`, only `req` is passed). Only their default values are used.
- … `or` instead of `if .. is not None`, the "default" values are applied when the user-passed values are None or 0. Please see my argument below for why this is acceptable.

Why it is acceptable to apply alternative default values when the user passes 0:
- … `forward` function signature. It means the pipeline authors intended to adopt these parameters (but could not in the current codebase). So it makes sense to use these values when the user-passed value is deliberately invalid (i.e., an explicit 0).
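The behavioural difference between the two fallback styles mentioned in the notes can be sketched as follows. The helper names are hypothetical; 512 is the `max_sequence_length` fallback that appears in the reviewed diff.

```python
def fallback_or(value, default):
    # `or` treats every falsy value (None, 0, 0.0, "") as "not provided".
    return value or default


def fallback_is_not_none(value, default):
    # Only None counts as "not provided"; an explicit 0 is kept.
    return value if value is not None else default


# Comparing the two styles on the inputs discussed in the notes:
for user_value in (None, 0, 256):
    print(user_value, fallback_or(user_value, 512), fallback_is_not_none(user_value, 512))
```

The two helpers agree on `None` and on ordinary positive values; they diverge only on an explicit 0, which is the case the notes argue can safely take the pipeline default.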