[feature]: support flux2.klein cache_dit #1209
nuclearwu wants to merge 3 commits into vllm-project:main
Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 656f2b142f
ZJY0516 left a comment
LGTM. But also cc @SamitHuang
Pull request overview
This pull request adds cache-dit acceleration support for the FLUX.2-klein-4B diffusion model. The implementation follows established patterns in the codebase and yields a reported 1.4-1.5x speedup when cache_dit is enabled.
Changes:
- Added the `enable_cache_for_flux2_klein` function to enable cache-dit for the FLUX.2-klein-4B pipeline with model-specific configuration (Fn_compute_blocks=2, forward patterns Pattern_1 and Pattern_2)
- Registered the new pipeline in the `CUSTOM_DIT_ENABLERS` dictionary
- Updated documentation to list FLUX.2-klein as supported by cache-dit acceleration
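The registration pattern described in the changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the `CacheSettings` dataclass, the `enable_cache` dispatcher, the dummy `Flux2KleinPipeline` class, and the dictionary key are all assumptions made for the example. Only the names `enable_cache_for_flux2_klein` and `CUSTOM_DIT_ENABLERS` and the values `Fn_compute_blocks=2`, `Pattern_1`, and `Pattern_2` come from the review summary. The real enabler calls into cache-dit, which this sketch deliberately does not import.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class CacheSettings:
    """Hypothetical container for the values named in the review summary."""
    fn_compute_blocks: int
    forward_patterns: Tuple[str, ...]


class Flux2KleinPipeline:
    """Dummy stand-in for the real pipeline class (name is an assumption)."""


def enable_cache_for_flux2_klein(pipeline: object) -> CacheSettings:
    # Stand-in body: the real function configures cache-dit on the pipeline's
    # transformer. Here we only record the model-specific settings.
    settings = CacheSettings(
        fn_compute_blocks=2,
        forward_patterns=("Pattern_1", "Pattern_2"),
    )
    setattr(pipeline, "_cache_settings", settings)  # illustrative attribute
    return settings


# Registry mapping a pipeline class name to its enabler function.
# The key used here is assumed for illustration.
CUSTOM_DIT_ENABLERS: Dict[str, Callable[[object], CacheSettings]] = {
    "Flux2KleinPipeline": enable_cache_for_flux2_klein,
}


def enable_cache(pipeline: object) -> Optional[CacheSettings]:
    """Hypothetical dispatcher: look up the enabler by pipeline class name."""
    enabler = CUSTOM_DIT_ENABLERS.get(type(pipeline).__name__)
    if enabler is None:
        return None  # no custom enabler registered for this pipeline
    return enabler(pipeline)
```

Keeping the per-model settings inside a dedicated enabler and dispatching through a dictionary means that supporting another pipeline only requires a new function and one registry entry.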
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vllm_omni/diffusion/cache/cache_dit_backend.py | Adds cache-dit enabler function for FLUX.2-klein with model-specific configuration and registers it in the pipeline map |
| docs/user_guide/diffusion_acceleration.md | Updates supported models table to include FLUX.2-klein with cache-dit support |
Purpose
Support cache_dit acceleration for FLUX.2-klein.
Test Plan
Test Result
vLLM-Omni:
Reproduced with 4xA800.