
[feature]: support flux2.klein cache_dit #1209

Open
nuclearwu wants to merge 3 commits into vllm-project:main from nuclearwu:klein

Conversation


@nuclearwu (Contributor) commented on Feb 5, 2026

Signed-off-by: wuzhongjian wuzhongjian_yewu@cmss.chinamobile.com


Purpose

Support cache_dit acceleration for the FLUX.2-klein-4B pipeline.

Test Plan

python examples/offline_inference/text_to_image/text_to_image.py \
  --model /workspace/cache/ymttest/johnjan/models/black-forest-labs/FLUX___2-klein-4B/ \
  --prompt "A cat holding a sign that says hello world" \
  --seed 42 \
  --tensor_parallel_size 1 \
  --num_images_per_prompt 1 \
  --num_inference_steps 50 \
  --guidance_scale 4.0 \
  --cache_backend cache_dit \
  --height 1024 \
  --width 1024 \
  --output outputs/flux-klein.png

Test Result

vLLM-Omni, reproduced on 4×A800:

| Model / TP | TP=1 | TP=2 | TP=4 | cache_dit |
| --- | --- | --- | --- | --- |
| FLUX.2-klein-4B | e4cbf16128869c185032123b560f913a | e4cbf16128869c185032123b560f913a | e4cbf16128869c185032123b560f913a | |
| Time | 14.9125 s/img | 12.9658 s/img | 10.8902 s/img | no |
| Time | 10.3670 s/img | 9.2590 s/img | 7.5931 s/img | yes |
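
The per-image times above correspond to a speedup of roughly 1.4× at every TP setting. A quick check (pure arithmetic, values copied from the table):

```python
# Speedup implied by the Test Result table; values copied verbatim.
baseline = {1: 14.9125, 2: 12.9658, 4: 10.8902}  # s/img, cache_dit off
cached = {1: 10.3670, 2: 9.2590, 4: 7.5931}      # s/img, cache_dit on

for tp in (1, 2, 4):
    print(f"TP={tp}: {baseline[tp] / cached[tp]:.2f}x speedup")
# TP=1: 1.44x  TP=2: 1.40x  TP=4: 1.43x
```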

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as the command(s) used for testing.
  • The test results, such as a before/after comparison or end-to-end results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 656f2b142f


Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>
@nuclearwu (Contributor, Author) commented:

cc @hsliuustc0106 @ZJY0516

@ZJY0516 (Collaborator) left a comment


LGTM. But also cc @SamitHuang

Copilot AI left a comment


Pull request overview

This pull request adds cache-dit acceleration support for the FLUX.2-klein-4B diffusion model. The implementation follows established patterns in the codebase and yields roughly a 1.4× speedup (per the test results above) when cache_dit is enabled.

Changes:

  • Added an enable_cache_for_flux2_klein function to enable cache-dit for the FLUX.2-klein-4B pipeline with model-specific configuration (Fn_compute_blocks=2, forward patterns Pattern_1 and Pattern_2); a sketch of this shape follows the list
  • Registered the new pipeline in CUSTOM_DIT_ENABLERS dictionary
  • Updated documentation to list FLUX.2-klein as supported by cache-dit acceleration
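
For orientation, here is a minimal sketch of what such an enabler can look like. This is not the PR's actual code: enable_cache, BlockAdapter, and ForwardPattern follow cache-dit's public API (whose exact signature varies between versions), while the pipeline block attribute names and the registry key are illustrative assumptions.

```python
# Hypothetical sketch, not the merged implementation. cache-dit API names
# are used as referenced in this PR; block attribute names are assumptions.
import cache_dit
from cache_dit import BlockAdapter, ForwardPattern

# In cache_dit_backend.py this map already exists; redefined here for context.
CUSTOM_DIT_ENABLERS: dict = {}


def enable_cache_for_flux2_klein(pipe, **cache_kwargs):
    """Enable cache-dit for a FLUX.2-klein-4B pipeline."""
    cache_dit.enable_cache(
        BlockAdapter(
            pipe=pipe,
            transformer=pipe.transformer,
            # Two block lists map to two forward patterns: Pattern_1 for the
            # double-stream blocks, Pattern_2 for the single-stream blocks
            # (attribute names below are guesses for illustration).
            blocks=[
                pipe.transformer.transformer_blocks,
                pipe.transformer.single_transformer_blocks,
            ],
            forward_pattern=[
                ForwardPattern.Pattern_1,
                ForwardPattern.Pattern_2,
            ],
        ),
        # Always recompute the first 2 blocks per step before deciding
        # whether cached residuals can be reused for the remaining blocks.
        Fn_compute_blocks=2,
        **cache_kwargs,
    )


# Register the enabler under the pipeline's class name (key is illustrative).
CUSTOM_DIT_ENABLERS["Flux2KleinPipeline"] = enable_cache_for_flux2_klein
```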

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| vllm_omni/diffusion/cache/cache_dit_backend.py | Adds the cache-dit enabler function for FLUX.2-klein with model-specific configuration and registers it in the pipeline map |
| docs/user_guide/diffusion_acceleration.md | Updates the supported-models table to include FLUX.2-klein with cache-dit support |


Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>
