
[Feature] Support Wan2.2 T2V and I2V Online Serving with OpenAI /v1/videos API #1073

Open

SamitHuang wants to merge 15 commits into vllm-project:main from SamitHuang:wan22_online

Conversation

@SamitHuang
Collaborator

@SamitHuang SamitHuang commented Jan 29, 2026


Purpose

  • Support Wan2.2 T2V and I2V Online Serving
  • Add an OpenAI-style T2V and I2V generation API for Wan2.2 that can be reused by other text-to-video models

New APIs

POST /v1/videos

OpenAI-style video generation endpoint.
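
A minimal Python client sketch for this endpoint (assuming a server on port 8091; the multipart field names mirror the -F flags in the test plan below, and the helper name and values here are illustrative, not API defaults):

import base64

import requests

# Illustrative client for POST /v1/videos; field names mirror the curl -F flags
# in the test plan below, and the port/output path are assumptions of this sketch.
def generate_video(prompt: str, server: str = "http://localhost:8091") -> bytes:
    fields = {
        "prompt": prompt,
        "size": "832x480",
        "num_frames": "33",
        "fps": "16",
        "num_inference_steps": "40",
        "guidance_scale": "4.0",
        "seed": "42",
    }
    # (None, value) tuples make requests send each field as multipart/form-data
    resp = requests.post(f"{server}/v1/videos", files={k: (None, v) for k, v in fields.items()})
    resp.raise_for_status()
    return base64.b64decode(resp.json()["data"][0]["b64_json"])

with open("wan22_t2v_output.mp4", "wb") as f:
    f.write(generate_video("Two anthropomorphic cats boxing on a spotlighted stage."))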

Main Logic

The handler maps request fields to OmniDiffusionSamplingParams, routes to
the correct execution backend, extracts the video output, and encodes MP4
to base64.

Client
  |
  | POST /v1/videos (multipart)
  v
APIServer
  |
  v
OmniOpenAIServingVideo
  |
  v
OmniDiffusionSamplingParams
  |
  v
Backend Router
  |----------------------|
  |                      |
  v                      v
AsyncOmniDiffusion    AsyncOmni
  |                      |
  v                      v
DiffusionEngine      DiffusionEngine   (Wan2.2 T2V or I2V)
  |                      |
  v                      v
OmniRequestOutput    OmniRequestOutput
  \______________________/
             |
             v
      encode_video_base64
             |
             v
   VideoGenerationResponse
             |
             v
           Client

When a t2v or i2v request arrives:

  1. For i2v, decode input_reference (reference image) and attach it to multi_modal_data.image.
  2. Parse request and assemble OmniDiffusionSamplingParams.
  3. Determine backend:
    • Pure diffusion: single diffusion stage; use AsyncOmni or
      AsyncOmniDiffusion depending on server configuration.
    • Multi-stage: build sampling_params_list aligned with stage types.
  4. Extract video outputs from OmniRequestOutput.
  5. Encode MP4 with diffusers.utils.export_to_video.
  6. Return VideoGenerationResponse with b64_json.
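
For reference, a condensed Python sketch of steps 1-6. Only OmniDiffusionSamplingParams, export_to_video, VideoGenerationResponse, and b64_json are names from this PR; the constructor arguments, the decode_image helper, and the attribute names on the request/output objects are assumptions made for illustration, not the actual handler code.

import base64
import tempfile

from diffusers.utils import export_to_video

# Sketch of the flow above; attribute names and helpers other than those named
# in the PR description are illustrative assumptions.
async def create_video(self, request):
    prompt = {"prompt": request.prompt}
    if request.input_reference is not None:                  # step 1 (i2v only)
        image = decode_image(await request.input_reference.read())
        prompt["multi_modal_data"] = {"image": image}

    width, height = map(int, request.size.split("x"))         # step 2
    sampling_params = OmniDiffusionSamplingParams(
        width=width,
        height=height,
        num_frames=request.num_frames,
        num_inference_steps=request.num_inference_steps,
        guidance_scale=request.guidance_scale,
        seed=request.seed,
    )

    # step 3: AsyncOmni or AsyncOmniDiffusion; a multi-stage pipeline would
    # build a sampling_params_list aligned with the stage types instead
    engine = self._select_engine()
    output = await engine.generate(prompt, sampling_params)   # step 4: OmniRequestOutput

    frames = output.videos[0]                                  # step 5: MP4, then base64
    with tempfile.TemporaryDirectory() as tmpdir:
        path = f"{tmpdir}/out.mp4"
        export_to_video(frames, path, fps=request.fps)
        with open(path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode()

    return VideoGenerationResponse(data=[{"b64_json": b64}])  # step 6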

Main Changes

  • Protocol schema: vllm_omni/entrypoints/openai/protocol/videos.py
  • API utils: vllm_omni/entrypoints/openai/video_api_utils.py
  • Handler: vllm_omni/entrypoints/openai/serving_video.py
  • Routing: vllm_omni/entrypoints/openai/api_server.py
  • Compatibility: vllm_omni/entrypoints/async_omni.py
  • Examples:
    • examples/online_serving/text_to_video/run_curl_text_to_video.sh
    • examples/online_serving/image_to_video/run_curl_image_to_video.sh
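
The new protocol module is not reproduced in this description; a rough Pydantic sketch of the request/response shapes, with field names and example defaults inferred from the curl commands in the test plan, could look like the following (the actual videos.py may differ):

from typing import Optional

from pydantic import BaseModel

# Rough shape inferred from the -F fields in the test plan; the defaults shown
# here are just the example values from the curl commands, and the real schema
# in vllm_omni/entrypoints/openai/protocol/videos.py may differ.
class VideoGenerationRequest(BaseModel):
    prompt: str
    negative_prompt: Optional[str] = None
    size: str = "832x480"                      # "WIDTHxHEIGHT"
    num_frames: int = 33
    fps: int = 16
    num_inference_steps: int = 40
    guidance_scale: float = 4.0
    guidance_scale_2: Optional[float] = None   # second-stage scale for Wan2.2
    boundary_ratio: Optional[float] = None
    flow_shift: Optional[float] = None
    seed: Optional[int] = None
    # input_reference (the i2v image) arrives as a separate multipart file part

class VideoData(BaseModel):
    b64_json: str

class VideoGenerationResponse(BaseModel):
    data: list[VideoData]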

Test Plan

T2V

Launch the server

vllm serve Wan-AI/Wan2.2-T2V-A14B-Diffusers --omni --port 8091

Send request via curl

curl -X POST http://localhost:8091/v1/videos \
  -H "Accept: application/json" \
  -F "prompt=Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
  -F "size=832x480" \
  -F "num_frames=33" \
  -F "fps=16" \
  -F "negative_prompt=色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走" \
  -F "num_inference_steps=40" \
  -F "guidance_scale=4.0" \
  -F "guidance_scale_2=4.0" \
  -F "boundary_ratio=0.875" \
  -F "flow_shift=5.0" \
  -F "seed=42" | jq -r '.data[0].b64_json' | base64 -d > "wan22_t2v_output.mp4"

I2V

Launch the server

vllm serve Wan-AI/Wan2.2-I2V-A14B-Diffusers --omni --port 8091 --boundary-ratio 0.875 --flow-shift 12.0

Send request via curl (multipart)

curl -X POST http://localhost:8091/v1/videos \
  -H "Accept: application/json" \
  -F "prompt=A bear playing with yarn, smooth motion" \
  -F "negative_prompt=low quality, blurry, static" \
  -F "input_reference=@examples/offline_inference/image_to_video/qwen-bear.png" \
  -F "size=832x480" \
  -F "num_frames=33" \
  -F "fps=16" \
  -F "num_inference_steps=40" \
  -F "guidance_scale=1.0" \
  -F "guidance_scale_2=1.0" \
  -F "boundary_ratio=0.875" \
  -F "flow_shift=12.0" \
  -F "seed=42" | jq -r '.data[0].b64_json' | base64 -d > "wan22_i2v_output.mp4"

Test Result

T2V

wan22_output.mp4

I2V

wan22_i2v_output.mp4

Future considerations

  • Async processing for video generation, storage, and retrieval
  • Video streaming output

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


Introduces a video generation API with an extensible request schema and shared diffusion routing so Wan2.2 and future video models can be served consistently.

@SamitHuang SamitHuang marked this pull request as draft January 29, 2026 09:28

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b170ef4fc5


@SamitHuang SamitHuang marked this pull request as ready for review January 29, 2026 10:01

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4a5b024bd5


Comment on lines 28 to 31
if video_tensor.is_floating_point():
    video_tensor = video_tensor.clamp(-1, 1) * 0.5 + 0.5
video_array = video_tensor.float().numpy()
return _normalize_single_video_array(video_array)


P2: Normalize uint8 tensors before float cast

If a model returns video frames as a uint8 torch tensor (0–255), _normalize_video_tensor casts to float before calling _normalize_single_video_array. That skips the integer-scaling path and instead clamps values to [-1, 1], turning most pixels into 1.0 (washed‑out/white frames). Handle integer tensors before the float cast (e.g., scale by 255 or preserve dtype) so post‑processed uint8 outputs encode correctly.
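
A sketch of the handling this comment suggests (not code from this PR): branch on the integer dtype before the float cast.

import torch

# Sketch of the suggested fix, not the PR's code: scale integer frames to [0, 1]
# before the float path so uint8 outputs are not clamped into near-white frames.
def _normalize_video_tensor(video_tensor: torch.Tensor):
    if video_tensor.is_floating_point():
        video_tensor = video_tensor.clamp(-1, 1) * 0.5 + 0.5
    elif video_tensor.dtype == torch.uint8:
        video_tensor = video_tensor.float() / 255.0
    video_array = video_tensor.float().numpy()
    return _normalize_single_video_array(video_array)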


@david6666666
Collaborator

Does this PR support text-to-video and use the same endpoint?

@SamitHuang
Collaborator Author

Does this PR support text-to-video and use the same endpoint?

It supports other T2V models with the same generation endpoint.

@david6666666
Collaborator

Does this PR support text-to-video and use the same endpoint?

It supports other T2V models with the same generation endpoint.

Sorry, I meant image-to-video: does this PR support it?

@SamitHuang
Collaborator Author

Sorry, I meant image-to-video: does this PR support it?

Not currently.

@Bounty-hunter
Contributor

Bounty-hunter commented Feb 3, 2026

Should this also support /v1/videos (https://platform.openai.com/docs/api-reference/videos/create), which is multipart/form-data?

@david6666666
Collaborator

Should this also support /v1/videos (https://platform.openai.com/docs/api-reference/videos/create), which is multipart/form-data?

I think we should follow this OpenAI API endpoint. @SamitHuang WDYT?

@SamitHuang SamitHuang changed the title [Feature] Support Wan2.2 T2V Online Serving [Feature] Support Wan2.2 T2V and I2V Online Serving with OpenAI /v1/videos API Feb 6, 2026
@SamitHuang
Collaborator Author

Should this also support /v1/videos (https://platform.openai.com/docs/api-reference/videos/create), which is multipart/form-data?

I think we should follow this OpenAI API endpoint. @SamitHuang WDYT?

Agreed, I have updated it accordingly.
