
[Doc] Adding models/pipelines/features Tutorial#1196

Open
wtomin wants to merge 43 commits into vllm-project:main from wtomin:doc-refine

@wtomin
Contributor

@wtomin wtomin commented Feb 4, 2026

Comments and suggested changes are welcome!

Purpose

It is important to have clear, easy-to-follow tutorials on how to adapt Hugging Face models/pipelines to vLLM-Omni and how to support the following features:

  • torch.compile
  • CFG Parallel
  • Tensor Parallel
  • Sequence Parallel
  • Cache acceleration: cache-dit and TeaCache
  • Quantization (coming soon, not included in this PR)
  • Patch VAE Parallel (coming soon, not included in this PR)

The HowToAdd tutorial should at least cover the following content:


# How to add [Feature Name] support for a new model

## Table of Contents

## Overview

### What is xx?
Concept explanation

### Architecture
Key APIs required for adaptation

## Step-by-Step Implementation
## Customization (if special models require customized adaptation)
Or, for features with multiple implementation approaches:
## Approach 1: [Method Name]
## Approach 2: [Alternative Method Name]


## Testing
How to test the performance and quality

## Troubleshooting
Common issues and solutions

## Reference Implementations
Complete examples 

## Summary

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.

BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)

@dongbo910220
Contributor

This is a much-needed tutorial. Excited to see this land.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds comprehensive tutorial documentation for adapting HuggingFace models/pipelines to vLLM-Omni and supporting various advanced features including parallelism strategies and cache acceleration.

Changes:

  • Reorganized parallelism documentation by moving detailed SP/CFG-Parallel content from parallelism_acceleration.md to dedicated feature-specific guides
  • Completely rewrote adding_diffusion_model.md with step-by-step instructions, examples, and troubleshooting
  • Added five new feature tutorial documents covering tensor parallel, CFG parallel, sequence parallel, TeaCache, and Cache-DiT
  • Updated navigation structure to include new "Advanced Features" section

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| docs/user_guide/diffusion/parallelism_acceleration.md | Removed detailed implementation guides (moved to dedicated feature docs) |
| docs/contributing/model/adding_diffusion_model.md | Complete rewrite with comprehensive step-by-step guide for adding diffusion models |
| docs/contributing/features/tensor_parallel.md | New guide for adding Tensor Parallel support to transformers |
| docs/contributing/features/teacache.md | New guide for adding TeaCache acceleration support |
| docs/contributing/features/sequence_parallel.md | New guide for adding Sequence Parallel support (moved from parallelism_acceleration.md) |
| docs/contributing/features/cfg_parallel.md | New guide for adding CFG-Parallel support to pipelines (moved from parallelism_acceleration.md) |
| docs/contributing/features/cache_dit.md | New guide for adding Cache-DiT acceleration support |
| docs/.nav.yml | Added "Advanced Features" navigation section with links to new feature guides |



---

### Step 4: Add Example Script
Contributor

Add an example and documentation? Executable examples live at examples/(offline or online)/(task and modality)/*.{py|sh}, and their documentation lives at examples/(offline or online)/(task and modality)/*.md and docs/user_guide/examples/(offline or online)/*.md

Contributor

This is different from the supported_models and ...acceleration Markdown files you added below.
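
One possible reading of the layout described in this thread, as a minimal Python sketch. The helper `example_locations` and the values `offline`, `text_to_image`, and `my_model` are hypothetical placeholders, not names taken from the repository; only the three path patterns come from the review comment.

```python
from pathlib import PurePosixPath


def example_locations(mode_dir: str, task_dir: str, name: str) -> dict[str, PurePosixPath]:
    """Map one example to the three locations named in the review comment.

    mode_dir, task_dir and name are hypothetical inputs standing in for
    "(offline or online)", "(task and modality)" and the file stem.
    """
    example_dir = PurePosixPath("examples") / mode_dir / task_dir
    return {
        # Executable example (the comment also allows *.sh scripts).
        "script": example_dir / f"{name}.py",
        # Documentation that sits next to the script.
        "example_doc": example_dir / f"{name}.md",
        # Documentation under the user guide.
        "user_guide_doc": PurePosixPath("docs/user_guide/examples") / mode_dir / f"{name}.md",
    }


# Placeholder values chosen only for illustration.
locations = example_locations("offline", "text_to_image", "my_model")
for kind, path in locations.items():
    print(f"{kind}: {path}")
```

A checklist like this could help confirm that a new model's example script and both documentation files are included in the same PR.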

@wtomin
Contributor Author

wtomin commented Feb 5, 2026

@dongbo910220 Comments are welcome! I think you may be interested in the SP document.

Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
@wtomin
Contributor Author

wtomin commented Feb 5, 2026

@hadipash Hello, I think you have worked on Tensor Parallel support for diffusion models. Can you give your comments?

@wtomin wtomin marked this pull request as ready for review February 5, 2026 02:31
@wtomin
Contributor Author

wtomin commented Feb 5, 2026

@mxuax @ZJY0516 @SamitHuang Please leave your comments or suggested changes. Thank you very much!

@ZJY0516
Collaborator

ZJY0516 commented Feb 5, 2026

I suggest you take a look at the generated webpage — it may have some rendering issues. For example:

image

wtomin and others added 27 commits February 6, 2026 10:17
Signed-off-by: Didan Deng <33117903+wtomin@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: dongbo910220 <32610838+dongbo910220@users.noreply.github.com>
@hadipash
Contributor

hadipash commented Feb 6, 2026

LGTM. Very clear docs 👍

6 participants