v0.15.0rc1
Pre-release
This pre-release aligns vLLM-Omni with upstream vLLM v0.15.0.
Highlights
- Rebase to Upstream vLLM v0.15.0: vLLM-Omni is now fully aligned with the latest vLLM v0.15.0 core, bringing in all the latest upstream features, bug fixes, and performance improvements (#1159).
- Tensor Parallelism for LongCat-Image: We have added Tensor Parallelism (TP) support for the `LongCat-Image` and `LongCat-Image-Edit` models, significantly improving the inference speed and scalability of these vision-language models (#926). A hedged usage sketch follows this list.
- TeaCache Optimization: Introduced Coefficient Estimation for TeaCache, further refining the efficiency of our caching mechanisms for optimized generation (#940). See the second sketch after this list.
- Alignment & Stability: Error handling is now aligned with upstream vLLM (#1122), improving consistency of failure reporting.
- Update paper link: An initial paper on arXiv introduces our design and reports some performance test results (#1169).
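As a quick illustration of the new TP support, here is a minimal sketch assuming vLLM-Omni keeps vLLM's standard `tensor_parallel_size` engine argument; the model ID below is illustrative, and the exact multimodal generation call may differ from what is shown:

```python
# Minimal sketch: enabling tensor parallelism for LongCat-Image.
# Assumes vLLM-Omni keeps vLLM's standard engine arguments; the
# model ID is hypothetical and shown only for illustration.
from vllm import LLM

if __name__ == "__main__":
    llm = LLM(
        model="meituan-longcat/LongCat-Image",  # hypothetical model ID
        tensor_parallel_size=2,                 # shard weights across 2 GPUs
    )
```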
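For context on the TeaCache change, below is a minimal sketch of coefficient estimation as the TeaCache technique is generally described: fit a polynomial that predicts the relative change in model output from the relative change in the timestep embedding, then skip recomputation until the accumulated estimate crosses a threshold. Function names, the polynomial degree, and the threshold are illustrative assumptions, not vLLM-Omni's actual code:

```python
# Minimal sketch of TeaCache-style coefficient estimation (a reading of
# the general technique, not vLLM-Omni's implementation). Profiled L1
# relative distances are used to fit polynomial coefficients offline;
# at inference time the fitted polynomial decides when to reuse cache.
import numpy as np

def estimate_coefficients(emb_deltas, out_deltas, degree=4):
    """Fit a polynomial mapping relative timestep-embedding change to
    relative model-output change (both 1-D arrays of profiled values)."""
    return np.polyfit(emb_deltas, out_deltas, degree)

def should_recompute(coeffs, emb_delta, accumulated, threshold=0.1):
    """Accumulate the predicted output change; trigger a fresh forward
    pass (and reset the accumulator) once the estimate crosses the
    threshold, otherwise reuse the cached result."""
    accumulated += float(np.polyval(coeffs, emb_delta))
    if accumulated >= threshold:
        return True, 0.0
    return False, accumulated
```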
What's Changed
Features & Optimizations
- [TeaCache]: Add Coefficient Estimation by @princepride in #940
- [Feature] add Tensor Parallelism to LongCat-Image(-Edit) by @hadipash in #926
Alignment & Integration
- Dev/rebase v0.15.0 by @tzhouam in #1159
- [Misc] Align error handling with upstream vLLM v0.14.0 by @ceanna93 in #1122
- [Misc] Bump version to 0.14.0 by @ywang96 in #1128
Infrastructure (CI/CD) & Documentation
- [Doc] First stable release of vLLM-Omni by @ywang96 in #1129
- [CI]: Bagel E2E Smoked Test by @princepride in #1074
- [CI] Refactor test_sequence_parallel.py and add a warmup run by @mxuax in #1165
- [CI] Temporarily remove slow tests. by @congw729 in #1143
- [Debug] Clear Dockerfile.ci to accelerate build image by @tzhouam in #1172
- [Debug] Correct Unreasonable Long Timeout by @tzhouam in #1175
- [Docs] Update paper link by @hsliuustc0106 in #1169
Full Changelog: v0.14.0...v0.15.0rc1