v0.15.0rc1

Pre-release

@david6666666 released this 03 Feb 09:23
d6f93b0

This pre-release aligns vLLM-Omni with upstream vLLM v0.15.0.

Highlights

  • Rebase to Upstream vLLM v0.15.0: vLLM-Omni is now fully aligned with the latest vLLM v0.15.0 core, bringing in all the latest upstream features, bug fixes, and performance improvements (#1159).
  • Tensor Parallelism for LongCat-Image: We have added Tensor Parallelism (TP) support for the LongCat-Image and LongCat-Image-Edit models, significantly improving the inference speed and scalability of these vision-language models (#926); see the usage sketch after this list.
  • TeaCache Optimization: Introduced Coefficient Estimation for TeaCache, further refining the efficiency of our caching mechanisms for optimized generation (#940).
  • Alignment & Stability:
    • Enhanced error handling logic to maintain consistency with upstream vLLM v0.14.0/v0.15.0 standards (#1122).
    • Integrated "Bagel" E2E Smoke Tests and refactored sequence parallel tests to ensure robust CI/CD and accurate performance benchmarking (#1074, #1165).
  • Paper link: We published an initial paper on arXiv introducing our design and some performance test results (#1169).

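For reference, here is a minimal sketch of enabling TP for LongCat-Image, assuming vLLM-Omni keeps vLLM's standard `LLM` Python entrypoint; the model identifier is illustrative, not an official release artifact:

```python
# Minimal sketch: enable Tensor Parallelism, assuming vLLM-Omni exposes
# vLLM's standard LLM entrypoint. The model ID below is hypothetical.
from vllm import LLM

llm = LLM(
    model="meituan-longcat/LongCat-Image",  # hypothetical model identifier
    tensor_parallel_size=2,                 # shard model weights across 2 GPUs
)
```

The equivalent `--tensor-parallel-size` flag applies when launching through `vllm serve`.
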
What's Changed

Features & Optimizations

Alignment & Integration

Infrastructure (CI/CD) & Documentation

New Contributors

Full Changelog: v0.14.0...v0.15.0rc1