Conversation


@dependabot dependabot bot commented on behalf of github Feb 5, 2026

Bumps vllm from 0.11.2 to 0.15.1.

Release notes

Sourced from vllm's releases.

v0.15.1 is a patch release with security fixes, RTX Blackwell GPU support fixes, and other bug fixes.

Security

  • Patched Protobuf for CVE-2026-0994 (#33619)
  • Patched aiohttp for CVE-2025-69223 (#33621)
Highlights

Bugfix Hardware Support

  • RTX Blackwell (SM120): Fixed NVFP4 MoE kernel support for RTX Blackwell workstation GPUs. Previously, NVFP4 MoE models would fail to load on these GPUs (#33417)
  • FP8 kernel selection: Fixed FP8 CUTLASS group GEMM to properly fall back to Triton kernels on SM120 GPUs (#33285)

Model Support

  • Step-3.5-Flash: New model support (#33523)

Bugfix Model Support

  • Qwen3-VL-Reranker: Fixed model loading (#33298)
  • Whisper: Fixed FlashAttention2 with full CUDA graphs (#33360)

Performance

  • torch.compile cold-start: Fixed regression that increased cold-start compilation time (Llama3-70B: ~88s → ~22s) (#33441)
  • MoE forward pass: Optimized by caching layer name computation (#33184)

Bug Fixes

  • Fixed prefix cache hit rate of 0% with GPT-OSS style hybrid attention models (#33524)
  • Enabled Triton MoE backend for FP8 per-tensor dynamic quantization (#33300)
  • Disabled unsupported Renormalize routing methods for TRTLLM per-tensor FP8 MoE (#33620)
  • Fixed speculative decoding metrics crash when no tokens generated (#33729)
  • Disabled fast MoE cold start optimization with speculative decoding (#33624)
  • Fixed ROCm skinny GEMM dispatch logic (#33366)

Dependencies

  • Pinned LMCache >= v0.3.9 for API compatibility (#33440)

New Contributors 🎉

Full Changelog: vllm-project/vllm@v0.15.0...v0.15.1

v0.15.0

Highlights

This release features 335 commits from 158 contributors (39 new)!

Model Support

  • New architectures: Kimi-K2.5 (#33131), Molmo2 (#30997), Step3vl 10B (#32329), Step1 (#32511), GLM-Lite (#31386), Eagle2.5-8B VLM (#32456).
  • LoRA expansion: Nemotron-H (#30802), InternVL2 (#32397), MiniMax M2 (#32763).
  • Speculative decoding: EAGLE3 for Pixtral/LlavaForConditionalGeneration (#32542), Qwen3 VL MoE (#32048), draft model support (#24322).
  • Embeddings: BGE-M3 sparse embeddings and ColBERT embeddings (#14526).

... (truncated)

Commits
  • 1892993 [BugFix][Spec Decoding] Fix negative accepted tokens metric crash (#33729)
  • 7d98f09 cherry pick
  • daa2784 [Bugfix] Disable RoutingMethodType.[Renormalize,RenormalizeNaive] TRTLLM per-...
  • e4bf6ed [torch.compile] Don't do the fast moe cold start optimization if there is spe...
  • 611b187 [torch.compile] Speed up MOE handling in forward_context (#33184)
  • eec3546 [Misc][Build] Lazy load cv2 in nemotron_parse.py (#33189)
  • 7c023ba Patch Protobuf for CVE-2026-0994 (#33619)
  • 099a787 Patch aiohttp for CVE-2025-69223 (#33621)
  • 31a64c6 [Release] Fix format and cherry-pick (#33618)
  • 57eae2f [Release] patch step3p5 attention class in v0.15.1 release (#33602)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
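
The ignore commands above can also be expressed declaratively in the repository's Dependabot configuration. A hypothetical `.github/dependabot.yml` fragment (illustrative only; assumes a pip ecosystem at the repository root) that mirrors `@dependabot ignore this minor version` for `vllm`:

```yaml
# Illustrative config; mirrors "@dependabot ignore this minor version" for vllm.
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    ignore:
      - dependency-name: "vllm"
        update-types: ["version-update:semver-minor"]
```

Unlike the PR comment commands, this config applies to all future update runs, not just this PR.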

Bumps [vllm](https://github.com/vllm-project/vllm) from 0.11.2 to 0.15.1.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.11.2...v0.15.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.15.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
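
The `update-type: version-update:semver-minor` classification above follows from comparing the two version strings component by component: 0.11.2 → 0.15.1 leaves the major component unchanged and raises the minor one. A minimal sketch (plain tuple comparison, not a full PEP 440 parser):

```python
# Minimal sketch: classify the bump 0.11.2 -> 0.15.1 by comparing
# dotted version components numerically (not a full PEP 440 parser).
def parse(v: str) -> tuple[int, ...]:
    return tuple(int(part) for part in v.split("."))

old, new = parse("0.11.2"), parse("0.15.1")
if new[0] > old[0]:
    update_type = "semver-major"
elif new[1] > old[1]:
    update_type = "semver-minor"
else:
    update_type = "semver-patch"

print(update_type)  # semver-minor
```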
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Feb 5, 2026