Pull requests: vllm-project/vllm

Pull requests list

[Misc] downgrade nvidia-cutlass-dsl to 4.5.0 (labels: ready)
#43230 opened May 20, 2026 by ZJY0516 Member
4 tasks
[CompressedTensors] FP4 Qutlass Integration
#43229 opened May 20, 2026 by kylesayrs Contributor Draft
[Bugfix][Gemma4MoE] Fix AutoRound quantized Gemma4 MoE loading (labels: bug)
#43227 opened May 20, 2026 by wxwxwwxxx
[CPU] Experimentally enable Triton and MRV2 (labels: ci/build, cpu, v1)
#43225 opened May 20, 2026 by bigPYJ1151 Member Draft
4 tasks
Fix FlashInfer TRTLLM NvFP4 monolithic MoE routing (labels: nvidia)
#43223 opened May 20, 2026 by zhangxin81 Contributor
[EPLB] Make async EPLB default (labels: ci/build, documentation)
#43219 opened May 20, 2026 by ilmarkov Contributor
4 tasks
[ROCm] MoRI connector telemetry (labels: kv-connector, rocm)
#43218 opened May 20, 2026 by simondanielsson Contributor Draft
4 tasks
[Misc] Add exponential distribution to multi-turn benchmark (labels: performance)
#43217 opened May 20, 2026 by nikonyrh-siloai
4 tasks done
[Misc] Add --max-duration-sec to benchmark_serving_multi_turn.py (labels: performance)
#43215 opened May 20, 2026 by nikonyrh-siloai
3 of 4 tasks
[Model] Fix MiniCPM-V 4.6 vit_merger qkv weight loading
#43213 opened May 20, 2026 by tc-mb Contributor
[Bugfix] Fix multi-turn benchmark's sleep to match the configured request rate (labels: bug, performance)
#43212 opened May 20, 2026 by nikonyrh-siloai
3 of 4 tasks
[Bugfix][Reasoning] Properly detect reasoning end when using thinking_token_budget (labels: bug, v1)
#43210 opened May 20, 2026 by schoennenbeck Contributor
[Docs] Add drain shutdown section to Kubernetes deployment guide (labels: documentation)
#43208 opened May 20, 2026 by markmc Member
[KV Offload] Add get_request_offloading_context lifecycle hook (labels: kv-connector, v1)
#43205 opened May 20, 2026 by ronensc Contributor
4 tasks
[Cleanup]Simplify UnitaryKVCacheCoordinator hash_block_size assert (labels: v1)
#43204 opened May 20, 2026 by maang-h Contributor
[vLLM IR][Rope] Port RotaryEmbedding to IR Ops (labels: cpu, intel-gpu, nvidia, rocm)
#43199 opened May 20, 2026 by wxsIcey Contributor
4 tasks
【Feature】Modify the fps parameter when loading the multimodal model Video. (labels: bug, multi-modality)
#43198 opened May 20, 2026 by lucky-dep
[CI] De-flake test_models for bigscience/bloom-560m (labels: ready)
#43197 opened May 20, 2026 by haosdent Contributor