Pull requests: vllm-project/vllm

Pull requests list

enable flashinfer moe kernel for DP + EP
#36838 opened Mar 12, 2026 by czhu-cohere
3 of 5 tasks
Add simple granite4 tool parser (labels: documentation, tool-calling)
#36827 opened Mar 11, 2026 by maxdebayser
[Bugfix] Fix Qwen3.5 LoRA IndexError in packed_modules_mapping (labels: bug, qwen)
#36825 opened Mar 11, 2026 by hallerite
2 tasks done
[Model Runner V2] Do not initialize sampler for non-last PP ranks (labels: ready, v1)
#36824 opened Mar 11, 2026 by WoosukKwon
[vLLM IR] 3/N fused_add_rms_norm and maybe_inplace (labels: nvidia, torch.compile, vllm-ir)
#36823 opened Mar 11, 2026 by ProExpertProg Draft
5 tasks
[BugFix] Fix multiple/duplicate stdout prefixes (labels: bug, frontend, ready, v1)
#36822 opened Mar 11, 2026 by njhill
[Model] Add ColPali late interaction model for multi-modal retrieval (labels: documentation, multi-modality, new-model)
#36818 opened Mar 11, 2026 by Kaonael
4 of 5 tasks
[Model Runner V2] Add Support for XD-RoPE (labels: nvidia, v1)
#36817 opened Mar 11, 2026 by santiramos27
5 tasks
[vLLM IR] 2/N batch-invariant-aware dispatching and rms_norm (labels: vllm-ir)
#36816 opened Mar 11, 2026 by ProExpertProg Draft
5 tasks
[Tests] Skip model weight download for render-only test server
#36813 opened Mar 11, 2026 by sagearc
5 tasks
[Metrics] Temporary band-aid for "Counters can only be incremented by non-negative amounts" (labels: ready, v1)
#36812 opened Mar 11, 2026 by markmc
[ROCm][Perf] Fused GEMM + static FP8 output quantization (labels: rocm)
#36810 opened Mar 11, 2026 by andyluo7
Support temporal compression for videos
#36808 opened Mar 11, 2026 by collinmccarthy
5 tasks
[Bugfix] Pad Marlin FP8 MoE weight dims to tile alignment under TP > 1 (labels: bug)
#36807 opened Mar 11, 2026 by ssubhanjali
5 tasks
Only show FP4 Marlin fallback warning for w4a4 models (labels: ready)
#36806 opened Mar 11, 2026 by mgoin
[Test] E2E Nemotron-3-Super tests (labels: ci/build, nvidia, ready)
#36803 opened Mar 11, 2026 by roikoren755
5 tasks
[Bugfix] Fix Qwen2.5-omni/Qwen3-omni mm_processor cache for audio_in_video request (labels: bug, qwen)
#36800 opened Mar 11, 2026 by Isotr0py
3 of 5 tasks