Pull requests: vllm-project/vllm

[Renderer] Remove InputPreprocessor ready ONLY add when PR is ready to merge/full CI is needed v1
#38688 opened Apr 1, 2026 by DarkLight1337
5 tasks
[Perf] DSV3.2 Indexer Fused Weights Projection deepseek Related to DeepSeek models
#38684 opened Apr 1, 2026 by benchislett
[Quantization] Rename mxfp4 quant layer and oracle to gpt_oss_mxfp4 gpt-oss Related to GPT-OSS models
#38683 opened Apr 1, 2026 by zyongye
3 of 5 tasks
[CPU] Fix lscpu NUMA node regex to handle quoted - and null in containers cpu Related to CPU backends
#38681 opened Apr 1, 2026 by Monokaix
3 of 5 tasks
[CI][ROCm] Remove unsupported cases in test_fusion.py rocm Related to AMD ROCm
#38680 opened Apr 1, 2026 by charlifu
fused_moe_kernel opt
#38679 opened Apr 1, 2026 by SYChen123
5 tasks
[CPU] Support head_size 512 in cpu_attn cpu Related to CPU backends documentation Improvements or additions to documentation ready ONLY add when PR is ready to merge/full CI is needed v1
#38676 opened Apr 1, 2026 by bigPYJ1151
5 tasks
[Bugfix] Preserve original ImportError in gRPC server entrypoint bug Something isn't working frontend
#38673 opened Apr 1, 2026 by CatherineSue
3 tasks done
[Bugfix] Fix AWQ models batch invariance issues bug Something isn't working v1
#38670 opened Apr 1, 2026 by YM2132
5 tasks done
[ROCm] Enable dual-stream MoE shared experts and GLM-5 MXFP4 Quark support rocm Related to AMD ROCm v1
#38665 opened Mar 31, 2026 by ChuanLi1101
4 tasks
[CI][ROCm] Add Qwen3.5-35B-A3B-MXFP4 model eval into CI qwen Related to Qwen models rocm Related to AMD ROCm
#38664 opened Mar 31, 2026 by BowenBao Draft
[Core][Feat] safely abort requests where FSM failed to advance v1
#38663 opened Mar 31, 2026 by walterbm
3 of 5 tasks
[2/N] Pass model_config to the Attention constructors deepseek Related to DeepSeek models gpt-oss Related to GPT-OSS models llama Related to Llama models qwen Related to Qwen models ready ONLY add when PR is ready to merge/full CI is needed
#38661 opened Mar 31, 2026 by MatthewBonanni
3 of 5 tasks
[compile] Invoke split FX graph by codegen.
#38657 opened Mar 31, 2026 by zhxchen17
5 tasks
Fix Nano Nemotron VL regressions multi-modality Related to multi-modality (#4194)
#38655 opened Mar 31, 2026 by netanel-haber
[Bugfix] Fix vllm bench serve to count multimodal tokens in "total input tokens" bug Something isn't working performance Performance-related issues
#38654 opened Mar 31, 2026 by mgehre-amd