Pull requests: vllm-project/tpu-inference

Pull requests list

Add ragged_conv1d for ragged chunked kernel integration for gdn attention (label: ready, "ONLY add when PR is ready to merge/full CI is needed")
#2099 opened Mar 31, 2026 by helloworld1
[Environment variable override] Override VLLM_USE_AOT_COMPILE to False by default (label: ready)
#2097 opened Mar 31, 2026 by jrplatin
Fix fork PR permissions
#2094 opened Mar 31, 2026 by ylangtsou
Call _align_hybrid_block_size in TpuPlatform (label: ready)
#2090 opened Mar 31, 2026 by ShobhitBehl
[Draft] Add Qwen3.5 to CI/CD
#2086 opened Mar 30, 2026 by jrplatin Draft
Set prompt_token_ids_cpu=None to match upstream interface change (label: ready)
#2085 opened Mar 30, 2026 by pv97
Pipeline psum and sc kernel
#2083 opened Mar 30, 2026 by clee1994
env fix
#2077 opened Mar 28, 2026 by clee1994
Fix compressed tensors moe test. (label: ready)
#2076 opened Mar 28, 2026 by dmmolitor
Reorganization of ToC
#2064 opened Mar 27, 2026 by mtsokol Draft
test
#2061 opened Mar 27, 2026 by meiyeh123 Draft
Do not submit. (label: ready)
#2053 opened Mar 26, 2026 by QiliangCui
transcedentals cost in cost estimate (label: ready)
#2044 opened Mar 26, 2026 by coolkp
Wire AWQ dense layers to use GMM V2 kernel for W4A16 matmul
#2038 opened Mar 26, 2026 by rohan-reddy
5 tasks done
Hybrid MoE, combining EP and TP
#2001 opened Mar 23, 2026 by zhangamy-crypto
Fix #957: Support video input for Qwen2.5-VL
#1992 opened Mar 21, 2026 by codeXsidd
[Parallelism Support Matrix Tests] Replace flaky EP relative comparison with hardcoded absolute baseline (label: ready)
#1984 opened Mar 20, 2026 by syhuang22
[Fused MoE] Use jax.nn.sigmoid (label: ready)
#1980 opened Mar 20, 2026 by catswe
round kv cache to allow expert sharding
#1979 opened Mar 20, 2026 by khatwanimohit
[WIP] RPA KV update across bq
#1978 opened Mar 19, 2026 by helloworld1 Draft
reorg support matrices for UX (labels: documentation, ready)
#1975 opened Mar 19, 2026 by jcyang43