Pull requests: vllm-project/tpu-inference

Add pre-quantized FP4 MoE weight loading
#1906 opened Mar 11, 2026 by bgchun-fs
Temp fix pp (label: ready; ONLY add when PR is ready to merge/full CI is needed)
#1903 opened Mar 11, 2026 by pv97
test fix for tp
#1902 opened Mar 10, 2026 by pv97 Draft
add dummy variable for logprob to avoid cache collision (label: ready)
#1895 opened Mar 10, 2026 by sixiang-google
Correctly trigger node limited routing in DeepSeek-V3 JAX path. (label: ready)
#1891 opened Mar 10, 2026 by gpolovets1
Fix multihost pp errors
#1890 opened Mar 9, 2026 by pv97
Add MaxText post-training tests to CI (label: ready)
#1876 opened Mar 6, 2026 by SurbhiJainUSC Draft
test upload unit test duration artifact in nightly build (label: ready)
#1875 opened Mar 6, 2026 by ernie-chang Draft
[Spec Decoding] Add DFlash e2e tests and Buildkite CI
#1870 opened Mar 5, 2026 by aaronzhfeng
3 tasks done
[Spec Decoding] Integrate DFlash into speculative decoding pipeline
#1869 opened Mar 5, 2026 by aaronzhfeng
3 tasks done
[Spec Decoding] Add DFlash model and proposer (label: ready)
#1868 opened Mar 5, 2026 by aaronzhfeng
3 tasks done
[Jax][Deepseek] Updated sharding scales to align with GMM_TP settings. (label: ready)
#1865 opened Mar 5, 2026 by gpolovets1
Used condition variable for event polling (label: ready)
#1857 opened Mar 4, 2026 by datenglin
MLA KV Cache Fusion TPU Inference (label: ready)
#1856 opened Mar 4, 2026 by mourado
Move FP8 MoE weight requantization from CPU to TPU (label: ready)
#1842 opened Mar 3, 2026 by rohan-reddy
7 of 8 tasks
Fix recompilation check code (label: ready)
#1839 opened Mar 3, 2026 by kyuyeunk
[RPA3] Separate kernel to 3 calls and many optimizations (label: ready)
#1820 opened Feb 27, 2026 by bythew3i