Pull requests: vllm-project/tpu-inference
Consolidating quantized blockwise kernel and quantized matmul kernel
#1905
opened Mar 11, 2026 by
jyj0w0
Fix: Align Qwen3 dtype handling with Qwen2 for TPU
#1904
opened Mar 11, 2026 by
xianglon-commits
Temp fix pp
ready
ONLY add when PR is ready to merge/full CI is needed
#1903
opened Mar 11, 2026 by
pv97
add dummy variable for logprob to avoid cache collision
ready
#1895
opened Mar 10, 2026 by
sixiang-google
Correctly trigger node limited routing in DeepSeek-V3 JAX path.
ready
#1891
opened Mar 10, 2026 by
gpolovets1
Remove redundant Python script formatting from README generation
#1879
opened Mar 6, 2026 by
RobMulla
Format support matrix statuses natively in buildkite shell scripts
#1877
opened Mar 6, 2026 by
RobMulla
Add MaxText post-training tests to CI
ready
#1876
opened Mar 6, 2026 by
SurbhiJainUSC
•
Draft
test upload unit test duration artifact in nightly build
ready
#1875
opened Mar 6, 2026 by
ernie-chang
•
Draft
[Spec Decoding] Add DFlash e2e tests and Buildkite CI
#1870
opened Mar 5, 2026 by
aaronzhfeng
3 tasks done
[Spec Decoding] Integrate DFlash into speculative decoding pipeline
#1869
opened Mar 5, 2026 by
aaronzhfeng
3 tasks done
[Spec Decoding] Add DFlash model and proposer
ready
#1868
opened Mar 5, 2026 by
aaronzhfeng
3 tasks done
[Jax][Deepseek] Updated sharding scales to align with GMM_TP settings.
ready
#1865
opened Mar 5, 2026 by
gpolovets1
Skip outer JIT for models that self-manage sharding
#1861
opened Mar 5, 2026 by
khatwanimohit
Used condition variable for event polling
ready
#1857
opened Mar 4, 2026 by
datenglin
MLA KV Cache Fusion TPU Inference
ready
#1856
opened Mar 4, 2026 by
mourado
Move FP8 MoE weight requantization from CPU to TPU
ready
#1842
opened Mar 3, 2026 by
rohan-reddy
7 of 8 tasks
Fix recompilation check code
ready
#1839
opened Mar 3, 2026 by
kyuyeunk
Sample commit to debug batch split all reduce offload
#1834
opened Mar 2, 2026 by
rupengliu-meta
[Qwen 3] promote_dtype_for_stats flag to optionally cast RMSNorm to FP32
#1825
opened Feb 28, 2026 by
JiriesKaileh
[RPA3] Separate kernel to 3 calls and many optimizations
ready
#1820
opened Feb 27, 2026 by
bythew3i
feat: End-to-End Automation MVP (Matrices, Docs UI, & Community)
#1813
opened Feb 26, 2026 by
ica-chao