Pull requests: sgl-project/sglang
Add Intel nightly tests for XPU and CPU platforms
deepseek
#22677
opened Apr 13, 2026 by
MingxuZh
Contributor
[NPU] Support Qwen3.5-MoE and Qwen3-Next quantization
#22674
opened Apr 13, 2026 by
Dmovic
5 tasks
[Perf] Precompute gemma_weight to avoid redundant add on every forward
#22673
opened Apr 13, 2026 by
Chen-0210
Contributor
5 tasks
reland [Diffusion] Add FLUX.1-dev ModelOpt NVFP4 support
blackwell
SM100/SM120
diffusion
SGLang Diffusion
documentation
Improvements or additions to documentation
jit-kernel
quant
LLM Quantization
run-ci
#22672
opened Apr 13, 2026 by
BBuf
Collaborator
feat: Support flashinfer_cutedsl MoE runner with flashinfer alltoall backend
#22669
opened Apr 13, 2026 by
samuellees
Contributor
5 tasks done
[fix] resolve negative dimension when creating tensor in multi-layer MTP
#22666
opened Apr 13, 2026 by
fromck
5 tasks
Qwen3next flashinfer allreduce auto enable
#22664
opened Apr 13, 2026 by
BBuf
Collaborator
5 tasks
[Hotfix] final fixes for P2P Transfer
deepseek
#22663
opened Apr 13, 2026 by
JD-ETH
Contributor
[VLM] Reduce GPU memory footprint of CUDA IPC MM feature transport
run-ci
#22662
opened Apr 13, 2026 by
yhyang201
Collaborator
5 tasks
Fix/amd wheel jit kernel support
dependencies
Pull requests that update a dependency file
documentation
Improvements or additions to documentation
#22661
opened Apr 13, 2026 by
akao-amd
Contributor
5 tasks
Skip redundant moe_sum_reduce for single-expert routing on XPU
#22660
opened Apr 13, 2026 by
rahulvijayaraghavan
Contributor
Add sleep/wake support for diffusion engine
diffusion
SGLang Diffusion
documentation
Improvements or additions to documentation
PD streaming: batch notify + SSE fast path
run-ci
#22658
opened Apr 13, 2026 by
inkcherry
Contributor
5 tasks
[XPU] Support apply_router_weight_on_input for Llama4 for fused_experts
quant
LLM Quantization
#22654
opened Apr 13, 2026 by
rahulvijayaraghavan
Contributor
enable streaming session retract tests
#22651
opened Apr 13, 2026 by
hnyls2002
Collaborator
1 task
env: add knob to control SWA eviction interval
#22645
opened Apr 13, 2026 by
happierpig
Contributor
5 tasks
[sgl] update specdec sampling kernel to return valid token ID
sgl-kernel
speculative-decoding
#22643
opened Apr 12, 2026 by
2022tgoel
Contributor
5 tasks done
Replace all-reduce + dp_scatter with reduce_scatterv for DP attention
run-ci
#22642
opened Apr 12, 2026 by
YAMY1234
Contributor
3 of 5 tasks
Add HybridModel support to HiCacheHF3FS storage backend
#22641
opened Apr 12, 2026 by
guanwei-wu
Draft
1 of 5 tasks