Pull requests: sgl-project/sglang
Pull requests list
perf: sort prefill requests by lora_id in scheduler
#20582
opened Mar 14, 2026 by
satyamk7054
Draft
2 of 5 tasks
[Model] Fix NemotronH OOM on unified-mem systems: stream weights + safetensors cleanup
run-ci
#20580
opened Mar 14, 2026 by
Kh4L
[Security] Replace eval/pickle usage in hicache benchmark data processing
#20579
opened Mar 14, 2026 by
jszzr
feat: support human-readable suffixes (1k, 1M) for token CLI
run-ci
#20577
opened Mar 14, 2026 by
alphabetc1
1 of 5 tasks
[Diffusion] Clean upstream fa3 in hopper
diffusion
SGLang Diffusion
run-ci
#20576
opened Mar 14, 2026 by
BBuf
5 tasks
[CI] Add Nemotron 3 Super 120B CI tests for BF16 and NVFP4
blackwell
SM100/SM120
#20575
opened Mar 14, 2026 by
mmangkad
[Benchmark] Add triton do_bench fallback for ROCm in bench_utils
#20572
opened Mar 14, 2026 by
andyluo7
feat: emit per-iteration forward pass metrics via ZMQ PUB
#20569
opened Mar 14, 2026 by
ishandhanani
Draft
4 tasks done
Drop the mutable default in forward
diffusion
SGLang Diffusion
#20568
opened Mar 14, 2026 by
tejasae-afk
[sgl-model-gateway] Add Anthropic Messages API proxy support
model-gateway
#20566
opened Mar 14, 2026 by
skyliulu
3 of 5 tasks
fix: torch-native LoRA for multi-adapter case
#20564
opened Mar 14, 2026 by
satyamk7054
2 of 5 tasks
Use torch.addmm instead of separate mm and add_ calls for LoRA torch.native
lora
run-ci
#20562
opened Mar 14, 2026 by
satyamk7054
3 tasks done
[MLX] Use ContiguousKVCache for all batch sizes to avoid cache merge/extract overhead
dependencies
Pull requests that update a dependency file
jit-kernel
macos
#20561
opened Mar 14, 2026 by
yeahdongcn
Draft
5 tasks
[Bugfix] Fix write-through events not processed when scheduler is idle
#20560
opened Mar 14, 2026 by
youngrok-XCENA
1 task done
Fix token leak with logprob_start_len=0 in streaming sessions
#20557
opened Mar 13, 2026 by
YazhiGao
2 tasks done
feat: Add 'none' reasoning effort to ChatCompletionRequest
#20556
opened Mar 13, 2026 by
Javtor
3 tasks done
[Bug Fix] Fix EAGLE3 crash when logprobs requested with spec v2
#20555
opened Mar 13, 2026 by
yilian49
1 task done
[WIP] [Spec] 2/N: Suffix Automaton Ref-based Speculative Decoding
documentation
Improvements or additions to documentation
npu
speculative-decoding
[WIP] Support nvfp4 online weight quantization
blackwell
SM100/SM120
documentation
Improvements or additions to documentation
quant
LLM Quantization
#20549
opened Mar 13, 2026 by
wolfcomos
5 tasks
[WIP] [PCG] Enable piecewise CUDA graph testing for VLM models
Multi-modal
multi-modal language model
npu
#20548
opened Mar 13, 2026 by
edwingao28
Draft
5 tasks
[Feature] Add spec v2 (overlap scheduling) to DFlash speculative decoding support
#20547
opened Mar 13, 2026 by
dcw02
5 tasks