Commit 6c763c1
Utils refactor (#1)
* fix(ci): recover from corrupted MMMU parquet cache (sgl-project#17256)
* [diffusion] feat: support default 4-step inference for Flux2-Klein distilled models (sgl-project#17225)
Signed-off-by: Lancer <maruixiang6688@gmail.com>
* Add runner utilization report workflow (sgl-project#17234)
* cli: support sglang version (sgl-project#17250)
* Use swa radix cache and memory pool for gpt-oss model (sgl-project#17261)
* [VLM][Reland] Refactor load_mm_data to improve performance (sgl-project#16152)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
* [Tiny] Improve docs (sgl-project#17264)
* [diffusion] fix: set guidance_scale default to None (sgl-project#17182)
* Tiny fix comment typo (sgl-project#17287)
* [SPEC_V2] Enable cudagraph draft_extend for trtllm_mla_backend and Acclen Fix for DP under cudagraph mode (sgl-project#16974)
* Add kl test for swa radix cache (sgl-project#17281)
* fix: Handle multiple named chat templates in HuggingFace tokenizers (sgl-project#17236)
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
* Move radix cache related tests (sgl-project#17295)
* [Refactor] Add `--fp4-gemm-backend` to replace `SGLANG_FLASHINFER_FP4_GEMM_BACKEND` (sgl-project#16534)
Co-authored-by: Vincent Zhong <207368749+vincentzed@users.noreply.github.com>
* [Bugfix] Fix PD accuracy when MTP is not configured on the prefill node (sgl-project#17212)
Co-authored-by: Shangming Cai <csmthu@gmail.com>
* [Diffusion] Apply jit qk_norm to flux1 (sgl-project#17296)
* [Refactor] Split out deepseek v2 weight loader function into mixin (sgl-project#16649)
* [NPU]Support GPT-OSS for NPU (sgl-project#14197)
* [jit-kernel] Add CuTe DSL GDN Decode Kernel (sgl-project#15631)
Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>
* [GLM 4.7] Add RTX 6000 Pro aka sm120 (sgl-project#17235)
Co-authored-by: root <root@ubuntu-nvidia.localdomain>
* Update CODEOWNERS for multimodal_gen (sgl-project#17308)
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
* [Feature] overlap LoRA weight loading with compute (sgl-project#15512)
* [PD] Optimize MHA models pp util calculation logic (sgl-project#17306)
* [Minor] Correct sglang version when installing from source (sgl-project#17315)
* Use dsv3 optimized routing `fused_topk_deepseek` instead of `moe_fused_gate` (sgl-project#15347)
* [DeepSeek v3.2] Opt MTP decode cuda batch sizes and nsa implementation (sgl-project#16961)
* Update code sync scripts (sgl-project#17319)
* [Auto Sync] Update tokenizer_manager.py (20260119) (sgl-project#17317)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* support new qwen3_coder_detector (sgl-project#16744)
Co-authored-by: liugaoji.lgj <liugaoji.lgj@alibaba-inc.com>
* Fix kernel selection in biased_grouped_topk_gpu (sgl-project#17325)
* KV Cache Events with Attention DP bug fix (sgl-project#16030) (sgl-project#16412)
* [Perf] fuse q, k norm for Flux2Attention (sgl-project#17241)
Co-authored-by: Minglei Zhu <zminglei@linkedin.com>
* [CI] Add partition to stage-b-test-large-1-gpu (11->12) (sgl-project#17245)
* fix(ci): rate limit and permission errors in trace publishing (sgl-project#17238)
* Revert "[Perf] fuse q, k norm for Flux2Attention (sgl-project#17241)" (sgl-project#17332)
* Migrate performance, accuracy, and quantization tests to CI registry (sgl-project#17177)
Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
* Inclusion of nvfp4 blockscale in EPLB Rebalance (sgl-project#17158)
* [Refactor] Set `fp4-gemm-backend=auto` on SM100 and rename `fp4-gemm-backend` with `flashinfer_` prefix (sgl-project#17309)
* [Diffusion] Apply qknorm to flux2 and apply lightx2v rms_norm_one_pass kernel(without residual) (sgl-project#17305)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Fix v32 continue_final_message not work (sgl-project#16567)
* Evict swa kv cache during decoding (sgl-project#17220)
* [RadixTree][1/N Refactor]: Support unified match_prefix params (sgl-project#17142)
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
* [AMD CI] Migrate and Add More Testcases (sgl-project#17116)
Co-authored-by: yctseng0211 <yctseng@amd.com>
* [AMD] CI - add partitions for stage-b-test-small-1-gpu-amd (sgl-project#17345)
* Restore deepseek_v2.py to main's code, except the utils
* Ran `pre-commit`
---------
Signed-off-by: Lancer <maruixiang6688@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Hudson Xing <1277646412@qq.com>
Co-authored-by: Lancer <402430575@qq.com>
Co-authored-by: Alison Shao <54658187+alisonshao@users.noreply.github.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: Ke Bao <ispobaoke@gmail.com>
Co-authored-by: Yuan Luo <yuan.luo@hotmail.com>
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: Mohammad Miadh Angkad <mangkad.bsdsba2027@aim.edu>
Co-authored-by: Changyi Yang <112288487+ChangyiYang@users.noreply.github.com>
Co-authored-by: YAMY <74099316+YAMY1234@users.noreply.github.com>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
Co-authored-by: b8zhong <b8zhong@uwaterloo.ca>
Co-authored-by: Vincent Zhong <207368749+vincentzed@users.noreply.github.com>
Co-authored-by: Ch3ngY1 <91232537+Ch3ngY1@users.noreply.github.com>
Co-authored-by: Shangming Cai <csmthu@gmail.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Jerry Ji <jerryjilol@gmail.com>
Co-authored-by: Todobe <43903496+Todobe@users.noreply.github.com>
Co-authored-by: Jinyan Chen <93358689+liz-badada@users.noreply.github.com>
Co-authored-by: Jinyan Chen <jinyanc@nvidia.com>
Co-authored-by: Koushik Dutta <koush@koushikdutta.com>
Co-authored-by: root <root@ubuntu-nvidia.localdomain>
Co-authored-by: Glen Liu <62917497+glenliu21@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: Lee Nau <lnau@nvidia.com>
Co-authored-by: Yongfei Xu <xuyongfei.xyf@antgroup.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gaoji Liu <34803073+attack204@users.noreply.github.com>
Co-authored-by: liugaoji.lgj <liugaoji.lgj@alibaba-inc.com>
Co-authored-by: yudian0504 <138860534+yudian0504@users.noreply.github.com>
Co-authored-by: Kartik Ramesh <kartikx2000@gmail.com>
Co-authored-by: Minglei Zhu <mingleizhu1122@gmail.com>
Co-authored-by: Minglei Zhu <zminglei@linkedin.com>
Co-authored-by: Kangyan-Zhou <zky314343421@gmail.com>
Co-authored-by: Shu Wang <shuw@nvidia.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: ybyang <10629930+whybeyoung@users.noreply.github.com>
Co-authored-by: zhangheng <hzh0425@apache.org>
Co-authored-by: yizhang2077 <1109276519@qq.com>
Co-authored-by: pansicheng <sicheng.pan.chn@gmail.com>
Co-authored-by: Bingxu Chen <Bingxu.Chen@amd.com>
Co-authored-by: yctseng0211 <yctseng@amd.com>
1 parent 294d6ff commit 6c763c1
132 files changed
Lines changed: 7457 additions & 2791 deletions
File tree
- .github
  - workflows
- docs
  - advanced_features
  - get_started
  - references
- python
  - sglang
    - cli
    - jit_kernel
      - tests
    - multimodal_gen
      - configs/sample
      - runtime
        - entrypoints/openai
        - layers
        - loader
        - models/dits
      - test/server
    - srt
      - configs
      - disaggregation/common
      - entrypoints/openai
      - function_call
      - hardware_backend/npu/attention
      - layers
        - attention
        - moe
          - fused_moe_triton/configs/triton_3_5_1
        - quantization
          - compressed_tensors/schemes
      - lora
      - managers
      - mem_cache
        - storage/lmcache
      - model_executor
      - models
        - deepseek_common
      - multimodal/processors
      - speculative
    - test
      - kits
- scripts
  - ci
  - code_sync
- test
  - manual
  - registered
    - amd
    - attention
    - core
    - eval
    - function_call
    - hicache
    - kernels
    - lora
    - openai_server/basic
    - perf
    - quant
    - radix_cache
    - rl
    - rotary
    - utils
  - srt