[Release] bump v0.4.7 #7100
zhyncs announced in Announcements
            Replies: 1 comment 1 reply
"SGLang's DeepSeek V3/R1 now achieves 190 TPS on single H200, outperforming other frameworks by over 50%." I haven't been able to determine what this means. Is it:
Highlights
The PD disaggregation and large-scale EP functionality previously described in the blog post has now been fully merged into the latest release.
The blog results have been successfully reproduced by over six industry teams, including the TensorRT LLM team; see Instruction for Running DeepSeek with Large-scale PD and EP #6017.
SGLang’s large-scale EP is now actively used by leading organizations such as Cursor, Qwen, Alimama, Alibaba Cloud, iFlytek, and more. It has been deployed and validated at large scale, running on GPU clusters with thousands of devices.
PD disaggregation and large-scale EP, in addition to supporting DeepSeek V3/R1, now also support Qwen 3 in the latest release.
Full Blackwell support for DeepSeek V3/R1, Llama 4, and Qwen 3. Further optimizations are underway.
SGLang's DeepSeek V3/R1 now achieves 190 TPS on a single H200, outperforming other frameworks by over 50% (a rough way to check throughput locally is sketched after this list).
We extend our sincere thanks to the following contributors, listed in alphabetical order: Alibaba Cloud, AMD Team, Ant Group, Baseten Team, Cursor Team, Dynamo Team, EAGLE Team, FlashInfer Team, Google Vertex AI Team, iFlytek MaaS Team, Intel Team, LinkedIn Team, Meituan Team, Microsoft Copilot Team, Mooncake Team, NVIDIA Team, Oracle Team, Qwen Team, Voltage Park Team and open source community users. Your support and collaboration are deeply appreciated!
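For readers who want to sanity-check throughput on their own hardware, below is a minimal sketch that measures end-to-end output tokens per second against a locally running SGLang server via its OpenAI-compatible endpoint. The model path, port 30000, prompt, and launch command in the comment are illustrative assumptions, not the configuration behind the 190 TPS figure.

```python
# Rough throughput check (assumptions throughout, not the official benchmark setup):
# assumes an SGLang server is already running locally with its OpenAI-compatible
# endpoint on port 30000, e.g. launched with something like:
#   python -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 --tp 8
# Model name, port, prompt, and max_tokens are placeholders.
import time

from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

start = time.perf_counter()
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[
        {
            "role": "user",
            "content": "Summarize the benefits of PD disaggregation in LLM serving.",
        }
    ],
    max_tokens=512,
    temperature=0,
)
elapsed = time.perf_counter() - start

# Use the server-reported token count rather than re-tokenizing locally.
out_tokens = response.usage.completion_tokens
print(
    f"{out_tokens} output tokens in {elapsed:.2f}s "
    f"-> {out_tokens / elapsed:.1f} tok/s end-to-end (includes prefill)"
)
```

Note that an end-to-end rate over a single request includes prefill time and scheduling overhead, so it will read lower than a steady-state decode throughput number and is not directly comparable to the headline figure.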