[NPU] support mtp(beta) pd disaggregation and dp attention #12443

iforgetmyname · 2025-10-31T07:12:13Z

…raph(npu) & support dsv3_2 mtp

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

- Format your code according to "Format code with pre-commit".
- Add unit tests according to "Run and add unit tests".
- Update documentation according to "Write documentations".
- Provide accuracy and speed benchmark results according to "Test the accuracy" and "Benchmark the speed".


sglang-bot

- Upload a torch profile to show that the overlap actually happens and that there is no CPU overhead.
  - Rule: If you change any logic in the (overlap) scheduler, attach a torch profile to show that the overlap actually happens.
- Compare the speed and acceptance length of overlap vs. non-overlap.
- Get someone to verify on GPU.
- Add a test case.

sglang-bot

cc @ShangmingCai @hnyls2002

sglang-bot · 2025-11-02T02:43:31Z

python/sglang/srt/speculative/eagle_info_v2.py

+            if _is_npu:
+                device = "npu"
+            else:
+                device = "cuda"


Ways to remove the if/else:

- Put all if/else branches in a single place (ideally in the initialization).
- Try to reuse batch.input_ids.device or batch.sampling_info.
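
The two suggestions above can be sketched roughly as follows. This is an illustration only, not the code in the PR: `resolve_device`, `FakeTensor`, `FakeBatch`, and `device_for_new_tensors` are hypothetical stand-ins so that the snippet runs without torch or NPU hardware. Only the `_is_npu` flag and `batch.input_ids.device` come from the review itself.

```python
from dataclasses import dataclass


def resolve_device(is_npu: bool) -> str:
    """Option 1: decide the device string once, e.g. during initialization."""
    return "npu" if is_npu else "cuda"


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor; it only carries a device attribute."""
    device: str


@dataclass
class FakeBatch:
    """Hypothetical stand-in for the batch object mentioned in the review."""
    input_ids: FakeTensor


def device_for_new_tensors(batch: FakeBatch) -> str:
    """Option 2: reuse the device of a tensor the batch already holds."""
    return batch.input_ids.device


# Option 1: computed once and stored, so later call sites need no if/else.
DEVICE = resolve_device(is_npu=False)

# Option 2: no platform flag at the call site; the new tensor follows the batch.
batch = FakeBatch(input_ids=FakeTensor(device="npu"))
new_tensor_device = device_for_new_tensors(batch)
```

With real tensors, the second option would presumably amount to passing `device=batch.input_ids.device` when allocating, which keeps the platform check out of the speculative-decoding code path entirely.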

sglang-bot · 2025-11-02T02:47:18Z

python/sglang/srt/speculative/eagle_worker.py

+            self.cuda_graph_runner_for_draft_extend = (
+                EAGLEDraftExtendCudaGraphRunner(self)
+                if not _is_npu
+                else EAGLEDraftExtendNpuGraphRunner(self)


Device2ExtendCudaGraphRunner = {
    "npu": EAGLEDraftExtendNpuGraphRunner,
    "cuda": EAGLEDraftExtendCudaGraphRunner,
}
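
A minimal sketch of how such a lookup table might replace the conditional expression in the diff. The two runner classes here are empty stubs named after the real ones, and `WorkerSketch` with its `device` argument is a hypothetical stand-in for the EAGLE worker. How the actual worker obtains its device string is an assumption.

```python
class EAGLEDraftExtendCudaGraphRunner:
    """Stub standing in for the real CUDA graph runner (assumption)."""

    def __init__(self, worker):
        self.worker = worker


class EAGLEDraftExtendNpuGraphRunner:
    """Stub standing in for the real NPU graph runner (assumption)."""

    def __init__(self, worker):
        self.worker = worker


# Mapping suggested in the review: device name -> runner class.
Device2ExtendCudaGraphRunner = {
    "npu": EAGLEDraftExtendNpuGraphRunner,
    "cuda": EAGLEDraftExtendCudaGraphRunner,
}


class WorkerSketch:
    """Hypothetical worker that picks its runner by dictionary lookup."""

    def __init__(self, device: str):
        self.device = device
        # A KeyError here signals an unsupported device instead of silently
        # falling through to a default branch.
        runner_cls = Device2ExtendCudaGraphRunner[device]
        self.cuda_graph_runner_for_draft_extend = runner_cls(self)


npu_worker = WorkerSketch("npu")
cuda_worker = WorkerSketch("cuda")
```

One consequence of this design is that supporting another backend means adding one dictionary entry rather than another branch at each call site.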

support mtp(beta) pd disaggregation and dp attention & draft extend g…

d43c86c

…raph(npu) & support dsv3_2 mtp

sglang-bot added the run-ci label Oct 31, 2025

Merge branch 'main' into feature/mtp_1

88ed73e

sglang-bot requested changes Nov 2, 2025


sglang-bot reviewed Nov 2, 2025


sglang-bot requested changes Nov 2, 2025


fix comments

58f2de8

liupeng374 force-pushed the feature/mtp_1 branch from c9f6e76 to 58f2de8 on November 3, 2025 02:50

Merge branch 'main' into feature/mtp_1

6e0146d

iforgetmyname marked this pull request as ready for review November 3, 2025 10:12

iforgetmyname requested review from BBuf, ByronHsu, Edwardf0t1, HaiShaw, Ying1123, ch-wan, hnyls2002, ispobock, kssteven418, kushanam, merrymercy, ping1jing2, xiezhq-hermann and zhyncs as code owners November 3, 2025 10:12

liupeng374 added 3 commits November 3, 2025 18:12

Merge branch 'main' into feature/mtp_1

b1caf87

Merge branch 'main' into feature/mtp_1

4c31f68

fix comments

a92651c

liupeng374 force-pushed the feature/mtp_1 branch from a4834dd to a92651c on November 4, 2025 03:21

liupeng374 added 2 commits November 4, 2025 14:37

Merge branch 'main' into feature/mtp_1

a429ce0

Merge branch 'main' into feature/mtp_1

29272b5

Merge branch 'main' into feature/mtp_1

e59c397
