
Add sleep/wake support for diffusion engine #22659

Draft
MikukuOvO wants to merge 66 commits into sgl-project:main from MikukuOvO:dev/pr-19152-sleep-wake-rebased


@MikukuOvO
Contributor

Motivation

This PR is inherited from #19152 and rebased on top of it.

It adds coarse-grained sleep/wake support for the multimodal diffusion engine so the server can temporarily release GPU memory without restarting the process.

Modifications

  • Add /release_memory_occupation and /resume_memory_occupation for diffusion.
  • Add scheduler/worker support to move active pipeline modules between GPU and CPU.
  • Reject generation requests while the server is sleeping.
  • Add tests and documentation for the sleep/wake flow.
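
The flow in the list above can be pictured as a small state machine. The following is a minimal sketch under stated assumptions, not the PR's implementation: `ToyDiffusionServer`, its fields, and the use of plain device strings in place of real pipeline modules are hypothetical. Only the two endpoint names and the rejection of generation while sleeping come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDiffusionServer:
    """Hypothetical stand-in for the diffusion server; not SGLang code."""

    # Maps a module name to the device it lives on. A real worker would hold
    # actual pipeline modules; strings keep this sketch dependency-free.
    modules: dict = field(
        default_factory=lambda: {"transformer": "cuda", "vae": "cuda"}
    )
    sleeping: bool = False

    def release_memory_occupation(self) -> int:
        """Sleep: move every active module to CPU and mark the server asleep."""
        for name in self.modules:
            self.modules[name] = "cpu"
        self.sleeping = True
        return 200

    def resume_memory_occupation(self) -> int:
        """Wake: move every module back to GPU and accept requests again."""
        for name in self.modules:
            self.modules[name] = "cuda"
        self.sleeping = False
        return 200

    def generate(self, prompt: str) -> int:
        """Return an HTTP-like status: 400 while sleeping, 200 otherwise."""
        if self.sleeping:
            return 400
        return 200
```

The point of the sketch is the ordering: the sleeping flag is checked before any generation work starts, so a request never touches modules that have been moved off the GPU.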

Accuracy Tests

This PR does not change model math or kernel behavior.

Functional checks were run for:

  • Qwen/Qwen-Image on 1 GPU
  • Qwen/Qwen-Image with TP=2
  • Tongyi-MAI/Z-Image
  • Tongyi-MAI/Z-Image-Turbo

Verified behavior:

  • sleep releases GPU memory
  • generation while sleeping returns 400
  • wake restores service and generation works again
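
The status-code part of these checks could be scripted roughly as follows. This is a sketch, not the test suite added by the PR. It assumes both endpoints accept a POST with an empty JSON body, and the base URL, generation path, and request payload are supplied by the caller because the description does not specify them. `http_post` and `check_sleep_wake` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Dict


def http_post(url: str, payload: dict) -> int:
    """POST a JSON body and return the HTTP status code, including 4xx/5xx."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoints are POST routes
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want to record.
        return err.code


def check_sleep_wake(
    post: Callable[[str, dict], int],
    base_url: str,
    generate_path: str,  # assumption: caller knows the generation route
    payload: dict,       # assumption: caller knows a valid request body
) -> Dict[str, int]:
    """Run sleep -> generate -> wake -> generate and collect status codes."""
    statuses = {}
    statuses["sleep"] = post(base_url + "/release_memory_occupation", {})
    statuses["generate_while_sleeping"] = post(base_url + generate_path, payload)
    statuses["wake"] = post(base_url + "/resume_memory_occupation", {})
    statuses["generate_after_wake"] = post(base_url + generate_path, payload)
    return statuses
```

Calling `check_sleep_wake(http_post, base_url, generate_path, payload)` against a running server should, per the behavior listed above, report 400 for `generate_while_sleeping`. The transport is passed in as a parameter so the sequence can also be exercised against a fake. Whether GPU memory was actually released has to be confirmed separately, since status codes alone do not show it.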

Speed Tests and Profiling

This PR is not intended as an inference speed optimization.

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the [PR Merge Process](https://github.com/sgl-project/sglang/blob/main/.github/MAINTAINER.md#pull-request-merge-process).
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions bot added the documentation (Improvements or additions to documentation) and diffusion (SGLang Diffusion) labels Apr 13, 2026
@MikukuOvO MikukuOvO marked this pull request as draft April 13, 2026 05:07


4 participants