Commit bcc37e2

Feature: Init diffusion router with basic end points
2 parents e44c1c1 + f02fc8a commit bcc37e2

14 files changed

Lines changed: 2108 additions & 8 deletions

File tree

README.md

Lines changed: 205 additions & 8 deletions
@@ -1,13 +1,210 @@
 # sglang-diffusion-routing

-A demonstrative example of running SGLang Diffusion with a DP router, which supports `generation` (a lot of methods, including [SDE/CPS](https://github.com/sgl-project/sglang/pull/18806)), `update_weights_from_disk` in PR [18306](https://github.com/sgl-project/sglang/pull/18306), and `health_check`.
+A lightweight router for SGLang diffusion workers.

-1. Copy all the codes of https://github.com/radixark/miles/pull/544 to here with sincere acknowledgment.
-2. Write up a detailed README on how to use SGLang Diffusion Router to launch multiple instances and send requests.
+It provides worker registration, load balancing, health checking, and request proxying for diffusion generation APIs.

-For example, given that we can make a Python binding of the sglang-d router:
+## Highlights

-1. pip install sglang-d-router (Only for local development right now, clone the repository and run `pip install .` from the root directory. No need to make a PyPi)
-2. pip install "sglang[diffusion]"
-3. launching command (how to use sglang-d-router to launch n sglang diffusion servers)
-4. Sending demonstrative requests
+- `least-request` routing by default, with `round-robin` and `random`.
+- Background health checks with quarantine after repeated failures.
+- Router APIs for worker registration, health inspection, and proxy forwarding.
+- `update_weights_from_disk` broadcast to all healthy workers.
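
The three routing policies named in the highlights can be illustrated with a minimal Python sketch. This is not the router's actual code: the function names and the `active` mapping of worker URL to in-flight request count are hypothetical, chosen only to show how each policy would pick a worker.

```python
import itertools
import random
from typing import Callable, Optional


def pick_least_request(active: dict[str, int]) -> str:
    """Return the worker URL with the fewest in-flight requests (hypothetical helper)."""
    # On a tie, min() keeps the first worker in the dict's insertion order.
    return min(active, key=active.get)


def make_round_robin(workers: list[str]) -> Callable[[], str]:
    """Return a picker that cycles through the workers in order (hypothetical helper)."""
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)


def pick_random(workers: list[str], rng: Optional[random.Random] = None) -> str:
    """Return a uniformly random worker; pass a seeded rng for reproducible tests."""
    return (rng or random).choice(workers)


active = {"http://localhost:30000": 2, "http://localhost:30001": 0}
print(pick_least_request(active))  # prints http://localhost:30001
```

`least-request` is a natural default for diffusion workloads, where a single generation can occupy a GPU for a long time and request counts are a reasonable proxy for load.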

## Installation

From the repository root:

```bash
python3 -m venv .venv
. .venv/bin/activate
pip install .
```

Development install:

```bash
pip install -e .
```

Run tests:

```bash
pip install pytest
pytest tests/unit -v
```

Workers require SGLang diffusion support:

```bash
pip install "sglang[diffusion]"
```

## Quick Start

### 1) Start diffusion workers

```bash
# worker 1
CUDA_VISIBLE_DEVICES=0 sglang serve \
    --model-path stabilityai/stable-diffusion-3-medium-diffusers \
    --num-gpus 1 \
    --host 127.0.0.1 \
    --port 30000

# worker 2
CUDA_VISIBLE_DEVICES=1 sglang serve \
    --model-path stabilityai/stable-diffusion-3-medium-diffusers \
    --num-gpus 1 \
    --host 127.0.0.1 \
    --port 30001
```

### 2) Start the router

Script entry:

```bash
sglang-d-router --port 30080 \
    --worker-urls http://localhost:30000 http://localhost:30001
```

Module entry:

```bash
python -m sglang_diffusion_routing --port 30080 \
    --worker-urls http://localhost:30000 http://localhost:30001
```

Or start empty and add workers later:

```bash
sglang-d-router --port 30080
curl -X POST "http://localhost:30080/add_worker?url=http://localhost:30000"
```
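
When workers are registered from a script rather than by hand, it can be safer to percent-encode the worker URL in the query string. The sketch below only builds the request URL; `add_worker_url` is a hypothetical helper, and it assumes the router decodes query parameters in the usual way.

```python
from urllib.parse import urlencode


def add_worker_url(router_base: str, worker_url: str) -> str:
    """Build the /add_worker URL with the worker address safely encoded (hypothetical helper)."""
    query = urlencode({"url": worker_url})
    return f"{router_base.rstrip('/')}/add_worker?{query}"


print(add_worker_url("http://localhost:30080", "http://localhost:30000"))
# prints http://localhost:30080/add_worker?url=http%3A%2F%2Flocalhost%3A30000
```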

### 3) Test the router

```bash
# Check router health
curl http://localhost:30080/health

# List registered workers
curl http://localhost:30080/list_workers

# Image generation request (SD3)
curl -X POST http://localhost:30080/generate \
    -H "Content-Type: application/json" \
    -d '{
        "model": "stabilityai/stable-diffusion-3-medium-diffusers",
        "prompt": "a cute cat",
        "num_images": 1
    }'

# Video generation request
curl -X POST http://localhost:30080/generate_video \
    -H "Content-Type: application/json" \
    -d '{
        "model": "stabilityai/stable-video-diffusion",
        "prompt": "a flowing river"
    }'

# Check per-worker health and load
curl http://localhost:30080/health_workers
```
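
The image generation request above can also be sent from Python using only the standard library. This is a minimal sketch: `build_generate_request` and `send` are hypothetical helper names, the payload mirrors the curl example, and the assumption that the router replies with a JSON body is not confirmed by the source.

```python
import json
import urllib.request


def build_generate_request(
    router_base: str, model: str, prompt: str, num_images: int = 1
) -> urllib.request.Request:
    """Build (but do not send) a POST /generate request mirroring the curl example."""
    payload = {"model": model, "prompt": prompt, "num_images": num_images}
    return urllib.request.Request(
        url=f"{router_base.rstrip('/')}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> dict:
    """Send the request and decode the reply, assuming the router answers with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_generate_request(
    "http://localhost:30080",
    "stabilityai/stable-diffusion-3-medium-diffusers",
    "a cute cat",
)
# With the router and workers running: result = send(request)
```

The generous timeout reflects that router responses are fully buffered, so the client sees nothing until generation has finished.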

## Router API

- `POST /add_worker`: register a worker via a query parameter (`?url=`) or a JSON body.
- `GET /list_workers`: list registered workers.
- `GET /health`: aggregated router health.
- `GET /health_workers`: per-worker health and active request counts.
- `POST /generate`: forwards to worker `/v1/images/generations`.
- `POST /generate_video`: forwards to worker `/v1/videos`.
- `POST /update_weights_from_disk`: broadcast to healthy workers.
- `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
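
The forwarding rules in this list amount to a small path-translation table. The sketch below is illustrative only: `ROUTE_MAP` and `worker_target` are hypothetical names, and it assumes the catch-all route passes the path through unchanged, which the list above does not spell out.

```python
# Router path -> worker path, taken from the endpoint list above.
ROUTE_MAP = {
    "/generate": "/v1/images/generations",
    "/generate_video": "/v1/videos",
}


def worker_target(worker_url: str, router_path: str) -> str:
    """Return the worker URL a router path would be forwarded to (hypothetical helper)."""
    # Assumption: paths without a mapping are proxied as-is by the catch-all route.
    path = ROUTE_MAP.get(router_path, router_path)
    return worker_url.rstrip("/") + path


print(worker_target("http://localhost:30000", "/generate"))
# prints http://localhost:30000/v1/images/generations
```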

## `update_weights_from_disk` behavior

Full details: [docs/update_weights_from_disk.md](docs/update_weights_from_disk.md)

- The router forwards request payloads as-is to each healthy worker.
- The router does not validate payload schema; payload semantics are worker-defined.
- Worker servers must implement `POST /update_weights_from_disk`.

Example:

```bash
curl -X POST http://localhost:30080/update_weights_from_disk \
    -H "Content-Type: application/json" \
    -d '{"model_path": "/path/to/new/weights"}'
```

Response shape:

```json
{
    "results": [
        {
            "worker_url": "http://localhost:30000",
            "status_code": 200,
            "body": {
                "ok": true
            }
        }
    ]
}
```
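
Because the broadcast returns one entry per worker, a caller usually wants to know which workers did not accept the new weights. A minimal sketch, assuming the response shape shown above; `failed_workers` is a hypothetical helper, not part of the package.

```python
def failed_workers(response: dict) -> list[str]:
    """Return the URLs of workers whose update did not return a 2xx status."""
    return [
        item["worker_url"]
        for item in response.get("results", [])
        if not 200 <= item["status_code"] < 300
    ]


response = {
    "results": [
        {"worker_url": "http://localhost:30000", "status_code": 200, "body": {"ok": True}},
        {"worker_url": "http://localhost:30001", "status_code": 500, "body": {"error": "worker-side failure"}},
    ]
}
print(failed_workers(response))  # prints ['http://localhost:30001']
```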

## Benchmark Scripts

Benchmark scripts are available under `tests/benchmarks/diffusion_router/` and are intended for manual runs.
They are not part of the default unit test collection (`pytest tests/unit -v`).

Single benchmark:

```bash
python tests/benchmarks/diffusion_router/bench_router.py \
    --model Wan-AI/Wan2.2-T2V-A14B-Diffusers \
    --num-workers 2 \
    --num-prompts 20 \
    --max-concurrency 4
```

Algorithm comparison:

```bash
python tests/benchmarks/diffusion_router/bench_routing_algorithms.py \
    --model Wan-AI/Wan2.2-T2V-A14B-Diffusers \
    --num-workers 2 \
    --num-prompts 20 \
    --max-concurrency 4
```

## Project Layout

```text
.
├── docs/
│   └── update_weights_from_disk.md
├── src/sglang_diffusion_routing/
│   ├── cli/
│   └── router/
├── tests/
│   ├── benchmarks/
│   │   └── diffusion_router/
│   │       ├── bench_router.py
│   │       └── bench_routing_algorithms.py
│   └── unit/
├── pyproject.toml
└── README.md
```

## Acknowledgment

This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors for their work.

## Notes

- Quarantined workers are intentionally not auto-reintroduced.
- Router responses are fully buffered; streaming passthrough is not implemented.
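
The quarantine rule in the first note can be sketched as a small state tracker. This is an assumption-laden illustration, not the router's implementation: the `WorkerHealth` class, the threshold of 3, and the choice to count consecutive (rather than total) failures are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerHealth:
    """Track health-check failures per worker and quarantine repeat offenders."""

    failure_threshold: int = 3  # hypothetical value; the real threshold is not stated
    consecutive_failures: dict[str, int] = field(default_factory=dict)
    quarantined: set[str] = field(default_factory=set)

    def record(self, worker_url: str, healthy: bool) -> None:
        """Record one health-check result for a worker."""
        if worker_url in self.quarantined:
            return  # quarantined workers are never reintroduced automatically
        if healthy:
            self.consecutive_failures[worker_url] = 0
            return
        count = self.consecutive_failures.get(worker_url, 0) + 1
        self.consecutive_failures[worker_url] = count
        if count >= self.failure_threshold:
            self.quarantined.add(worker_url)


health = WorkerHealth(failure_threshold=2)
health.record("http://localhost:30001", healthy=False)
health.record("http://localhost:30001", healthy=False)
health.record("http://localhost:30001", healthy=True)  # ignored: already quarantined
print(sorted(health.quarantined))  # prints ['http://localhost:30001']
```

Under this model, bringing a quarantined worker back requires an explicit operator action, such as restarting the router or registering the worker again.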

docs/update_weights_from_disk.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
# update_weights_from_disk

This document describes `POST /update_weights_from_disk` behavior in this repository.

## Router behavior

The router does not validate or transform payload fields.
It forwards the original request body to every healthy worker and returns per-worker results.

Payload semantics are therefore defined by the worker implementation, not by the router.

## Requirements

- Worker servers must implement `POST /update_weights_from_disk`.
- For SGLang workers, use a version that includes this endpoint.
- The weights on disk must match what your worker runtime expects.

## Basic example

```bash
curl -X POST http://localhost:30080/update_weights_from_disk \
    -H "Content-Type: application/json" \
    -d '{"model_path": "/path/to/new/weights"}'
```

## Optional fields

Some worker versions support optional fields such as `target_modules`:

```bash
curl -X POST http://localhost:30080/update_weights_from_disk \
    -H "Content-Type: application/json" \
    -d '{"model_path": "/path/to/weights", "target_modules": ["transformer", "vae"]}'
```

If your worker version does not support these extra fields, the worker returns the failure, and the router reports it in that worker's result.

## Response shape

The router response includes one item per healthy worker:

```json
{
    "results": [
        {
            "worker_url": "http://localhost:10090",
            "status_code": 200,
            "body": {
                "ok": true
            }
        },
        {
            "worker_url": "http://localhost:10092",
            "status_code": 500,
            "body": {
                "error": "worker-side failure"
            }
        }
    ]
}
```

Notes:
- Quarantined workers are excluded from broadcast.
- Transport/runtime exceptions are surfaced as per-worker `status_code=502`.
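
The broadcast behavior described in this document can be sketched as follows. This is a simplified, sequential illustration under stated assumptions: `broadcast` and the injected `post` callable are hypothetical, the real router may send requests concurrently, and the `{"error": ...}` body used for the 502 case is a guess at the shape rather than something the source specifies.

```python
from typing import Callable


def broadcast(
    healthy_workers: list[str],
    payload: bytes,
    post: Callable[[str, bytes], tuple[int, dict]],
) -> dict:
    """Forward the raw payload to each healthy worker and collect per-worker results."""
    results = []
    for worker_url in healthy_workers:  # quarantined workers are assumed to be filtered out already
        try:
            status_code, body = post(f"{worker_url}/update_weights_from_disk", payload)
        except Exception as exc:
            # Transport/runtime failures are reported as a per-worker 502.
            status_code, body = 502, {"error": str(exc)}
        results.append({"worker_url": worker_url, "status_code": status_code, "body": body})
    return {"results": results}


def fake_post(url: str, payload: bytes) -> tuple[int, dict]:
    """Stand-in transport for the example: the second worker is unreachable."""
    if "30001" in url:
        raise ConnectionError("connection refused")
    return 200, {"ok": True}


out = broadcast(["http://localhost:30000", "http://localhost:30001"], b"{}", fake_post)
print([item["status_code"] for item in out["results"]])  # prints [200, 502]
```

Catching the exception per worker means one unreachable worker does not prevent the remaining workers from receiving the update.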

pyproject.toml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
[build-system]
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "sglang-diffusion-routing"
version = "0.1.0"
description = "Load-balancing router for SGLang diffusion workers"
readme = "README.md"
requires-python = ">=3.10"
license = { text = "MIT" }
dependencies = [
    "fastapi>=0.110",
    "httpx>=0.27",
    "uvicorn>=0.30",
]
classifiers = [
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Intended Audience :: Developers",
]

[project.scripts]
sglang-d-router = "sglang_diffusion_routing.cli.main:main"

[tool.setuptools]
package-dir = { "" = "src" }

[tool.setuptools.packages.find]
where = ["src"]

[tool.pytest.ini_options]
testpaths = ["tests/unit"]
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
"""Public package API for sglang diffusion routing."""

from sglang_diffusion_routing.router.diffusion_router import DiffusionRouter

__all__ = ["DiffusionRouter"]
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
from sglang_diffusion_routing.cli.main import main

if __name__ == "__main__":
    raise SystemExit(main())
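
The `raise SystemExit(main())` pattern works when `main` returns an integer exit code. The actual `cli/main.py` is not shown in this part of the diff, so the following is only a sketch of what such an entry point could look like. The `--port` and `--worker-urls` flags come from the README, while `parse_args`, the default port, and the placeholder body are assumptions.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the two flags used in the README examples."""
    parser = argparse.ArgumentParser(prog="sglang-d-router")
    # Assumption: 30080 as the default; the README always passes --port explicitly.
    parser.add_argument("--port", type=int, default=30080)
    # Zero or more worker URLs, so the router can start empty.
    parser.add_argument("--worker-urls", nargs="*", default=[])
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Return a process exit code so callers can write `raise SystemExit(main())`."""
    args = parse_args(argv)
    # Placeholder for building the router and serving it (for example with uvicorn).
    print(f"would start router on port {args.port} with {len(args.worker_urls)} worker(s)")
    return 0
```

Returning the exit code instead of calling `sys.exit` inside `main` keeps the function easy to call from tests, and the same `main` can then back both the `sglang-d-router` script and `python -m sglang_diffusion_routing`.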
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
"""CLI package for sglang diffusion routing."""
