Commit 1c50a0f

Merge pull request zhaochenyang20#20 from zhaochenyang20/update_doc
Add Acknowledgment To the Team mates
2 parents 9b46df9 + 05c5ccf commit 1c50a0f

1 file changed

Lines changed: 61 additions & 103 deletions

README.md

@@ -1,15 +1,19 @@
1-
# sglang-diffusion-routing
1+
# SGLang Diffusion Router
22

3-
A lightweight router for SGLang diffusion workers.
3+
A lightweight router for SGLang diffusion workers used in RL systems.
44

5-
It provides worker registration, load balancing, health checking, and request proxying for diffusion generation APIs.
5+
It provides worker registration, load balancing, health checking, weight refitting, and request proxying for diffusion generation APIs.
66

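As an illustration of the load balancing and health checking mentioned above, here is a minimal sketch of `least-request` selection with quarantine after repeated failures. The `Worker` class, the `MAX_FAILURES` threshold, and both function names are hypothetical and are not taken from the router's source; the real implementation may differ.

```python
from dataclasses import dataclass

# Hypothetical threshold: the router's actual value is not stated in this README.
MAX_FAILURES = 3


@dataclass
class Worker:
    """Illustrative worker record (not the router's real data structure)."""
    url: str
    active_requests: int = 0
    consecutive_failures: int = 0
    quarantined: bool = False


def record_health_check(worker: Worker, ok: bool) -> None:
    """Update failure counters; quarantine after MAX_FAILURES consecutive failures."""
    if ok:
        # A success resets the counter but does not lift an existing quarantine.
        worker.consecutive_failures = 0
        return
    worker.consecutive_failures += 1
    if worker.consecutive_failures >= MAX_FAILURES:
        worker.quarantined = True


def pick_least_request(workers: list[Worker]) -> Worker:
    """Return the non-quarantined worker with the fewest in-flight requests."""
    healthy = [w for w in workers if not w.quarantined]
    if not healthy:
        raise RuntimeError("no healthy workers available")
    return min(healthy, key=lambda w: w.active_requests)
```

In this sketch a quarantined worker stays excluded even after a later successful check, so it would have to be re-added explicitly.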
7-
## Highlights
7+
## API Reference
88

9-
- `least-request` routing by default, with `round-robin` and `random`.
10-
- Background health checks with quarantine after repeated failures.
11-
- Router APIs for worker registration, health inspection, and proxy forwarding.
12-
- `update_weights_from_disk` broadcast to all healthy workers.
9+
- `POST /add_worker`: add worker via query (`?url=`) or JSON body.
10+
- `GET /list_workers`: list registered workers.
11+
- `GET /health`: aggregated router health.
12+
- `GET /health_workers`: per-worker health and active request counts.
13+
- `POST /generate`: forwards to worker `/v1/images/generations`.
14+
- `POST /generate_video`: forwards to worker `/v1/videos`; rejects image-only workers (`T2I`/`I2I`/`TI2I`) with `400`.
15+
- `POST /update_weights_from_disk`: broadcast to all healthy workers.
16+
- `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
1317

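The endpoints listed above can be exercised from a small client. The sketch below only builds the request URL for `POST /add_worker` with the `?url=` query form and mirrors the documented rule that `/generate_video` rejects image-only workers. The helper names are hypothetical, and how the router learns a worker's task type is an assumption, not something this README specifies.

```python
from urllib.parse import urlencode

# Router address used by the other examples in this README.
ROUTER = "http://localhost:30081"

# Task types the README describes as image-only.
IMAGE_ONLY_TASKS = {"T2I", "I2I", "TI2I"}


def add_worker_url(router: str, worker_url: str) -> str:
    """Build the URL for POST /add_worker, percent-encoding the worker URL."""
    return f"{router}/add_worker?{urlencode({'url': worker_url})}"


def is_rejected_for_video(task_type: str) -> bool:
    """Return True if a worker of this task type would get a 400 on /generate_video.

    Assumption: the task type is known to the caller as a plain string.
    """
    return task_type in IMAGE_ONLY_TASKS
```

A caller would then send the request with, for example, `requests.post(add_worker_url(ROUTER, "http://localhost:30000"))`.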
1418
## Installation
1519

@@ -37,8 +41,6 @@ cd ..
3741

3842
## Quick Start
3943

40-
### Start diffusion workers
41-
4244
```bash
4345
# If connecting to HuggingFace is not allowed
4446
# You can set the environment variable SGLANG_USE_MODELSCOPE=TRUE
@@ -56,75 +58,15 @@ CUDA_VISIBLE_DEVICES=1 sglang serve \
5658
--num-gpus 1 \
5759
--host 127.0.0.1 \
5860
--port 30002
59-
```
60-
61-
### Start the router
6261

63-
1. Script entry
64-
65-
```bash
6662
sglang-d-router --port 30081 \
6763
--worker-urls http://localhost:30000 http://localhost:30002
6864
```
6965

70-
2. Module entry
71-
72-
```bash
73-
python -m sglang_diffusion_routing --port 30081 \
74-
--worker-urls http://localhost:30000 http://localhost:30002
75-
```
76-
77-
3. Or start empty and add workers later:
78-
79-
```bash
80-
sglang-d-router --port 30081
81-
curl -X POST "http://localhost:30081/add_worker?url=http://localhost:30000"
82-
curl -X POST "http://localhost:30081/add_worker?url=http://localhost:30002"
83-
```
84-
85-
### Test the router
66+
## Demonstrative Examples
8667

87-
```bash
88-
# Check router health
89-
curl http://localhost:30081/health
90-
91-
# List registered workers
92-
curl http://localhost:30081/list_workers
93-
94-
# Image generation request (returns base64-encoded image)
95-
curl -X POST http://localhost:30081/generate \
96-
-H "Content-Type: application/json" \
97-
-d '{
98-
"model": "Qwen/Qwen-Image",
99-
"prompt": "a cute cat",
100-
"num_images": 1,
101-
"response_format": "b64_json"
102-
}'
103-
104-
# Decode and save the image locally
105-
curl -s -X POST http://localhost:30081/generate \
106-
-H "Content-Type: application/json" \
107-
-d '{
108-
"model": "Qwen/Qwen-Image",
109-
"prompt": "a cute cat",
110-
"num_images": 1,
111-
"response_format": "b64_json"
112-
}' | python3 -c "
113-
import sys, json, base64
114-
resp = json.load(sys.stdin)
115-
img = base64.b64decode(resp['data'][0]['b64_json'])
116-
with open('output.png', 'wb') as f:
117-
f.write(img)
118-
print('Saved to output.png')
119-
"
12068

121-
122-
curl -X POST http://localhost:30081/update_weights_from_disk \
123-
-H "Content-Type: application/json" \
124-
-d '{"model_path": "Qwen/Qwen-Image-2512"}'
125-
```
126-
127-
### Python requests examples
69+
### With Python Requests
12870

12971
```python
13072
import requests
@@ -166,44 +108,60 @@ print(resp.json())
166108
# Check per-worker health and load
167109
resp = requests.get(f"{ROUTER}/health_workers")
168110
print(resp.json())
111+
112+
# Update weights from disk
113+
resp = requests.post(f"{ROUTER}/update_weights_from_disk", json={
114+
"model_path": "Qwen/Qwen-Image-2512",
115+
})
116+
print(resp.json())
169117
```
170118

171-
## Router API
119+
### With Curl
172120

173-
- `POST /add_worker`: add worker via query (`?url=`) or JSON body.
174-
- `GET /list_workers`: list registered workers.
175-
- `GET /health`: aggregated router health.
176-
- `GET /health_workers`: per-worker health and active request counts.
177-
- `POST /generate`: forwards to worker `/v1/images/generations`.
178-
- `POST /generate_video`: forwards to worker `/v1/videos`; rejects image-only workers (`T2I`/`I2I`/`TI2I`) with `400`.
179-
- `POST /update_weights_from_disk`: broadcast to healthy workers.
180-
- `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
181-
- `POST /update_weights_from_disk`: broadcast to all healthy workers.
121+
```bash
122+
# Check router health
123+
curl http://localhost:30081/health
124+
125+
# List registered workers
126+
curl http://localhost:30081/list_workers
127+
128+
# Image generation request (returns base64-encoded image)
129+
curl -X POST http://localhost:30081/generate \
130+
-H "Content-Type: application/json" \
131+
-d '{
132+
"model": "Qwen/Qwen-Image",
133+
"prompt": "a cute cat",
134+
"num_images": 1,
135+
"response_format": "b64_json"
136+
}'
182137

183-
## Project Layout
184-
185-
```text
186-
.
187-
├── docs/
188-
│ └── update_weights_from_disk.md
189-
├── src/sglang_diffusion_routing/
190-
│ ├── cli/
191-
│ └── router/
192-
├── tests/
193-
│ ├── benchmarks/
194-
│ │ └── diffusion_router/
195-
│ │ ├── bench_router.py
196-
│ │ └── bench_routing_algorithms.py
197-
│ └── unit/
198-
├── pyproject.toml
199-
└── README.md
138+
# Decode and save the image locally
139+
curl -s -X POST http://localhost:30081/generate \
140+
-H "Content-Type: application/json" \
141+
-d '{
142+
"model": "Qwen/Qwen-Image",
143+
"prompt": "a cute cat",
144+
"num_images": 1,
145+
"response_format": "b64_json"
146+
}' | python3 -c "
147+
import sys, json, base64
148+
resp = json.load(sys.stdin)
149+
img = base64.b64decode(resp['data'][0]['b64_json'])
150+
with open('output.png', 'wb') as f:
151+
f.write(img)
152+
print('Saved to output.png')
153+
"
154+
155+
156+
curl -X POST http://localhost:30081/update_weights_from_disk \
157+
-H "Content-Type: application/json" \
158+
-d '{"model_path": "Qwen/Qwen-Image-2512"}'
200159
```
201160

202161
## Acknowledgment
203162

204-
This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors for their work.
163+
This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors.
205164

206-
## Notes
165+
The SGLang Diffusion RL team is responsible for the development and maintenance of this project. Our teammates, in alphabetical order:
207166

208-
- Quarantined workers are intentionally not auto-reintroduced.
209-
- Router responses are fully buffered; streaming passthrough is not implemented.
167+
Banghua Zhu, Chengliang Qian, Chenyang Zhao, Fenglin Yu, Hao Jin, Huapeng Zhou, Jiajun Li, Kangrui Du, Kun Lin, Mao Cheng, Mengyang Liu, Qiujiang Chen, Shenggui Li, Shirui Chen, Shuwen Wang, Xi Chen, Xiaole Guo, Ying Sheng, Yueming Yuan, Yuhao Yang, Yusheng Su, Zhiheng Ye
