1- # sglang-diffusion-routing
1+ # SGLang Diffusion Router
22
3- A lightweight router for SGLang diffusion workers.
3+ A lightweight router for SGLang diffusion workers used in RL systems.
44
5- It provides worker registration, load balancing, health checking, and request proxying for diffusion generation APIs.
5+ It provides worker registration, load balancing, health checking, weight refitting, and request proxying for diffusion generation APIs.
66
7- ## Highlights
7+ ## API Reference
88
9- - ` least-request ` routing by default, with ` round-robin ` and ` random ` .
10- - Background health checks with quarantine after repeated failures.
11- - Router APIs for worker registration, health inspection, and proxy forwarding.
12- - ` update_weights_from_disk ` broadcast to all healthy workers.
9+ - `POST /add_worker`: add worker via query (`?url=`) or JSON body.
10+ - `GET /list_workers`: list registered workers.
11+ - `GET /health`: aggregated router health.
12+ - `GET /health_workers`: per-worker health and active request counts.
13+ - `POST /generate`: forwards to worker `/v1/images/generations`.
14+ - `POST /generate_video`: forwards to worker `/v1/videos`; rejects image-only workers (`T2I`/`I2I`/`TI2I`) with `400`.
15+ - `POST /update_weights_from_disk`: broadcast to all healthy workers.
16+ - `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
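
The `/generate_video` rule above can be illustrated with a small sketch. This is not the router's actual code: the function name, the worker record fields (`url`, `task_type`, `active_requests`), and the `T2V` label are all hypothetical, and it assumes the caller passes only workers that already passed health checks. It shows one plausible reading of the rule: image-only workers are skipped, and the request gets a `400` if no video-capable worker remains.

```python
# Illustrative sketch only: names and record fields are hypothetical,
# not taken from the router's source.
IMAGE_ONLY_TASKS = {"T2I", "I2I", "TI2I"}  # task types named in the API reference


def pick_video_worker(healthy_workers):
    """Return (status_code, body) for a hypothetical /generate_video decision."""
    # Skip workers that can only produce images.
    eligible = [
        w for w in healthy_workers
        if w["task_type"] not in IMAGE_ONLY_TASKS
    ]
    if not eligible:
        return 400, {"error": "no video-capable worker available"}
    # Fewest in-flight requests wins; ties go to the earliest listed worker.
    chosen = min(eligible, key=lambda w: w["active_requests"])
    return 200, {"worker_url": chosen["url"]}


if __name__ == "__main__":
    workers = [
        {"url": "http://localhost:30000", "task_type": "T2I", "active_requests": 0},
        {"url": "http://localhost:30002", "task_type": "T2V", "active_requests": 3},
    ]
    print(pick_video_worker(workers))
```

Choosing the worker with the fewest active requests mirrors the counts exposed by `GET /health_workers`; whether the real router applies the same policy to video requests is an assumption here.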
1317
1418## Installation
1519
3741
3842## Quick Start
3943
40- ### Start diffusion workers
41-
4244``` bash
4345# If connecting to HuggingFace is not allowed,
4446# you can set the environment variable SGLANG_USE_MODELSCOPE=TRUE
@@ -56,75 +58,15 @@ CUDA_VISIBLE_DEVICES=1 sglang serve \
5658 --num-gpus 1 \
5759 --host 127.0.0.1 \
5860 --port 30002
59- ```
60-
61- ### Start the router
6261
63- 1 . Script entry
64-
65- ``` bash
6662sglang-d-router --port 30081 \
6763 --worker-urls http://localhost:30000 http://localhost:30002
6864```
6965
70- 2 . Module entry
71-
72- ``` bash
73- python -m sglang_diffusion_routing --port 30081 \
74- --worker-urls http://localhost:30000 http://localhost:30002
75- ```
76-
77- 3 . Or start empty and add workers later:
78-
79- ``` bash
80- sglang-d-router --port 30081
81- curl -X POST " http://localhost:30081/add_worker?url=http://localhost:30000"
82- curl -X POST " http://localhost:30081/add_worker?url=http://localhost:30002"
83- ```
84-
85- ### Test the router
66+ ## Demonstrative Examples
8667
87- ``` bash
88- # Check router health
89- curl http://localhost:30081/health
90-
91- # List registered workers
92- curl http://localhost:30081/list_workers
93-
94- # Image generation request (returns base64-encoded image)
95- curl -X POST http://localhost:30081/generate \
96- -H " Content-Type: application/json" \
97- -d ' {
98- "model": "Qwen/Qwen-Image",
99- "prompt": "a cute cat",
100- "num_images": 1,
101- "response_format": "b64_json"
102- }'
103-
104- # Decode and save the image locally
105- curl -s -X POST http://localhost:30081/generate \
106- -H " Content-Type: application/json" \
107- -d ' {
108- "model": "Qwen/Qwen-Image",
109- "prompt": "a cute cat",
110- "num_images": 1,
111- "response_format": "b64_json"
112- }' | python3 -c "
113- import sys, json, base64
114- resp = json.load(sys.stdin)
115- img = base64.b64decode(resp['data'][0]['b64_json'])
116- with open('output.png', 'wb') as f:
117- f.write(img)
118- print('Saved to output.png')
119- "
12068
121-
122- curl -X POST http://localhost:30081/update_weights_from_disk \
123- -H " Content-Type: application/json" \
124- -d ' {"model_path": "Qwen/Qwen-Image-2512"}'
125- ```
126-
127- ### Python requests examples
69+ ### With Python Requests
12870
12971``` python
13072import requests
@@ -166,44 +108,60 @@ print(resp.json())
166108# Check per-worker health and load
167109resp = requests.get(f"{ROUTER}/health_workers")
168110print(resp.json())
111+
112+ # Update weights from disk
113+ resp = requests.post(f"{ROUTER}/update_weights_from_disk", json={
114+     "model_path": "Qwen/Qwen-Image-2512",
115+ })
116+ print(resp.json())
169117```
170118
171- ## Router API
119+ ### With Curl
172120
173- - ` POST /add_worker ` : add worker via query (` ?url= ` ) or JSON body.
174- - ` GET /list_workers ` : list registered workers.
175- - ` GET /health ` : aggregated router health.
176- - ` GET /health_workers ` : per-worker health and active request counts.
177- - ` POST /generate ` : forwards to worker ` /v1/images/generations ` .
178- - ` POST /generate_video ` : forwards to worker ` /v1/videos ` ; rejects image-only workers (` T2I ` /` I2I ` /` TI2I ` ) with ` 400 ` .
179- - ` POST /update_weights_from_disk ` : broadcast to healthy workers.
180- - ` GET|POST|PUT|DELETE /{path} ` : catch-all proxy forwarding.
181- - ` POST /update_weights_from_disk ` : broadcast to all healthy workers.
121+ ``` bash
122+ # Check router health
123+ curl http://localhost:30081/health
124+
125+ # List registered workers
126+ curl http://localhost:30081/list_workers
127+
128+ # Image generation request (returns base64-encoded image)
129+ curl -X POST http://localhost:30081/generate \
130+ -H "Content-Type: application/json" \
131+ -d '{
132+ "model": "Qwen/Qwen-Image",
133+ "prompt": "a cute cat",
134+ "num_images": 1,
135+ "response_format": "b64_json"
136+ }'
182137
183- ## Project Layout
184-
185- ``` text
186- .
187- ├── docs/
188- │ └── update_weights_from_disk.md
189- ├── src/sglang_diffusion_routing/
190- │ ├── cli/
191- │ └── router/
192- ├── tests/
193- │ ├── benchmarks/
194- │ │ └── diffusion_router/
195- │ │ ├── bench_router.py
196- │ │ └── bench_routing_algorithms.py
197- │ └── unit/
198- ├── pyproject.toml
199- └── README.md
138+ # Decode and save the image locally
139+ curl -s -X POST http://localhost:30081/generate \
140+ -H "Content-Type: application/json" \
141+ -d '{
142+ "model": "Qwen/Qwen-Image",
143+ "prompt": "a cute cat",
144+ "num_images": 1,
145+ "response_format": "b64_json"
146+ }' | python3 -c "
147+ import sys, json, base64
148+ resp = json.load(sys.stdin)
149+ img = base64.b64decode(resp['data'][0]['b64_json'])
150+ with open('output.png', 'wb') as f:
151+     f.write(img)
152+ print('Saved to output.png')
153+ "
154+
155+ # Broadcast a weight update to all healthy workers
156+ curl -X POST http://localhost:30081/update_weights_from_disk \
157+ -H "Content-Type: application/json" \
158+ -d '{"model_path": "Qwen/Qwen-Image-2512"}'
200159```
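
The last request above asks the router to broadcast a weight update to all healthy workers. A rough sketch of what such a broadcast could look like follows. It is not the router's implementation: the function name, the worker record fields (`url`, `healthy`), the injected `post` callable, and the assumption that each worker exposes the same `/update_weights_from_disk` path are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast_update_weights(workers, payload, post):
    """Send `payload` to every healthy worker and collect per-worker results.

    `post(url, payload)` is a caller-supplied function (for example a thin
    wrapper around an HTTP client) that returns the decoded response or raises.
    """
    # Unhealthy workers are skipped entirely.
    healthy_urls = [w["url"] for w in workers if w["healthy"]]

    def _call(url):
        try:
            # Hypothetical worker endpoint; assumed to mirror the router path.
            response = post(f"{url}/update_weights_from_disk", payload)
            return url, {"ok": True, "response": response}
        except Exception as exc:  # one failing worker must not abort the rest
            return url, {"ok": False, "error": str(exc)}

    # Fan out in parallel so one slow worker does not serialize the update.
    with ThreadPoolExecutor(max_workers=max(1, len(healthy_urls))) as pool:
        return dict(pool.map(_call, healthy_urls))
```

Passing `post` in as a parameter keeps the sketch independent of any particular HTTP library and makes it easy to exercise with a stub.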
201160
202161## Acknowledgment
203162
204- This project is derived from [ radixark/miles #544 ] ( https://github.com/radixark/miles/pull/544 ) . Thanks to the original authors for their work .
163+ This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors.
205164
206- ## Notes
165+ The SGLang Diffusion RL team is responsible for the development and maintenance of this project. Our teammates, in alphabetical order:
207166
208- - Quarantined workers are intentionally not auto-reintroduced.
209- - Router responses are fully buffered; streaming passthrough is not implemented.
167+ Banghua Zhu, Chengliang Qian, Chenyang Zhao, Fenglin Yu, Hao Jin, Huapeng Zhou, Jiajun Li, Kangrui Du, Kun Lin, Mao Cheng, Mengyang Liu, Qiujiang Chen, Shenggui Li, Shirui Chen, Shuwen Wang, Xi Chen, Xiaole Guo, Ying Sheng, Yueming Yuan, Yuhao Yang, Yusheng Su, Zhiheng Ye