@@ -4,13 +4,38 @@ A lightweight router for SGLang diffusion workers.

It provides worker registration, load balancing, health checking, and request proxying for diffusion generation APIs.

+ ---
+
+ ## Table of Contents
+
+ - [Highlights](#highlights)
+ - [Installation](#installation)
+ - [Quick Start](#quick-start)
+   - [Start diffusion workers](#start-diffusion-workers)
+   - [Start the router](#start-the-router)
+   - [Test the router](#test-the-router)
+ - [Router API](#router-api)
+   - [Inference Endpoints](#inference-endpoints)
+   - [Videos Result Query](#videos-result-query)
+   - [Model Discovery and Health Checks](#model-discovery-and-health-checks)
+   - [Worker Management APIs](#worker-management-apis)
+   - [Optional (business-dependent)](#optional-business-dependent)
+ - [`update_weights_from_disk` behavior](#update_weights_from_disk-behavior)
+ - [Benchmark Scripts](#benchmark-scripts)
+ - [Acknowledgment](#acknowledgment)
+ - [Notes](#notes)
+
+ ---
+
## Highlights

- `least-request` routing by default, with `round-robin` and `random`.
- Background health checks with quarantine after repeated failures.
- Router APIs for worker registration, health inspection, and proxy forwarding.
- `update_weights_from_disk` broadcast to all healthy workers.

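The three routing policies listed above can be sketched as follows. This is a minimal illustration, not the router's actual implementation: the `Worker` and `Router` classes, their field names, and the tie-breaking rule are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical worker record; field names are illustrative only."""
    url: str
    active_requests: int = 0
    healthy: bool = True


class Router:
    """Hypothetical selector covering the three policy names from the README."""

    def __init__(self, workers, policy="least-request"):
        self.workers = list(workers)
        self.policy = policy
        self._next = 0  # cursor used only by round-robin

    def pick(self):
        # Unhealthy workers are never candidates.
        candidates = [w for w in self.workers if w.healthy]
        if not candidates:
            raise RuntimeError("no healthy workers available")
        if self.policy == "least-request":
            # min() returns the first minimum, so ties go to the earliest worker.
            return min(candidates, key=lambda w: w.active_requests)
        if self.policy == "round-robin":
            worker = candidates[self._next % len(candidates)]
            self._next += 1
            return worker
        if self.policy == "random":
            return random.choice(candidates)
        raise ValueError(f"unknown policy: {self.policy}")
```

With `least-request`, a worker that is busy with a long video job naturally receives fewer new requests, which is one plausible reason for making it the default.
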
+ ---
+
## Installation

From repository root:
@@ -31,6 +56,8 @@ Workers require SGLang diffusion support:
uv pip install "sglang[diffusion]" --prerelease=allow
```

+ ---
+
## Quick Start

### Start diffusion workers
@@ -122,16 +149,48 @@ curl -X POST http://localhost:30081/generate_video \
curl http://localhost:30081/health_workers
```

+ ---
+
## Router API

- - `POST /add_worker`: add worker via query (`?url=`) or JSON body.
- - `GET /list_workers`: list registered workers.
- - `GET /health`: aggregated router health.
- - `GET /health_workers`: per-worker health and active request counts.
- - `POST /generate`: forwards to worker `/v1/images/generations`.
- - `POST /generate_video`: forwards to worker `/v1/videos`.
- - `POST /update_weights_from_disk`: broadcast to healthy workers.
- - `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
+ ### Inference Endpoints
+
+ | Method | Path | Description |
+ | --- | --- | --- |
+ | `POST` | `/v1/images/generations` | Entrypoint for text-to-image generation |
+ | `POST` | `/v1/videos` | Entrypoint for text-to-video generation |
+
+ ### Videos Result Query
+
+ | Method | Path | Description |
+ | --- | --- | --- |
+ | `GET` | `/v1/videos` | List or poll video jobs |
+ | `GET` | `/v1/videos/{video_id}` | Get status/details of a single video job |
+ | `GET` | `/v1/videos/{video_id}/content` | Download generated video content |
+
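The video endpoints in the two tables above suggest a submit, poll, download flow. The sketch below only builds the three requests and does not send them. The helper names and the `prompt` field in the JSON body are assumptions for illustration; the base URL reuses the port from the Quick Start `curl` examples.

```python
import json
from urllib.parse import quote
from urllib.request import Request

ROUTER_URL = "http://localhost:30081"  # port as used in the Quick Start curl examples


def create_video_request(prompt, base_url=ROUTER_URL):
    """POST /v1/videos. The body shape ({"prompt": ...}) is an assumption."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return Request(
        f"{base_url}/v1/videos",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def video_status_request(video_id, base_url=ROUTER_URL):
    """GET /v1/videos/{video_id}, with the id percent-encoded for the path."""
    return Request(f"{base_url}/v1/videos/{quote(video_id, safe='')}", method="GET")


def video_content_request(video_id, base_url=ROUTER_URL):
    """GET /v1/videos/{video_id}/content, with the id percent-encoded for the path."""
    return Request(
        f"{base_url}/v1/videos/{quote(video_id, safe='')}/content", method="GET"
    )
```

Each returned object can be passed to `urllib.request.urlopen` once a router is running. How a job's completion is reported in the status response is not specified here, so the polling condition is left to the caller.
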
+ ### Model Discovery and Health Checks
+
+ | Method | Path | Description |
+ | --- | --- | --- |
+ | `GET` | `/v1/models` | OpenAI-style model discovery |
+ | `GET` | `/health` | Basic health probe |
+
+ ### Worker Management APIs
+
+ | Method | Path | Description |
+ | --- | --- | --- |
+ | `POST` | `/workers` | Register a worker |
+ | `GET` | `/workers` | List workers (including health/load) |
+ | `GET` | `/workers/{worker_id}` | Get worker details |
+ | `PUT` | `/workers/{worker_id}` | Update worker configuration |
+ | `DELETE` | `/workers/{worker_id}` | Deregister a worker |
+
+ ### Optional (business-dependent)
+
+ | Method | Path | Description |
+ | --- | --- | --- |
+ | `POST` | `/update_weights_from_disk` | Reload weights from disk (ops/admin use) |
+

## `update_weights_from_disk` behavior

@@ -165,6 +224,8 @@ Response shape:
}
```

+ ---
+
## Benchmark Scripts

Benchmark scripts are available under `tests/benchmarks/diffusion_router/` and are intended for manual runs.
@@ -190,29 +251,14 @@ SGLANG_USE_MODELSCOPE=TRUE python tests/benchmarks/diffusion_router/bench_routin
 --max-concurrency 4
```

- ## Project Layout
-
- ```text
- .
- ├── docs/
- │   └── update_weights_from_disk.md
- ├── src/sglang_diffusion_routing/
- │   ├── cli/
- │   └── router/
- ├── tests/
- │   ├── benchmarks/
- │   │   └── diffusion_router/
- │   │       ├── bench_router.py
- │   │       └── bench_routing_algorithms.py
- │   └── unit/
- ├── pyproject.toml
- └── README.md
- ```
+ ---

## Acknowledgment

This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors for their work.

+ ---
+
## Notes

- Quarantined workers are intentionally not auto-reintroduced.
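
One way to read the note above is that quarantine is sticky: once a worker crosses the failure threshold, later successful probes do not bring it back. The sketch below illustrates that behavior under stated assumptions. The `WorkerHealth` class, the threshold of 3, and the choice to count consecutive (rather than cumulative) failures are all hypothetical and not taken from the router's code.

```python
from dataclasses import dataclass


@dataclass
class WorkerHealth:
    """Hypothetical per-worker health state; names and threshold are illustrative."""
    failure_threshold: int = 3
    consecutive_failures: int = 0
    quarantined: bool = False

    def record(self, check_passed: bool) -> None:
        """Record the outcome of one background health check."""
        if self.quarantined:
            # Sticky: a quarantined worker stays out even if probes recover.
            return
        if check_passed:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.quarantined = True
```

Under this reading, bringing a worker back is a deliberate operator action rather than something the health checker does on its own.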