Commit 1a4af52

refactor: RESTful and sgl/smg compliant API
1 parent 7e0a78c commit 1a4af52

1 file changed

Lines changed: 72 additions & 26 deletions

README.md

@@ -4,13 +4,38 @@ A lightweight router for SGLang diffusion workers.

 It provides worker registration, load balancing, health checking, and request proxying for diffusion generation APIs.

+---
+
+## Table of Contents
+
+- [Highlights](#highlights)
+- [Installation](#installation)
+- [Quick Start](#quick-start)
+  - [Start diffusion workers](#start-diffusion-workers)
+  - [Start the router](#start-the-router)
+  - [Test the router](#test-the-router)
+- [Router API](#router-api)
+  - [Inference Endpoints](#inference-endpoints)
+  - [Videos Result Query](#videos-result-query)
+  - [Model Discovery and Health Checks](#model-discovery-and-health-checks)
+  - [Worker Management APIs](#worker-management-apis)
+  - [Optional (business-dependent)](#optional-business-dependent)
+- [`update_weights_from_disk` behavior](#update_weights_from_disk-behavior)
+- [Benchmark Scripts](#benchmark-scripts)
+- [Acknowledgment](#acknowledgment)
+- [Notes](#notes)
+
+---
+
 ## Highlights

 - `least-request` routing by default, with `round-robin` and `random`.
 - Background health checks with quarantine after repeated failures.
 - Router APIs for worker registration, health inspection, and proxy forwarding.
 - `update_weights_from_disk` broadcast to all healthy workers.

+---
+
 ## Installation

 From repository root:
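
Annotation on the Highlights lines in the hunk above: the README names three routing policies (`least-request`, `round-robin`, `random`), but this diff does not show how the router implements them. The following is only a minimal sketch of how such policies could be expressed. `Worker` and `pick_worker` are hypothetical names introduced for illustration, not the project's API.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical worker record; not the router's real data model."""
    url: str
    active_requests: int = 0
    healthy: bool = True


_rr_counter = itertools.count()  # shared cursor for the round-robin policy


def pick_worker(workers, policy="least-request"):
    """Choose one healthy worker according to the named policy."""
    candidates = [w for w in workers if w.healthy]
    if not candidates:
        raise RuntimeError("no healthy workers available")
    if policy == "least-request":
        # Fewest in-flight requests wins; ties go to the earliest entry.
        return min(candidates, key=lambda w: w.active_requests)
    if policy == "round-robin":
        return candidates[next(_rr_counter) % len(candidates)]
    if policy == "random":
        return random.choice(candidates)
    raise ValueError(f"unknown policy: {policy}")
```

Under these assumptions, `least-request` needs an accurate in-flight counter per worker, which matches the "active request counts" the old `/health_workers` endpoint is described as reporting.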
@@ -31,6 +56,8 @@ Workers require SGLang diffusion support:
 uv pip install "sglang[diffusion]" --prerelease=allow
 ```

+---
+
 ## Quick Start

 ### Start diffusion workers
@@ -122,16 +149,48 @@ curl -X POST http://localhost:30081/generate_video \
 curl http://localhost:30081/health_workers
 ```

+---
+
 ## Router API

-- `POST /add_worker`: add worker via query (`?url=`) or JSON body.
-- `GET /list_workers`: list registered workers.
-- `GET /health`: aggregated router health.
-- `GET /health_workers`: per-worker health and active request counts.
-- `POST /generate`: forwards to worker `/v1/images/generations`.
-- `POST /generate_video`: forwards to worker `/v1/videos`.
-- `POST /update_weights_from_disk`: broadcast to healthy workers.
-- `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
+### Inference Endpoints
+
+| Method | Path | Description |
+|---|---|---|
+| `POST` | `/v1/images/generations` | Entrypoint for text-to-image generation |
+| `POST` | `/v1/videos` | Entrypoint for text-to-video generation |
+
+### Videos Result Query
+
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/v1/videos` | List or poll video jobs |
+| `GET` | `/v1/videos/{video_id}` | Get status/details of a single video job |
+| `GET` | `/v1/videos/{video_id}/content` | Download generated video content |
+
+### Model Discovery and Health Checks
+
+| Method | Path | Description |
+|---|---|---|
+| `GET` | `/v1/models` | OpenAI-style model discovery |
+| `GET` | `/health` | Basic health probe |
+
+### Worker Management APIs
+
+| Method | Path | Description |
+|---|---|---|
+| `POST` | `/workers` | Register a worker |
+| `GET` | `/workers` | List workers (including health/load) |
+| `GET` | `/workers/{worker_id}` | Get worker details |
+| `PUT` | `/workers/{worker_id}` | Update worker configuration |
+| `DELETE` | `/workers/{worker_id}` | Deregister a worker |
+
+### Optional (business-dependent)
+
+| Method | Path | Description |
+|---|---|---|
+| `POST` | `/update_weights_from_disk` | Reload weights from disk (ops/admin use) |
+

 ## `update_weights_from_disk` behavior
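
Annotation on the Router API hunk above: the commit replaces verb-style paths (`/add_worker`, `/generate`) with resource-style ones (`/workers`, `/v1/images/generations`). The diff lists paths and methods only, so the sketch below is a guess at how a client might address them. The request bodies (`{"url": ...}`, `{"prompt": ...}`), the worker address, and the ID `video_123` are assumptions made for illustration; only the router address `http://localhost:30081` comes from the README's curl examples. `build_request` is a hypothetical helper that constructs requests without sending them.

```python
import json
import urllib.request

ROUTER = "http://localhost:30081"  # router address from the README's curl examples


def build_request(method, path, payload=None, base_url=ROUTER):
    """Build (but do not send) a JSON request for a router endpoint."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )


# Register a worker. The {"url": ...} body and worker address are assumptions.
register = build_request("POST", "/workers", {"url": "http://localhost:30000"})

# Text-to-image request. The "prompt" field is assumed from the OpenAI-style path.
generate = build_request("POST", "/v1/images/generations", {"prompt": "a red fox"})

# Poll a single video job; "video_123" is a placeholder ID.
poll = build_request("GET", "/v1/videos/video_123")
```

Against a running router, any of these objects can be passed to `urllib.request.urlopen`. The response shapes are not shown in this diff, so they are left unparsed here.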

@@ -165,6 +224,8 @@ Response shape:
 }
 ```

+---
+
 ## Benchmark Scripts

 Benchmark scripts are available under `tests/benchmarks/diffusion_router/` and are intended for manual runs.
@@ -190,29 +251,14 @@ SGLANG_USE_MODELSCOPE=TRUE python tests/benchmarks/diffusion_router/bench_routin
 --max-concurrency 4
 ```

-## Project Layout
-
-```text
-.
-├── docs/
-│   └── update_weights_from_disk.md
-├── src/sglang_diffusion_routing/
-│   ├── cli/
-│   └── router/
-├── tests/
-│   ├── benchmarks/
-│   │   └── diffusion_router/
-│   │       ├── bench_router.py
-│   │       └── bench_routing_algorithms.py
-│   └── unit/
-├── pyproject.toml
-└── README.md
-```
+---

 ## Acknowledgment

 This project is derived from [radixark/miles#544](https://github.com/radixark/miles/pull/544). Thanks to the original authors for their work.

+---
+
 ## Notes

 - Quarantined workers are intentionally not auto-reintroduced.
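
Annotation on the Notes line above, together with the "quarantine after repeated failures" highlight: the diff does not show the health-check logic, so the following is a sketch under stated assumptions rather than the router's implementation. It assumes that "repeated" means consecutive failures and uses an arbitrary threshold of 3. `HealthTracker` and `release` are hypothetical names; the README does not say how a quarantined worker is brought back.

```python
from dataclasses import dataclass, field


@dataclass
class HealthTracker:
    """Hypothetical failure counter; the threshold of 3 is an assumed default."""
    max_failures: int = 3
    failures: dict = field(default_factory=dict)
    quarantined: set = field(default_factory=set)

    def record(self, url, ok):
        """Record one health-check result for a worker URL."""
        if url in self.quarantined:
            return  # quarantine is sticky: later successes do not restore it
        if ok:
            self.failures[url] = 0  # a success resets the consecutive count
            return
        self.failures[url] = self.failures.get(url, 0) + 1
        if self.failures[url] >= self.max_failures:
            self.quarantined.add(url)

    def release(self, url):
        """Explicit operator action to bring a worker back (assumed)."""
        self.quarantined.discard(url)
        self.failures[url] = 0
```

The early return in `record` is what models "not auto-reintroduced": once a worker is quarantined, a passing probe changes nothing until something explicit removes it from the set.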
