[RFC] refactor: RESTful and sgl/smg compliant API #17

Merged
zhaochenyang20 merged 5 commits into zhaochenyang20:main from alphabetc1:refactor/api
Feb 27, 2026

Conversation

@alphabetc1
Collaborator

to #13

@gemini-code-assist
Contributor

Summary of Changes

Hello @alphabetc1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the router's API to align with RESTful principles and SGLang/SMG compliance. The changes introduce a more organized and standardized set of endpoints for interacting with the diffusion workers, covering inference, worker management, and health monitoring. The accompanying documentation has been thoroughly updated to reflect these new API specifications, making the router easier to understand and integrate.

Highlights

  • API Refactoring: The router's API has been refactored to be more RESTful and compliant with SGLang/SMG standards, introducing structured endpoints for inference, video result queries, model discovery, health checks, and worker management.
  • Documentation Enhancement: The README.md file has been significantly updated to include a comprehensive Table of Contents and detailed documentation for the new API endpoints, improving clarity and usability.
  • Router Features: Existing router features such as least-request routing, background health checks with quarantine, and update_weights_from_disk broadcasting remain core functionalities, now better documented within the new API structure.
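The least-request routing and quarantine behaviour mentioned in the last highlight can be illustrated with a minimal sketch. This is not the router's actual code: the `Worker` dataclass, the `pick_worker` helper, and the worker URLs are hypothetical, and only the `is_dead` flag name is borrowed from the README diff discussed later in this conversation.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    url: str                  # hypothetical worker address
    active_requests: int = 0  # in-flight requests currently routed to it
    is_dead: bool = False     # True while the worker is quarantined


def pick_worker(workers: list[Worker]) -> Worker:
    """Return the non-quarantined worker with the fewest in-flight requests."""
    healthy = [w for w in workers if not w.is_dead]
    if not healthy:
        raise RuntimeError("no healthy workers available")
    # min() keeps the first of equally loaded workers, so ties follow list order.
    return min(healthy, key=lambda w: w.active_requests)


workers = [
    Worker("http://worker-a:30000", active_requests=3),
    Worker("http://worker-b:30000", active_requests=1, is_dead=True),
    Worker("http://worker-c:30000", active_requests=2),
]
chosen = pick_worker(workers)
chosen.active_requests += 1  # account for the request about to be forwarded
```

Here `worker-b` has the lowest load but is skipped because it is quarantined, so `worker-c` is selected.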

Changelog

  • README.md
    • Added a comprehensive Table of Contents for improved navigation.
    • Introduced detailed sections for Router API, categorizing endpoints into Inference, Video Result Query, Model Discovery and Health Checks, Worker Management, and Optional APIs.
    • Replaced the previous bulleted list of API endpoints with structured tables, providing method, path, and description for each endpoint.
    • Removed the 'Project Layout' section.

Activity

  • No human activity has been recorded on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request proposes a refactoring of the API to be more RESTful, as documented in the README.md. While the new API design is a good step forward, this PR only contains documentation changes and lacks the corresponding implementation. This has led to a few issues:

  1. The README.md is now out of sync with the actual code, documenting an API that doesn't exist.
  2. The README.md is internally inconsistent, with the Quick Start section using the old API endpoints while the Router API section describes the new ones.
  3. The helpful Project Layout section has been removed.

Given this is an RFC, my feedback is that the API design is good, but it should be implemented. The documentation should not be merged in this state without the code, as it's misleading. I've added specific comments on the README.md.

Comment thread README.md
Comment on lines 154 to +193
## Router API

- `POST /add_worker`: add worker via query (`?url=`) or JSON body.
- `GET /list_workers`: list registered workers.
- `GET /health`: aggregated router health.
- `GET /health_workers`: per-worker health and active request counts.
- `POST /generate`: forwards to worker `/v1/images/generations`.
- `POST /generate_video`: forwards to worker `/v1/videos`.
- `POST /update_weights_from_disk`: broadcast to healthy workers.
- `GET|POST|PUT|DELETE /{path}`: catch-all proxy forwarding.
### Inference Endpoints

| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/images/generations` | Entrypoint for text-to-image generation |
| `POST` | `/v1/videos` | Entrypoint for text-to-video generation |

### Videos Result Query

| Method | Path | Description |
|---|---|---|
| `GET` | `/v1/videos` | List or poll video jobs |
| `GET` | `/v1/videos/{video_id}` | Get status/details of a single video job |
| `GET` | `/v1/videos/{video_id}/content` | Download generated video content |

### Model Discovery and Health Checks

| Method | Path | Description |
|---|---|---|
| `GET` | `/v1/models` | OpenAI-style model discovery |
| `GET` | `/health` | Basic health probe |

### Worker Management APIs

| Method | Path | Description |
|---|---|---|
| `POST` | `/workers` | Register a worker |
| `GET` | `/workers` | List workers (including health/load) |
| `GET` | `/workers/{worker_id}` | Get worker details |
| `PUT` | `/workers/{worker_id}` | Update worker configuration |
| `DELETE` | `/workers/{worker_id}` | Deregister a worker |

### Optional (business-dependent)

| Method | Path | Description |
|---|---|---|
| `POST` | `/update_weights_from_disk` | Reload weights from disk (ops/admin use) |

Contributor

critical

This new API documentation is a great improvement in terms of being RESTful and well-structured. However, the code in this repository has not been updated to implement these new endpoints. The router still uses the old endpoints (e.g., /add_worker, /generate). This makes the documentation inaccurate. The implementation should be updated to match this documentation, or this documentation should be marked as a proposal for a future version.
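One way to bridge the gap while the implementation catches up is a small translation table from the old routes to the documented ones. The sketch below is only an illustration: the old-to-new correspondence is an assumption inferred from the two endpoint lists in this thread (for example, that `/health_workers` is covered by `GET /workers`), and `LEGACY_TO_NEW` and `translate` are hypothetical names, not part of the router.

```python
# Assumed old -> new correspondence, inferred from the endpoint lists in this thread.
LEGACY_TO_NEW = {
    ("POST", "/add_worker"): ("POST", "/workers"),
    ("GET", "/list_workers"): ("GET", "/workers"),
    ("GET", "/health_workers"): ("GET", "/workers"),
    ("POST", "/generate"): ("POST", "/v1/images/generations"),
    ("POST", "/generate_video"): ("POST", "/v1/videos"),
}


def translate(method: str, path: str) -> tuple[str, str]:
    """Map a legacy route to its assumed replacement; pass other routes through."""
    key = (method.upper(), path)
    return LEGACY_TO_NEW.get(key, key)
```

Routes that appear in both lists, such as `GET /health` and `POST /update_weights_from_disk`, pass through unchanged.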

Comment thread README.md Outdated
Comment on lines +13 to +16
- [Quick Start](#quick-start)
- [Start diffusion workers](#start-diffusion-workers)
- [Start the router](#start-the-router)
- [Test the router](#test-the-router)
Contributor

high

The Quick Start section, which is linked here, still uses the old API endpoints (e.g., /add_worker, /generate). This is inconsistent with the new Router API section. The examples in Quick Start should be updated to use the new API endpoints for consistency.
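For illustration, updated Quick Start calls against the new endpoints might look roughly like the sketch below. The router address, the worker address, and both JSON body shapes (`{"url": ...}` for registration and `{"prompt": ...}` for generation) are assumptions, not taken from the README; the requests are only constructed here, not sent.

```python
import json
import urllib.request

ROUTER = "http://localhost:30080"  # hypothetical router address


def json_request(method: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the router."""
    return urllib.request.Request(
        url=ROUTER + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


# Register a worker; the {"url": ...} body shape is an assumption.
register = json_request("POST", "/workers", {"url": "http://localhost:30000"})

# Request an image; the {"prompt": ...} field is an assumption.
generate = json_request("POST", "/v1/images/generations", {"prompt": "a red bicycle"})

# Sending is left to the caller, e.g. urllib.request.urlopen(register).
```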

Comment thread README.md
Comment on lines 193 to -210
@@ -165,6 +224,8 @@
}
```

---

## Benchmark Scripts

Benchmark scripts are available under `tests/benchmarks/diffusion_router/` and are intended for manual runs.
@@ -190,29 +251,14 @@
--max-concurrency 4
```

## Project Layout

```text
.
├── docs/
│ └── update_weights_from_disk.md
├── src/sglang_diffusion_routing/
│ ├── cli/
│ └── router/
├── tests/
│ ├── benchmarks/
│ │ └── diffusion_router/
│ │ ├── bench_router.py
│ │ └── bench_routing_algorithms.py
│ └── unit/
├── pyproject.toml
└── README.md
```
Contributor

medium

This Project Layout section provided a helpful overview of the repository structure. Its removal might make it harder for new contributors to navigate the codebase. Please consider restoring it or moving this information to a contributing guide.

Comment thread README.md Outdated
- `is_dead` (boolean): quarantine (`true`) or recover (`false`) this worker.
- `refresh_video_support` (boolean): re-probe worker `/v1/models` capability.

### Optional (business-dependent)
Owner

This is an RL-related API.

@zhaochenyang20
Owner

  1. I fixed the bug that raised [sglang-d-router] error: 'FastAPI' object has no attribute 'add_event_handler'.
  2. I refactored the code so that the router starts serving only after all the servers have started.
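
The startup ordering in point 2 can be sketched as a readiness gate that blocks until every worker passes a probe. This is a minimal illustration under stated assumptions, not the code merged in this PR: `wait_for_workers`, its parameters, the fake probe, and the worker URLs are all hypothetical.

```python
import time
from typing import Callable, Iterable


def wait_for_workers(
    urls: Iterable[str],
    is_ready: Callable[[str], bool],
    timeout_s: float = 60.0,
    poll_interval_s: float = 0.5,
) -> None:
    """Block until is_ready(url) is true for every worker, or raise on timeout."""
    pending = set(urls)
    deadline = time.monotonic() + timeout_s
    while pending:
        # Keep only the workers that still fail the readiness probe.
        pending = {u for u in pending if not is_ready(u)}
        if not pending:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"workers not ready: {sorted(pending)}")
        time.sleep(poll_interval_s)


# Demo with a fake probe: each worker reports ready on its second check.
calls: dict[str, int] = {}


def fake_probe(url: str) -> bool:
    calls[url] = calls.get(url, 0) + 1
    return calls[url] >= 2


wait_for_workers(["http://w1", "http://w2"], fake_probe, timeout_s=5, poll_interval_s=0.01)
# Only after this returns would the router begin serving.
```

Injecting the probe as a callable keeps the gate independent of how readiness is actually checked (for example, an HTTP call to a worker health endpoint).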

@zhaochenyang20
Owner

image

Made a banner

@zhaochenyang20 zhaochenyang20 merged commit 74604ca into zhaochenyang20:main Feb 27, 2026
2 checks passed
@alphabetc1 alphabetc1 deleted the refactor/api branch February 27, 2026 09:53