[router] Validate video generation before slection; Support selctive worker choice by zhaochenyang20 · Pull Request #14 · zhaochenyang20/sglang-diffusion-routing

zhaochenyang20 · 2026-02-22T18:04:49Z

This PR fixes a routing mismatch where /generate_video could be sent to image-only workers (e.g., Qwen/Qwen-Image), causing confusing “video request but image output” behavior.

Added per-worker capability cache: worker_video_support (True/False/None). On worker registration (add_worker and the initial CLI --worker-urls), the router probes /v1/models once and stores whether the worker is video-capable.
Updated /generate_video to route only to workers whose cached support is True; if none exist, it returns 400 with a clear error.
Generalized routing path: Renamed _use_url to _select_worker_by_routing. Extended _forward_to_worker(..., worker_urls=None) so routing can be restricted to a worker subset.
Updated docs: Removed misleading generate_video example using Qwen/Qwen-Image. Clarified /generate_video behavior for image-only workers.
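The filtering rule described above can be sketched roughly as follows. This is a minimal illustration, not the router's actual code: the worker URLs, the helper names video_capable_workers and check_video_request, and the error text are all hypothetical; only the worker_video_support name and the True/False/None states come from the description.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the router's cache: URL -> True / False / None (unknown).
worker_video_support: Dict[str, Optional[bool]] = {
    "http://worker-a:30000": True,
    "http://worker-b:30000": False,  # image-only worker
    "http://worker-c:30000": None,   # probe has not succeeded yet
}


def video_capable_workers(cache: Dict[str, Optional[bool]]) -> List[str]:
    """Keep only workers whose cached support is exactly True."""
    return [url for url, supported in cache.items() if supported is True]


def check_video_request(cache: Dict[str, Optional[bool]]) -> Tuple[int, dict]:
    """Return (status_code, payload) for a /generate_video style request."""
    candidates = video_capable_workers(cache)
    if not candidates:
        # Hypothetical error text; the PR only states that a 400 is returned.
        return 400, {"error": "no video-capable workers are registered"}
    return 200, {"candidates": candidates}
```

Treating None (unknown) the same as False keeps unprobed workers out of the video path until a probe succeeds.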

Keep a clean extension point for future scenarios where one router manages heterogeneous models and requests can target selected worker groups.
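As a rough sketch of what restricting routing to a worker subset could look like: the function below takes an optional worker_urls argument, as the description says _select_worker_by_routing and _forward_to_worker now do. The least-loaded policy, the load mapping, and the function name select_worker are assumptions for illustration; the PR does not state the routing policy here.

```python
from typing import Dict, Iterable, Optional


def select_worker(
    load: Dict[str, int],
    worker_urls: Optional[Iterable[str]] = None,
) -> str:
    """Pick the least-loaded worker, optionally restricted to worker_urls.

    `load` maps worker URL -> in-flight request count (hypothetical metric).
    """
    if worker_urls is None:
        candidates = list(load)
    else:
        # Ignore URLs that are not registered workers.
        candidates = [url for url in worker_urls if url in load]
    if not candidates:
        raise RuntimeError("no eligible workers")
    return min(candidates, key=lambda url: load[url])
```

With worker_urls=None the behavior is unchanged for existing callers, which is what makes the extra parameter a low-risk extension point.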

gemini-code-assist · 2026-02-22T18:05:06Z

Summary of Changes

Hello @zhaochenyang20, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the router's intelligence by preventing video generation requests from being misdirected to image-only workers, thereby improving user experience and system reliability. It introduces a mechanism to identify and cache worker capabilities, enabling more precise routing decisions. Additionally, the changes generalize the routing infrastructure to support more flexible worker selection based on specific request requirements.

Highlights

Worker Capability Caching: Implemented a per-worker capability cache (worker_video_support) to track if a worker supports video generation, populated during registration and initial CLI setup.
Video Request Routing Enforcement: Ensured that /generate_video requests are routed exclusively to workers marked as video-capable, returning a 400 error if no such workers are available.
Generalized Routing Path: Refactored the worker selection logic by renaming _use_url to _select_worker_by_routing and extending _forward_to_worker to allow routing to a specified subset of workers.
Documentation Updates: Updated documentation by removing a misleading generate_video example that used an image-only model and clarified the behavior of /generate_video for image-only workers.

Changelog

README.md
- Removed an outdated curl example for video generation that incorrectly used an image-only model.
- Updated the description for /generate_video endpoint to mention rejection of image-only workers.
- Removed SGLANG_USE_MODELSCOPE=TRUE environment variable from benchmark commands.
sglang
- Added sglang as a Git subproject.
src/sglang_diffusion_routing/cli/main.py
- Imported the asyncio module.
- Added a call to router._refresh_worker_video_support when workers are initially registered via CLI arguments.
src/sglang_diffusion_routing/router/diffusion_router.py
- Defined _IMAGE_TASK_TYPES constant for identifying image-only task types.
- Introduced worker_video_support dictionary to store video capability for each worker.
- Renamed _use_url to _select_worker_by_routing and added an optional worker_urls parameter to filter candidate workers.
- Modified _forward_to_worker to accept an optional worker_urls parameter for targeted routing.
- Implemented _probe_worker_video_support to asynchronously query worker capabilities via /v1/models.
- Implemented _refresh_worker_video_support to update the cached video capability for a worker.
- Updated the generate_video endpoint to filter for video-capable workers and return a 400 error if none are found.
- Initialized worker_video_support to None when a new worker is registered in register_worker.
- Triggered _refresh_worker_video_support after a worker is added via the add_worker API endpoint.
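To make the tri-state probe result concrete, here is a hedged sketch of how a decoded /v1/models body might be mapped to True, False, or None. The response shape (a "data" list whose entries carry a "task_type" field) and the values placed in _IMAGE_TASK_TYPES are assumptions; this page does not show the real payload or the real constant's contents.

```python
from typing import Any, Optional

# Assumption: placeholder values only; the real _IMAGE_TASK_TYPES is not shown here.
_IMAGE_TASK_TYPES = frozenset({"T2I", "I2I"})


def classify_video_support(payload: Any) -> Optional[bool]:
    """Map a decoded /v1/models body to True / False / None (unknown)."""
    if not isinstance(payload, dict):
        return None
    models = payload.get("data")  # assumed field name
    if not isinstance(models, list) or not models:
        return None
    task_types = [m.get("task_type") for m in models if isinstance(m, dict)]
    known = [t for t in task_types if isinstance(t, str)]
    if not known:
        # Nothing usable was reported, so the capability stays unknown.
        return None
    # Video-capable if any reported task type is not an image-only type.
    return any(t not in _IMAGE_TASK_TYPES for t in known)
```

Returning None for malformed or empty bodies, instead of guessing False, lets a later refresh correct the entry.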

Activity

No specific activity (comments, reviews, progress updates) has been recorded for this pull request yet.


gemini-code-assist

Code Review

This PR introduces a valuable feature for routing video generation requests to capable workers, which is a great improvement. However, a critical security audit identified a significant Server-Side Request Forgery (SSRF) vulnerability in the worker registration process. The validation logic in _normalize_worker_url is incomplete, allowing registration of internal network addresses. This, combined with the lack of authentication on management endpoints like /add_worker, could allow an attacker to use the router as a proxy to attack internal services. Additionally, there is one critical issue and a couple of suggestions for improvement regarding performance and robustness, along with minor typos in the pull request title ('slection' -> 'selection', 'selctive' -> 'selective').
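One way to tighten URL validation against the SSRF concern raised above is sketched below. It is an illustration under stated assumptions, not the project's _normalize_worker_url: the function name is_safe_worker_url is hypothetical, and the policy of rejecting every private or loopback address is an assumption that may not suit deployments whose workers live on a private network.

```python
import ipaddress
from urllib.parse import urlsplit


def is_safe_worker_url(url: str) -> bool:
    """Reject non-HTTP schemes and IP literals in internal address ranges."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # e.g. malformed IPv6 literal
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addr = ipaddress.ip_address(parts.hostname)
    except ValueError:
        # A hostname, not an IP literal. A fuller check would resolve it and
        # validate every resolved address; this sketch only blocks "localhost".
        return parts.hostname.lower() != "localhost"
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )
```

If workers are expected to run on localhost or a private subnet, an explicit allowlist of permitted hosts or networks, together with authentication on /add_worker, would likely be a better fit than a blanket block.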

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

dreamyang-liu · 2026-02-22T19:10:01Z

+    if refresh_tasks:
+
+        async def _refresh_all_worker_video_support() -> None:
+            await asyncio.gather(*refresh_tasks)
+


Should we make this refresh task a periodic check, like a health check? One scenario I have in mind: someone shuts down a worker and launches a new worker at the same worker URL, in which case the router's cached data would be stale.

Another reason to support updating type via /health: during the startup phase, _probe_worker_video_support may fail, which would prevent the worker from receiving any subsequent requests. However, if we continuously probe via /health, the worker can automatically rejoin the routing pool and handle requests once it recovers.
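The periodic refresh suggested here might look roughly like the sketch below. The names refresh_once and refresh_forever, the injected probe callable, and the 30-second default interval are hypothetical; the real _probe_worker_video_support signature is not shown on this page.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

ProbeFn = Callable[[str], Awaitable[Optional[bool]]]


async def refresh_once(cache: Dict[str, Optional[bool]], probe: ProbeFn) -> None:
    """Re-probe every cached worker concurrently and update the cache."""
    urls = list(cache)
    results = await asyncio.gather(
        *(probe(url) for url in urls), return_exceptions=True
    )
    for url, result in zip(urls, results):
        if url not in cache:
            continue  # worker was removed while probes were in flight
        # A failed probe resets the entry to None (unknown) instead of
        # leaving possibly stale data in place.
        cache[url] = None if isinstance(result, BaseException) else result


async def refresh_forever(
    cache: Dict[str, Optional[bool]], probe: ProbeFn, interval_s: float = 30.0
) -> None:
    """Run refresh_once on a fixed interval until the task is cancelled."""
    while True:
        await refresh_once(cache, probe)
        await asyncio.sleep(interval_s)
```

Because each pass overwrites the cached value, a worker whose startup probe failed would rejoin the video pool on the first pass after it recovers, and a replacement worker at the same URL would be reclassified.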

alphabetc1 · 2026-02-22T19:19:41Z

Noticed the sglang submodule update. Should we also add git submodule update --init --recursive to the README.md Installation section?

alphabetc1 · 2026-02-22T19:59:47Z

Maybe we should move the worker type probing logic into register_worker, so we don't need to probe separately for both initialization and add_worker scenarios. Also, cli/main.py is primarily for CLI handling, so placing router-internal logic here feels a bit odd.
Additionally, we could make type a basic worker attribute and return it in /list_workers @zhaochenyang20
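A compact sketch of this suggestion, with probing folded into a single registration path and the capability exposed when listing workers. The WorkerRegistry class, its field names, and the synchronous probe callable are assumptions made to keep the example short; the real router probes asynchronously.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class WorkerRegistry:
    # Hypothetical synchronous probe: URL -> True / False / None (unknown).
    probe: Callable[[str], Optional[bool]]
    video_support: Dict[str, Optional[bool]] = field(default_factory=dict)

    def register_worker(self, url: str) -> None:
        """Single entry point: CLI startup and /add_worker would both call this."""
        self.video_support[url] = self.probe(url)

    def list_workers(self) -> List[dict]:
        """Expose the capability next to each URL, as /list_workers might."""
        return [
            {"url": url, "video_support": supported}
            for url, supported in self.video_support.items()
        ]
```

Keeping the probe inside register_worker means cli/main.py would only pass URLs along and would not need to know about capability refresh at all.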

validate video generation; support selective routing to limited workers

dc790d9

zhaochenyang20 requested a review from alphabetc1 February 22, 2026 18:04

gemini-code-assist Bot reviewed Feb 22, 2026


dreamyang-liu reviewed Feb 22, 2026


Comment thread src/sglang_diffusion_routing/router/diffusion_router.py Outdated

zhaochenyang20 and others added 4 commits February 22, 2026 10:53

Update src/sglang_diffusion_routing/router/diffusion_router.py

8980b54

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fix gemni feedback

39affca

upd local

e0c8825

remove diff

7530682

dreamyang-liu reviewed Feb 22, 2026


alphabetc1 requested changes Feb 22, 2026


Comment thread src/sglang_diffusion_routing/router/diffusion_router.py

alphabetc1 reviewed Feb 22, 2026


Comment thread src/sglang_diffusion_routing/cli/main.py Outdated

alphabetc1 reviewed Feb 22, 2026


Comment thread src/sglang_diffusion_routing/cli/main.py Outdated

alphabetc1 reviewed Feb 22, 2026


Comment thread src/sglang_diffusion_routing/router/diffusion_router.py Outdated

use submodule and adds implmented update weights from disk api

283c3f2

zhaochenyang20 mentioned this pull request Feb 23, 2026

[Question] Tests is a little bit fake #5

Closed

zhaochenyang20 added 4 commits February 22, 2026 20:12

rename use_url to _select_worker_by_routing

b80b92b

fix feedback from shuwen

1bbdd06

fix layout

faed1c3

remove output

5fab8c0

zhaochenyang20 mentioned this pull request Feb 23, 2026

[Refactor] Formalized register worker #19

Closed

zhaochenyang20 merged commit 9b46df9 into main Feb 23, 2026
1 check passed

alphabetc1 mentioned this pull request Feb 23, 2026

[Bug] Video Generation Endpoint is wrong #10

Closed

zhaochenyang20 deleted the validate_video_generation branch February 23, 2026 19:20

alphabetc1 mentioned this pull request Feb 24, 2026

fix: prevent /generate 502 caused by event loop mismatch + add e2e tests #33

Merged

5 tasks

zhaochenyang20 merged 10 commits into main from validate_video_generation
