[router] Validate video generation before slection; Support selctive worker choice#14
Conversation
Summary of ChangesHello @zhaochenyang20, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the router's intelligence by preventing video generation requests from being misdirected to image-only workers, thereby improving user experience and system reliability. It introduces a mechanism to identify and cache worker capabilities, enabling more precise routing decisions. Additionally, the changes generalize the routing infrastructure to support more flexible worker selection based on specific request requirements. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This PR introduces a valuable feature for routing video generation requests to capable workers, which is a great improvement. However, a critical security audit identified a significant Server-Side Request Forgery (SSRF) vulnerability in the worker registration process. The validation logic in _normalize_worker_url is incomplete, allowing registration of internal network addresses. This, combined with the lack of authentication on management endpoints like /add_worker, could allow an attacker to use the router as a proxy to attack internal services. Additionally, there is one critical issue and a couple of suggestions for improvement regarding performance and robustness, along with minor typos in the pull request title ('slection' -> 'selection', 'selctive' -> 'selective').
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
| if refresh_tasks: | ||
|
|
||
| async def _refresh_all_worker_video_support() -> None: | ||
| await asyncio.gather(*refresh_tasks) | ||
|
|
There was a problem hiding this comment.
Should we make this refresh task something like health check (periodical check)? One scenario I'm thinking is someone shutdown the worker and launch a new worker with same worker url, then the data from router would be stale.
There was a problem hiding this comment.
Another reason to support updating type via /health: during the startup phase, _probe_worker_video_support may fail, which would prevent the worker from receiving any subsequent requests. However, if we continuously probe via /health, the worker can automatically rejoin the routing pool and handle requests once it recovers.
|
Noticed the sglang submodule update. Should we also add |
|
Maybe we should move the worker type probing logic into |
This PR fixes a routing mismatch where /generate_video could be sent to image-only workers (e.g.,
Qwen/Qwen-Image), causing confusing “video request but image output” behavior.worker_video_support(True/False/None): On worker registration (add_workerand initial CLI--worker-urls), router probes/v1/modelsonce and stores whether the worker is video-capable./generate_videoto route only to workers with support is True; if none exist, return 400 with a clear error._use_urlto_select_worker_by_routing. Extended_forward_to_worker(..., worker_urls=None)so routing can be restricted to a worker subset.generate_videoexample usingQwen/Qwen-Image. Clarified/generate_videobehavior for image-only workers.Keep a clean extension point for future scenarios where one router manages heterogeneous models and requests can target selected worker groups.