fix: prevent /generate 502 caused by event loop mismatch + add e2e tests by alphabetc1 · Pull Request #33 · zhaochenyang20/sglang-diffusion-routing

alphabetc1 · 2026-02-24T07:20:10Z

Motivation

1) Fix `/generate` returning 502 (introduced by #14)

There’s a bug caused by #14 where /generate may return 502.

Root cause (in plain English):

`asyncio.run()` spins up a temporary event loop and closes it once the coroutine finishes.
`httpx.AsyncClient` was created and used inside that temporary loop.
Later, when Uvicorn starts serving requests, it runs on a different event loop.
So when the router forwards a request, it uses an async client tied to the old, already closed loop and fails with "Event loop is closed".
The router catches that exception and returns 502.
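
The mechanism can be reproduced with the standard library alone. The sketch below is an illustration, not the router's code: `LoopBoundClient` is a hypothetical stand-in for any client that binds to the loop it was created on, and the two `asyncio.run()` calls stand in for the startup code and the serving loop.

```python
import asyncio


class LoopBoundClient:
    """Hypothetical stand-in for an async client tied to one event loop."""

    def __init__(self) -> None:
        # Captures whichever loop is running at construction time.
        self._loop = asyncio.get_running_loop()

    async def get(self, url: str) -> str:
        fut = self._loop.create_future()
        # call_soon() on a closed loop raises RuntimeError("Event loop is closed").
        self._loop.call_soon(fut.set_result, f"ok:{url}")
        return await fut


async def build_client() -> LoopBoundClient:
    return LoopBoundClient()


async def serve_one_request(client: LoopBoundClient) -> str:
    try:
        return await client.get("/health")
    except RuntimeError as exc:
        # Mirrors the router turning the exception into a 502.
        return f"502: {exc}"


def buggy() -> str:
    client = asyncio.run(build_client())  # temporary loop, closed on return
    return asyncio.run(serve_one_request(client))  # runs on a different loop


async def _create_and_serve() -> str:
    client = LoopBoundClient()  # created on the loop that will use it
    return await serve_one_request(client)


def fixed() -> str:
    return asyncio.run(_create_and_serve())
```

Under these assumptions, `buggy()` returns `"502: Event loop is closed"`, while `fixed()` returns `"ok:/health"` because the client is created on the same loop that later uses it.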

2) Add E2E tests

Added:

test_e2e_sglang.py: real E2E, but won’t be wired into CI for now (it’s heavier / depends on external runtime).
test_fake_e2e.py: lightweight fake E2E, and will run in CI to cover the basic end-to-end flow.

See development.md for usage / how to run them locally.

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code with pre-commit run --all-files.
Add or update unit tests if applicable.
Update documentation if applicable.
Provide accuracy and performance benchmark results if applicable.
Ensure all CI checks pass.

Review Process

Get approvals from maintainers and other reviewers.
Ensure all CI tests pass.
After green CI and required approvals, ask maintainers to merge.

gemini-code-assist · 2026-02-24T07:20:33Z

Summary of Changes

Hello @alphabetc1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a bug in the video support refresh mechanism by moving it to an asynchronous background task, preventing router startup delays. It significantly enhances testing capabilities by introducing a new end-to-end test suite for real SGLang workers and a fake worker for CPU-only integration tests, ensuring the router's stability and correct behavior across various scenarios.

Highlights

Video Support Refresh Bugfix: The video support refresh logic, previously blocking router startup in main.py, has been refactored. It is now asynchronously integrated into the router's background health check loop in diffusion_router.py, ensuring non-blocking and continuous probing of worker video capabilities.
New End-to-End (E2E) Test Suite: A comprehensive E2E test suite (tests/e2e/test_e2e_sglang.py) has been introduced to validate the router's functionality with real SGLang diffusion workers. These tests cover image and video generation, load balancing, and proxying, with automatic skipping if SGLang or GPU resources are not available.
Fake Worker for Unit E2E Testing: A fake_worker.py module was added to simulate SGLang diffusion workers. This enables robust, CPU-only end-to-end tests (tests/unit/test_fake_e2e.py) for the router without requiring actual GPU hardware or SGLang installations, improving test coverage and development efficiency.
Router Robustness for Video Generation: The generate_video endpoint in the router now explicitly checks if any workers are registered. If no workers are available, it returns a 503 Service Unavailable error, preventing potential issues and providing clearer feedback.
Updated Test Infrastructure and Documentation: The development.md file has been updated with instructions for running the new E2E tests, and pyproject.toml has been configured to include the new E2E test paths for pytest discovery.

Changelog

development.md
- Updated documentation to include instructions for running real E2E tests, specifying GPU requirements and explicit command.
pyproject.toml
- Configured pytest to discover and run tests from the new tests/e2e directory in addition to tests/unit.
src/sglang_diffusion_routing/cli/main.py
- Removed the synchronous refresh_worker_video_support calls during router initialization to prevent blocking startup.
src/sglang_diffusion_routing/router/diffusion_router.py
- Refactored the video support capability probing for pre-registered workers into the asynchronous _start_background_health_check loop.
- Added a check in generate_video to return a 503 status code if no workers are registered in the pool.
tests/conftest.py
- Added a pytest configuration file to ensure the local src directory is prioritized in sys.path for test imports.
tests/e2e/test_e2e_sglang.py
- Added a new end-to-end test suite for the SGLang diffusion router, covering image generation, video generation, load balancing, and worker health checks.
- Implemented helper functions for managing SGLang worker processes, checking health, and handling environment variables.
- Included skip conditions for tests if SGLang is not installed or no GPU is available.
tests/unit/fake_worker.py
- Added a FastAPI application that simulates an SGLang diffusion worker, providing mock responses for image and video generation, health checks, and weight updates.
- Included configurable parameters for simulated failure rates and latencies to test router robustness.
tests/unit/test_fake_e2e.py
- Implemented CPU-only end-to-end tests using the fake worker, validating router functionality such as health, worker registration, image/video generation, weight updates, load balancing (round-robin and least-request), proxying, and worker failure scenarios.

gemini-code-assist

Code Review

This pull request introduces a bugfix for refreshing video support in workers and adds an extensive suite of end-to-end tests. While the changes improve the robustness of the worker probing logic, a Server-Side Request Forgery (SSRF) vulnerability was identified in the new background health check due to insufficient validation of worker URLs. Additionally, there is a minor suggestion regarding import style in the new test files.

gemini-code-assist · 2026-02-24T07:23:39Z

+            await asyncio.gather(
+                *(self.refresh_worker_video_support(url) for url in unknown_workers),
+                return_exceptions=True,
+            )


The _start_background_health_check function initiates network requests to worker URLs that are not sufficiently validated. The validation logic in normalize_worker_url (used when workers are registered) does not block private IP addresses (RFC 1918) and can be bypassed using a trailing dot in the hostname (e.g., 169.254.169.254.). This allows an attacker to probe internal network services or cloud metadata endpoints by registering malicious worker URLs via the /add_worker endpoint.

To remediate this, ensure that normalize_worker_url strictly validates that the hostname is not a private or loopback IP address and correctly handles trailing dots in hostnames.
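
A minimal sketch of the kind of check the review describes, using only the standard library. The function name `is_blocked_worker_host` is hypothetical, and this is not the project's `normalize_worker_url`; it only shows how a trailing dot and literal private, loopback, or link-local addresses could be detected.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_worker_host(url: str) -> bool:
    """Return True if the URL's host is empty, localhost, or a non-public literal IP."""
    host = urlsplit(url).hostname or ""
    # "169.254.169.254." names the same host as "169.254.169.254".
    host = host.rstrip(".")
    if not host:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. A fuller check would also resolve the name
        # and test every resolved address; that step is omitted here.
        return host.lower() == "localhost"
    return ip.is_private or ip.is_loopback or ip.is_link_local


if __name__ == "__main__":
    for candidate in (
        "http://169.254.169.254./latest/meta-data",
        "http://10.0.0.5:30000",
        "http://8.8.8.8:30000",
    ):
        print(candidate, is_blocked_worker_host(candidate))
```

Whether blocking private addresses is appropriate depends on the deployment: a router that fronts workers on the same host or private network would need an explicit allowlist instead of a blanket block.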

gemini-code-assist · 2026-02-24T07:23:39Z

+import base64
+import os
+import signal
+import socket
+import subprocess
+import sys
+import time
+from pathlib import Path


Per PEP 8, imports should be at the top of the file. concurrent.futures is currently imported inside test methods (lines 544, 664, 686). It should be imported only once at the top level with other standard library modules. Please also remove the local imports from the test methods.

Suggested change

-import base64
-import os
-import signal
-import socket
-import subprocess
-import sys
-import time
-from pathlib import Path
+import base64
+import concurrent.futures
+import os
+import signal
+import socket
+import subprocess
+import sys
+import time
+from pathlib import Path

References

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

dreamyang-liu · 2026-02-25T08:36:12Z

+            await asyncio.gather(
+                *(self.refresh_worker_video_support(url) for url in unknown_workers),
+                return_exceptions=True,
+            )


I still feel like it shouldn't be a one-off check; we might want a dedicated loop task that checks it periodically. Also, putting this refresh here changes the semantics of the _start_background_health_check function.

Agree. Maybe we can open a separate PR to implement this, and keep this PR focused on decoupling the video support logic from cli/main.py.
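
The dedicated periodic task suggested above could look roughly like the sketch below. It is an assumption-laden illustration: `probe_video_support`, `refresh_unknown`, and `video_support_loop` are hypothetical names, and the probe is faked instead of calling a worker over HTTP. Only the `asyncio.gather(..., return_exceptions=True)` pattern is taken from the diff.

```python
import asyncio
from typing import Optional


async def probe_video_support(url: str) -> bool:
    """Hypothetical probe; a real one would ask the worker over HTTP."""
    await asyncio.sleep(0)
    if "unreachable" in url:
        raise ConnectionError(f"cannot reach {url}")
    return url.endswith("/video")


async def refresh_unknown(support: dict[str, Optional[bool]]) -> None:
    """Probe every worker whose video capability is still unknown (None)."""
    unknown = [url for url, known in support.items() if known is None]
    results = await asyncio.gather(
        *(probe_video_support(url) for url in unknown),
        return_exceptions=True,  # collect failures as values instead of raising
    )
    for url, result in zip(unknown, results):
        if not isinstance(result, BaseException):
            support[url] = result  # failed probes stay None and are retried


async def video_support_loop(
    support: dict[str, Optional[bool]], interval: float, stop: asyncio.Event
) -> None:
    """Periodic task kept separate from the health check loop."""
    while not stop.is_set():
        await refresh_unknown(support)
        try:
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass  # interval elapsed; probe again


async def demo() -> dict:
    support = {
        "http://w1/video": None,
        "http://w2/image": None,
        "http://unreachable": None,
    }
    stop = asyncio.Event()
    task = asyncio.create_task(video_support_loop(support, 0.01, stop))
    await asyncio.sleep(0.05)
    stop.set()
    await task
    return support
```

Keeping the probe in its own task leaves the health check loop with a single responsibility, and a worker that fails one probe is simply retried on the next interval.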

dreamyang-liu · 2026-02-25T08:36:16Z

@@ -0,0 +1,489 @@
+"""
+End-to-end tests with real sglang diffusion workers.


This line is not necessary. Could we describe the test logistics here instead?

dreamyang-liu · 2026-02-25T08:36:18Z

+_WEBP_RIFF_SIGNATURE = b"RIFF"
+_WEBP_FORMAT_MARKER = b"WEBP"
+
+


I feel like many of these functions and classes could reuse the ones in https://github.com/zhaochenyang20/sglang-diffusion-routing/blob/main/src/sglang_diffusion_routing/launcher/utils.py

Can we try to reduce the duplicated code?
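
For context, the two constants in the hunk above match the WebP container layout: the file starts with `RIFF`, followed by a 4-byte size, followed by `WEBP`. A helper built on them might look like the sketch below; `looks_like_webp` is a hypothetical name, not necessarily what the test file or `launcher/utils.py` defines.

```python
_WEBP_RIFF_SIGNATURE = b"RIFF"
_WEBP_FORMAT_MARKER = b"WEBP"


def looks_like_webp(data: bytes) -> bool:
    """Check the 12-byte RIFF/WEBP header; does not validate the rest of the file."""
    return (
        len(data) >= 12
        and data[:4] == _WEBP_RIFF_SIGNATURE   # bytes 0-3: RIFF container tag
        and data[8:12] == _WEBP_FORMAT_MARKER  # bytes 8-11: WEBP form type
    )
```

A single shared helper of this kind is the sort of code that could live in one module and be imported by both the launcher and the tests.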

dreamyang-liu · 2026-02-25T08:36:25Z

@@ -0,0 +1,167 @@
+#!/usr/bin/env python3
+"""
+Fake sglang diffusion worker for e2e testing.


The comment is a bit confusing. This file is under unit, but it is probably a CPU-only e2e test and should move to e2e.

It may also be worth discussing whether this fake worker is really necessary. Given that self.client in DiffusionRouter is the only interface between the router and the workers, can we hijack the client to mock different worker behaviors instead of implementing this worker?
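
The client-hijacking idea can be sketched with `unittest.mock.AsyncMock`. Everything here is hypothetical: `TinyRouter` is a heavily simplified stand-in for DiffusionRouter, and its status codes only mirror the 502 and 503 behavior described elsewhere in this PR.

```python
import asyncio
from unittest.mock import AsyncMock


class TinyRouter:
    """Hypothetical, simplified router; only `client` talks to workers."""

    def __init__(self, client, workers: list[str]) -> None:
        self.client = client
        self.workers = workers

    async def generate(self, payload: dict) -> tuple[int, dict]:
        if not self.workers:
            return 503, {"error": "no workers registered"}
        try:
            body = await self.client.post(f"{self.workers[0]}/generate", json=payload)
        except Exception as exc:
            # Any forwarding failure is reported as a bad gateway.
            return 502, {"error": str(exc)}
        return 200, body


def run_case(client, workers: list[str]) -> tuple[int, dict]:
    router = TinyRouter(client, workers)
    return asyncio.run(router.generate({"prompt": "a cat"}))


# Healthy worker: the mocked client returns a canned response.
healthy = AsyncMock()
healthy.post.return_value = {"image": "stub"}

# Failing worker: the mocked client raises when awaited.
failing = AsyncMock()
failing.post.side_effect = ConnectionError("worker down")
```

With this approach no server process or port is needed, although it exercises less of the real HTTP path than a fake worker does.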

Split the tests into tests/unit, tests/integration, and tests/e2e:

Suite	Directory	Scope	Hardware	Runs in CI
Unit	tests/unit	Function-level tests	CPU	Yes
Integration	tests/integration	End-to-end flow/link tests	CPU	Yes
E2E	tests/e2e	Real end-to-end tests	CPU + GPU	No

gemini-code-assist Bot reviewed Feb 24, 2026

alphabetc1 changed the title ~~fix: refresh video support bugfix & add e2e test~~ fix: prevent /generate 502 caused by event loop mismatch + add e2e tests Feb 24, 2026

alphabetc1 requested a review from zhaochenyang20 February 24, 2026 14:56

dreamyang-liu reviewed Feb 25, 2026

dreamyang-liu suggested changes Feb 25, 2026

alphabetc1 added 5 commits February 27, 2026 20:28

fix: refresh video support bugfix & add e2e test

1949643

lint

a8e2019

add test/intgration

532005b

fix lint

5b4a3cd

fix API

3904852

alphabetc1 force-pushed the ci/add_more_testcase branch from 99b0e38 to 3904852 February 27, 2026 13:43

reuse utils.py

2cd7a92

alphabetc1 requested a review from dreamyang-liu March 3, 2026 06:40

zhaochenyang20 merged commit 634aba8 into zhaochenyang20:main Mar 4, 2026
2 checks passed

alphabetc1 deleted the ci/add_more_testcase branch March 4, 2026 05:03

zhaochenyang20 merged 6 commits into zhaochenyang20:main from alphabetc1:ci/add_more_testcase
