[WIP] Realtime websocket API by CTKnight · Pull Request #282 · sgl-project/sglang-omni

CTKnight · 2026-04-13T06:11:06Z

Motivation

This PR adds a prototype realtime API for interactive Qwen3-Omni conversations. It addresses #59 by enabling low-latency multi-turn interaction with server-side webrtcvad, automatic turn commits, and response interruption when the user starts speaking again.
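To make "automatic turn commits" concrete, here is a minimal sketch of how per-frame VAD decisions could be grouped into committed turns. It is not the PR's implementation: `TurnSegmenter`, `run_segmenter`, the event names, and the thresholds are all hypothetical. The boolean per frame is assumed to come from a detector such as `webrtcvad.Vad(mode).is_speech(frame, sample_rate)`; the sketch takes that boolean as input so it runs with the standard library only.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Event = Tuple[str, bytes]  # (event name, audio payload); names are illustrative


@dataclass
class TurnSegmenter:
    """Groups per-frame speech decisions into committed turns (hypothetical)."""

    start_frames: int = 3   # consecutive speech frames needed to open a turn
    end_frames: int = 10    # consecutive silence frames needed to commit it
    _pending: List[bytes] = field(default_factory=list, init=False)
    _buffer: List[bytes] = field(default_factory=list, init=False)
    _silence_run: int = field(default=0, init=False)
    _in_turn: bool = field(default=False, init=False)

    def push(self, frame: bytes, is_speech: bool) -> List[Event]:
        events: List[Event] = []
        if not self._in_turn:
            if not is_speech:
                self._pending = []          # a short blip is discarded
                return events
            self._pending.append(frame)
            if len(self._pending) >= self.start_frames:
                # Enough consecutive speech: open a turn. A server could use
                # this event to interrupt an in-flight assistant response.
                self._in_turn = True
                self._buffer, self._pending = self._pending, []
                self._silence_run = 0
                events.append(("speech_started", b""))
            return events

        # Inside a turn: keep every frame, including trailing silence.
        self._buffer.append(frame)
        self._silence_run = 0 if is_speech else self._silence_run + 1
        if self._silence_run >= self.end_frames:
            events.append(("turn_committed", b"".join(self._buffer)))
            self._buffer = []
            self._in_turn = False
            self._silence_run = 0
        return events


def run_segmenter(decisions: Sequence[bool], **kwargs: int) -> List[Event]:
    """Feed one placeholder frame per decision and collect emitted events."""
    seg = TurnSegmenter(**kwargs)
    events: List[Event] = []
    for is_speech in decisions:
        frame = b"s" if is_speech else b"."   # stand-in for a PCM frame
        events.extend(seg.push(frame, is_speech))
    return events
```

Requiring several consecutive frames before opening or closing a turn is one common way to keep brief noise or short pauses from triggering spurious commits; the actual thresholds used in this PR are not stated here.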

Modifications

Adds a new /v1/realtime/ws endpoint plus the sglang_omni.realtime package for realtime session orchestration, backend abstractions, websocket event streaming, and streamed assistant audio output.
Implements realtime session behaviors for auto-VAD turns, manual push-to-talk, text-only turns in the same session, conversation history across turns, and barge-in / cancellation when a new utterance arrives while the assistant is responding.
Adds an OmniResponseBackend adapter on top of the existing omni client, plus a mock backend for browser smoke tests and frontend development.
Adds a standalone playground/realtime-ws frontend and launcher for microphone capture, text input, streamed assistant audio playback, and local mock-server testing.
Registers the realtime websocket route in the FastAPI server, adds the [realtime] extra (webrtcvad-wheels, websockets), and documents setup / usage in playground/README.md.
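The barge-in / cancellation behavior and the backend abstraction listed above can be sketched with plain asyncio. This is an illustration under stated assumptions, not the code in `sglang_omni.realtime`: `ResponseBackend`, `MockBackend`, `Session`, and the event strings are hypothetical names, and the real `OmniResponseBackend` interface may differ.

```python
import asyncio
from typing import AsyncIterator, List, Optional, Protocol


class ResponseBackend(Protocol):
    """Hypothetical backend interface: stream response chunks for one turn."""

    def stream(self, prompt: str) -> AsyncIterator[str]: ...


class MockBackend:
    """Stand-in backend that emits a fixed number of chunks with a delay."""

    def __init__(self, chunks: int = 5, delay: float = 0.02) -> None:
        self.chunks = chunks
        self.delay = delay

    async def stream(self, prompt: str) -> AsyncIterator[str]:
        for i in range(self.chunks):
            await asyncio.sleep(self.delay)
            yield f"{prompt}:{i}"


class Session:
    """Keeps conversation history and at most one in-flight response."""

    def __init__(self, backend: ResponseBackend) -> None:
        self.backend = backend
        self.history: List[str] = []
        self.events: List[str] = []
        self._task: Optional[asyncio.Task] = None

    async def _respond(self, prompt: str) -> None:
        try:
            async for chunk in self.backend.stream(prompt):
                self.events.append(f"delta {chunk}")
            self.events.append(f"done {prompt}")
        except asyncio.CancelledError:
            # Record the interruption, then let the cancellation propagate.
            self.events.append(f"cancelled {prompt}")
            raise

    async def commit_turn(self, prompt: str) -> None:
        # Barge-in: cancel any response still streaming before starting anew.
        if self._task is not None and not self._task.done():
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
        self.history.append(prompt)
        self._task = asyncio.create_task(self._respond(prompt))

    async def wait_idle(self) -> None:
        if self._task is not None:
            await self._task


async def demo() -> List[str]:
    session = Session(MockBackend(chunks=5, delay=0.02))
    await session.commit_turn("first")
    await asyncio.sleep(0.03)             # let part of the response stream
    await session.commit_turn("second")   # new utterance interrupts "first"
    await session.wait_idle()
    return session.events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Awaiting the cancelled task before starting the next one ensures the old response has fully stopped emitting events, so deltas from two turns never interleave.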

Related Issues

Addresses [Feature] Support Interactive Real-Time API #59

Accuracy Test

N/A. This PR adds transport, session orchestration, and playground code; it does not change model weights or kernel logic.

Benchmark & Profiling

Not included yet. This is a prototype realtime API / playground PR.

Validation

uv run pytest -q tests/test_realtime_audio_pipeline.py tests/test_realtime_backend_mock.py tests/test_realtime_backend_omni.py tests/test_realtime_media.py tests/test_realtime_session.py tests/test_realtime_utils.py tests/test_realtime_vad.py tests/test_realtime_ws_api.py tests/test_client_media_inputs.py
Result: 25 passed in 5.09s
Manual browser validation was also captured on the issue thread for Qwen3-Omni-30B-A3B-Instruct: [Feature] Support Interactive Real-Time API #59 (comment)

Screenshot:

Video:

Screen.Recording.sglang.ws.mov

Note: on macOS, the screen recording did not capture the returned assistant audio, but the UI's assistant audio level indicator shows that audio was playing.

Checklist

Format your code according to pre-commit.
Add unit tests.
Update documentation / docstrings / example tutorials as needed.
Provide throughput / latency benchmark results and accuracy evaluation results as needed. Not applicable for this prototype transport/playground PR.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.

FrankLeeeee · 2026-04-14T03:44:42Z

Can you include some description of your work?


PopSoda2002 · 2026-04-21T03:35:32Z

Hi @CTKnight, I am really interested in this feature. Could I ask about its progress and whether I could collaborate?

CTKnight added 16 commits April 11, 2026 21:37

Add WebRTC VAD realtime prototype

397a2dc

Add realtime WebRTC mock demo prototype

1ab2dfc

Fix realtime audio ingest and add diagnostics

576a3b1

Remove temporary audio dump checkpoints

fc74db9

Trim frontend audio debug UI

783a40e

Add auto VAD and push-to-talk input modes

4876745

Add server-side barge-in for realtime audio

8c02040

Add realtime text turns and conversation transcript

25d600b

use exsting script to run realtime demo

cffc114

ice server info gathering

a7e4206

public ip turn

e10cd59

websocket impl

83a4704

ws launcher backend entry

fadb5a3

fix websocket realtime audio payload

0b91b90

remove webrtc transport

2c0d427

fix realtime text delta normalization

e15a200

CTKnight changed the title from "Prototype/webrtc vad" to "[WIP] Realtime websocket API" on Apr 15, 2026

rycerzes and others added 8 commits April 17, 2026 10:24

fix stream abort handling resulting in orphaned request leaks

1f22ead

- tests for early consumer exit and normal completion

Merge pull request #1 from rycerzes/webrtc-vad

d41b7c8

Fix orphaned request leaks on stream aborts

Merge remote-tracking branch 'upstream/main' into prototype/webrtc-vad

f00a783

# Conflicts: # playground/README.md # pyproject.toml

Merge remote-tracking branch 'fork/prototype/webrtc-vad' into prototy…

734f5d2

…pe/webrtc-vad

Package realtime websocket deps in base install

b2cd6c0

Replay captured audio in mock realtime backend

5c4d504

Fix realtime speech branch completion

4d5387e

Revert "Fix realtime speech branch completion"

7f00ed1

This reverts commit 4d5387e.

CTKnight wants to merge 24 commits into sgl-project:main from CTKnight:prototype/webrtc-vad


4 participants
