[WIP][CI] Add CI for Ming Omni by edwingao28 · Pull Request #348 · sgl-project/sglang-omni

edwingao28 · 2026-04-25T00:19:42Z

Motivation

Closes #295. This PR builds the multi-stage Ming-flash-omni-2.0 CI suite mirroring the qwen3-omni structure and covers the following metrics: TTFA, RTF, WER, plus MMMU and MMSU accuracy

Modification

Workflow DAG

  docs ──► stage-1-thinker ──► stage-2-tts
                            ├─► stage-3-mmmu
                            └─► stage-4-mmsu

stage-1-thinker — text generation

Pure text path, thinker only. tests/test_model/test_ming_omni_thinker_length.py exercises Ming thinker across short / medium / long prompts and asserts response shape.

stage-2-tts — text-to-speech

RTF performance + non-stream TTFA baseline diagnostic + WER accuracy. Single job with two tests sharing one server (test_tts_non_streaming_perf + test_tts_wer).

tests/test_model/test_ming_omni_tts_ci.py uses voice_clone=False

stage-3-mmmu — image recognition

Image+text → text via thinker only. tests/test_model/test_ming_omni_mmmu_ci.py runs mmmu-ci-50 with warmup=2

stage-4-mmsu — audio-in understanding

Text+audio → text via MMSU benchmark with modalities="text+audio". This covers the #295 ASR requirement as a superset tests/test_model/test_ming_omni_mmsu_ci.py runs thinker-only (talker OFF).

Accuracy Test

Benchmark & Profiling

benchmark & ci threshold in progress

Checklist

Format your code according with pre-commit.
Add unit tests.
Update documentation / docstrings / example tutorials as needed.
Provide throughput / latency benchmark results and accuracy evaluation results as needed.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.

… default

edwingao28 added 4 commits April 24, 2026 16:51

[CI] Bootstrap Ming-Omni MMMU CI with omni-setup composite

bff508f

[CI] Add Ming-Omni docs / thinker / tts / mmmu / mmsu stages

4e915e3

[Benchmark] Add TTFA metrics and diagnostics to omni TTS framework

d1300a3

[Ming-Omni] Align launcher with Qwen3: TP, thinker length, cuda graph…

b2067ef

… default

edwingao28 requested review from FrankLeeeee and shuaills as code owners April 25, 2026 00:19

cleanup

801902d

edwingao28 force-pushed the feat/ming-mmmu-ci branch from 9daec7e to 801902d Compare April 25, 2026 00:21

edwingao28 mentioned this pull request Apr 25, 2026

[Feature] Ming-Omni ci test supporting #295

Open

2 tasks

edwingao28 added 5 commits April 24, 2026 17:54

[CI] Bound Ming MMMU decode length

b0399cd

[CI] Keep Ming MMSU output text-only

4cedc57

[CI] Format Ming launch defaults lint guard

d9112da

[CI] Double Ming docs startup timeout

9286f45

[CI] Double Ming stage startup timeouts

f1cd48a

edwingao28 changed the title ~~[WIP][Feat] Add CI for Ming Omni~~ [WIP][CI] Add CI for Ming Omni Apr 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP][CI] Add CI for Ming Omni#348

[WIP][CI] Add CI for Ming Omni#348
edwingao28 wants to merge 10 commits intosgl-project:mainfrom
edwingao28:feat/ming-mmmu-ci

edwingao28 commented Apr 25, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

edwingao28 commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Modification

Workflow DAG

stage-1-thinker — text generation

stage-2-tts — text-to-speech

stage-3-mmmu — image recognition

stage-4-mmsu — audio-in understanding

Accuracy Test

Benchmark & Profiling

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

edwingao28 commented Apr 25, 2026 •

edited

Loading