[model, feature] qwen3-omni: add packed sequence support and shared sequence utilities by hbhflw2000 · Pull Request #4304 · NVIDIA-NeMo/Megatron-Bridge

hbhflw2000 · 2026-06-11T11:19:03Z

What does this PR do?

Add Qwen3-Omni packed sequence training support and introduce shared raw sequence padding / packed-sequence metadata utilities for the Qwen3-Omni training path.

Changelog

Add Qwen3-Omni pack_sequences_in_batch=True forward-step support.
Preserve dense CP behavior by keeping raw input_ids available for model-internal mRoPE while slicing train tensors on CP ranks.
Add shared raw-batch sequence padding helpers in training/utils/padding_utils.py.
Add shared uniform PackedSeqParams construction in training/utils/packed_seq_utils.py.
Follow the existing Qwen3-VL packed-padding pattern without changing Qwen3-VL code in this PR.
Add unit coverage for Qwen3-Omni packed sequence / CP behavior and shared sequence utilities.
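The dense CP item above can be pictured with a small sketch. It assumes the load-balanced split commonly used for context parallelism in Megatron-style code, where each sequence is cut into 2 * cp_size chunks and rank r takes chunks r and 2 * cp_size - 1 - r. Whether the Qwen3-Omni step uses exactly this layout is an assumption, and `slice_for_cp_rank` is a hypothetical name, not a function from this PR.

```python
from typing import List, Sequence


def slice_for_cp_rank(seq: Sequence[int], cp_size: int, cp_rank: int) -> List[int]:
    """Return the two load-balanced chunks of `seq` owned by `cp_rank`.

    Hypothetical helper: assumes len(seq) is divisible by 2 * cp_size.
    """
    num_chunks = 2 * cp_size
    if len(seq) % num_chunks != 0:
        raise ValueError("sequence length must be divisible by 2 * cp_size")
    chunk = len(seq) // num_chunks
    front = cp_rank                     # chunk taken from the start of the sequence
    back = num_chunks - 1 - cp_rank     # mirrored chunk taken from the end
    return (list(seq[front * chunk:(front + 1) * chunk])
            + list(seq[back * chunk:(back + 1) * chunk]))


# Raw input_ids stay whole for model-internal mRoPE; only train tensors are sliced.
raw_input_ids = list(range(100, 108))   # full sequence, kept as-is on every CP rank
labels = list(range(8))                 # stand-in for a train tensor
labels_rank0 = slice_for_cp_rank(labels, cp_size=2, cp_rank=0)
labels_rank1 = slice_for_cp_rank(labels, cp_size=2, cp_rank=1)
```

With cp_size=2 the eight positions split into four chunks of two, so rank 0 holds positions 0, 1, 6, 7 and rank 1 holds 2, 3, 4, 5, while `raw_input_ids` is left at full length.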

Design note / RFC

This implementation follows the existing Qwen3-VL packed-padding pattern: pad raw batch sequence tensors to an aligned dense length, build uniform THD PackedSeqParams, and keep model-specific multimodal / mRoPE handling inside the Qwen3-Omni step and model code.
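As a rough illustration of "aligned dense length" and "uniform THD" metadata, the sketch below rounds the sequence length up to a multiple and builds cumulative sequence offsets in which every sample has the same padded length. `UniformPackedMeta`, `round_up`, and `build_uniform_meta` are hypothetical stand-ins; the real `PackedSeqParams` object and the alignment multiple used by the PR are not shown in this description and are assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UniformPackedMeta:
    """Illustrative stand-in for the fields a THD PackedSeqParams would carry."""
    qkv_format: str
    cu_seqlens: List[int]
    max_seqlen: int


def round_up(length: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= `length`."""
    return ((length + multiple - 1) // multiple) * multiple


def build_uniform_meta(batch_size: int, seq_len: int, multiple: int) -> UniformPackedMeta:
    """Every sample gets the same padded length, so offsets are evenly spaced."""
    padded = round_up(seq_len, multiple)
    cu_seqlens = [i * padded for i in range(batch_size + 1)]
    return UniformPackedMeta(qkv_format="thd", cu_seqlens=cu_seqlens, max_seqlen=padded)


# Assumed alignment multiple of 4, chosen only for the example.
meta = build_uniform_meta(batch_size=2, seq_len=10, multiple=4)
```

Here a length of 10 is padded to 12, giving offsets 0, 12, 24 for a batch of two. Because the lengths are uniform, the metadata depends only on the batch size and the padded length, not on per-sample content.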

This PR intentionally does not reuse slice_batch_for_context_parallel for Qwen3-Omni raw-batch padding. That utility operates after embedding preparation and slices inputs_embeds, while Qwen3-Omni needs pre-forward raw sequence normalization so the full input_ids tensor remains available for multimodal placeholder handling and mRoPE.

The shared abstraction here is intentionally narrow: compute the padded target sequence length, pad/truncate common raw batch tensors, and construct uniform THD PackedSeqParams. Model-specific logic such as multimodal merge, CP rank slicing, and mRoPE handling remains in Qwen3-Omni code.
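The "pad/truncate common raw batch tensors" step might look roughly like the following, using plain Python lists in place of tensors. The key names and pad values (`0` for `input_ids` and `loss_mask`, `-100` for `labels`) are assumptions for illustration, and `pad_or_truncate` / `normalize_batch` are hypothetical names rather than the helpers in training/utils/padding_utils.py.

```python
from typing import Dict, List

# Assumed pad values per key; the PR's actual choices are not shown in the description.
PAD_VALUES: Dict[str, int] = {"input_ids": 0, "labels": -100, "loss_mask": 0}


def pad_or_truncate(rows: List[List[int]], target_len: int, pad_value: int) -> List[List[int]]:
    """Cut each row to target_len, then right-pad short rows with pad_value."""
    out = []
    for row in rows:
        fixed = list(row[:target_len])
        fixed.extend([pad_value] * (target_len - len(fixed)))
        out.append(fixed)
    return out


def normalize_batch(batch: Dict[str, List[List[int]]], target_len: int) -> Dict[str, List[List[int]]]:
    """Apply pad_or_truncate to known sequence keys; pass other keys through unchanged."""
    result = {}
    for key, rows in batch.items():
        if key in PAD_VALUES:
            result[key] = pad_or_truncate(rows, target_len, PAD_VALUES[key])
        else:
            result[key] = rows
    return result


batch = {"input_ids": [[5, 6, 7]], "labels": [[6, 7, 8]], "loss_mask": [[1, 1, 1]]}
padded = normalize_batch(batch, target_len=4)
truncated = normalize_batch(batch, target_len=2)
```

Keeping this step free of any model-specific logic matches the narrow scope described above: multimodal merge, CP rank slicing, and mRoPE handling would happen elsewhere.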

Note: Qwen3-VL code is intentionally left unchanged in this PR. Applying these helpers back to Qwen3-VL can be considered separately with Qwen3-VL-specific regression coverage.

Validation

Unit tests:

pytest tests/unit_tests/training/utils/test_padding_utils.py tests/unit_tests/training/utils/test_packed_seq_utils.py
# 16 passed

pytest tests/unit_tests/models/qwen_omni/test_qwen3_omni_step.py tests/unit_tests/models/qwen_omni/modeling_qwen3_omni/test_omni_model.py
# 27 passed

E2E validation:
4-node / 32-GPU Qwen3-Omni packed sequence full-model training passed:
Parallel config: TP=2, PP=2, CP=2, EP=4, SP=True.
Training config: seq_length=16384, global_batch_size=16, micro_batch_size=2, train_iters=200.
Result: completed 200 steps with finite loss, stable grad norm, and stable throughput.
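For readers checking the configuration above, the arithmetic below derives the implied data-parallel size and the number of micro-batches per step. It assumes the usual Megatron-style convention that data-parallel size is world size divided by TP * PP * CP, and that expert parallelism reuses ranks from the data-parallel dimension rather than multiplying the GPU count; neither assumption is stated in the PR.

```python
world_size = 32                      # 4 nodes / 32 GPUs, as reported above
tp, pp, cp = 2, 2, 2
seq_length = 16384
global_batch_size, micro_batch_size = 16, 2

# Assumed convention: DP = world_size / (TP * PP * CP); EP is not part of the product.
dp = world_size // (tp * pp * cp)

# Micro-batches accumulated per optimizer step on each data-parallel replica.
num_microbatches = global_batch_size // (micro_batch_size * dp)

# Tokens of each sequence held by one CP rank, before any alignment padding.
tokens_per_cp_rank = seq_length // cp
```

Under these assumptions the run uses 4 data-parallel replicas, 2 micro-batches per step, and 8192 tokens of each sequence per CP rank.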

copy-pr-bot · 2026-06-11T11:19:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: hbhflw2000 <417911774@qq.com>

github-actions Bot added the community-request label Jun 11, 2026

hbhflw2000 added 2 commits June 11, 2026 19:23

[model] Add Qwen3-Omni packed sequence utilities

f01e7d3

Signed-off-by: hbhflw2000 <417911774@qq.com>

[model] Keep Qwen3-VL packed path unchanged

042eed1

Signed-off-by: hbhflw2000 <417911774@qq.com>

hbhflw2000 force-pushed the pr4_omni3_packseq_sequence_utils branch from f0f95d8 to 042eed1 on June 11, 2026 11:27

yaoyu-33 added the area:model, feature, and needs-review labels Jun 11, 2026

hbhflw2000 wants to merge 2 commits into NVIDIA-NeMo:main from hbhflw2000:pr4_omni3_packseq_sequence_utils