[test] test: add model contract matrix #4376

Open
yaoyu-33 wants to merge 2 commits into main from yuya/mb-544-model-contract-matrix

Conversation

yaoyu-33 commented Jun 15, 2026

Contributor

Summary

Starts the MB-544 P0 TODO #1 implementation by adding a shared, config-only AutoBridge/provider contract matrix for a small slice of high-risk models.

Covered cases:

  • Qwen3-ASR nested thinker/text/audio config path, including export reconstruction of thinker_config.audio_config.d_model, which covers the NVBug 6314636 class of failure. The test also checks the run-config serialization hook of the thinker config.
  • Step35 exact HF architecture registration with the published 45 main hidden layers plus 3 extra MTP layer types. The matrix now verifies that Step35Config splits main/MTP layer types and that the bridge restores the provider layer-type list and MoE layer frequency for the 45-layer decoder.
  • MiMo-V2-Flash exact HF architecture registration and config-only provider construction for dual full-attention/SWA query-group fields, v_head_dim, window_size, and mtp_num_layers=0. This does not cover FP8 dequant, weight conversion, MTP weight mapping, or runtime execution.
  • NemotronLabsDiffusion exact HF architecture registration and nested text_config provider construction for hidden/FFN size, layer count, vocab, tied embeddings, rotary_base, and retained hf_config. This does not cover diffusion runtime, conversion/export roundtrip, or the lazy config monkeypatch path.
  • Nemotron-VL nested llm_config provider construction for the cheap high-risk VLM onboarding path: hidden/FFN size, attention/query groups, vocab/sequence length, vocab divisor, and provider flags.
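
For the Step35 case, the main/MTP split can be pictured with a minimal sketch. The helper name `split_layer_types` and the layer-type labels below are hypothetical and chosen only for illustration; the actual Step35Config logic is not shown in this PR description. Only the 45 + 3 layer counts come from the summary above.

```python
from typing import List, Sequence, Tuple


def split_layer_types(
    layer_types: Sequence[str], num_main_layers: int
) -> Tuple[List[str], List[str]]:
    """Split a flat layer-type list into (main, mtp) parts.

    Hypothetical helper: the first `num_main_layers` entries are treated as
    main decoder layers, and any remaining entries as extra MTP layers.
    """
    if len(layer_types) < num_main_layers:
        raise ValueError(
            f"expected at least {num_main_layers} layer types, got {len(layer_types)}"
        )
    main = list(layer_types[:num_main_layers])
    mtp = list(layer_types[num_main_layers:])
    return main, mtp


# Illustrative pattern only; the real per-layer labels are an assumption here.
all_layer_types = [
    "full_attention" if i % 4 == 0 else "sliding_attention" for i in range(48)
]
main_types, mtp_types = split_layer_types(all_layer_types, num_main_layers=45)

# The contract being checked: 45 main entries, 3 MTP entries, nothing lost.
assert len(main_types) == 45
assert len(mtp_types) == 3
assert main_types + mtp_types == all_layer_types
```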

The test stays unit-only and does not download weights or instantiate distributed models. It exercises AutoBridge support, registered bridge resolution, and to_megatron_provider(load_weights=False) provider construction. New cases should be onboarded by adding a tiny HF config factory, real architecture string, bridge/provider symbols, provider-field assertions, and a focused extra assertion documenting the exact boundary.
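
The onboarding recipe above could be organized roughly as follows. This is a sketch under stated assumptions, not the structure of the actual test file: `ContractCase`, `check_case`, `stub_build_provider`, and the `ToyForCausalLM` architecture string are all hypothetical, and the stub stands in for the real AutoBridge provider construction so the example runs without Megatron Bridge installed.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Callable, Dict, Tuple


@dataclass
class ContractCase:
    """One row of a config-only contract matrix (hypothetical layout)."""

    name: str
    architecture: str  # architecture string the tiny config must declare
    make_config: Callable[[], Dict[str, Any]]  # tiny config factory
    expected_provider_fields: Dict[str, Any] = field(default_factory=dict)


def check_case(
    case: ContractCase, build_provider: Callable[[Dict[str, Any]], Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (expected, actual)} for every provider-field mismatch."""
    config = case.make_config()
    assert config["architectures"] == [case.architecture]
    provider = build_provider(config)
    mismatches = {}
    for name, expected in case.expected_provider_fields.items():
        actual = getattr(provider, name, None)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


def stub_build_provider(config: Dict[str, Any]) -> SimpleNamespace:
    """Stand-in for real provider construction (assumption, not the real API)."""
    return SimpleNamespace(
        hidden_size=config["hidden_size"],
        num_layers=config["num_hidden_layers"],
    )


def make_toy_config() -> Dict[str, Any]:
    # Tiny config: no weights, no downloads.
    return {
        "architectures": ["ToyForCausalLM"],
        "hidden_size": 64,
        "num_hidden_layers": 2,
    }


CASES = [
    ContractCase(
        name="toy",
        architecture="ToyForCausalLM",
        make_config=make_toy_config,
        expected_provider_fields={"hidden_size": 64, "num_layers": 2},
    ),
]

for case in CASES:
    mismatches = check_case(case, stub_build_provider)
    assert not mismatches, f"{case.name}: {mismatches}"
```

Keeping each case as data makes a new model a one-entry addition, and reporting mismatches as a dictionary shows every failing field at once instead of stopping at the first.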

Validation

Passed:

  • ruff check tests/unit_tests/models/test_model_contract_matrix.py src/megatron/bridge/models/conversion/utils.py src/megatron/bridge/models/qwen3_asr/hf_qwen3_asr/configuration_qwen3_asr.py
  • ruff format --check tests/unit_tests/models/test_model_contract_matrix.py src/megatron/bridge/models/conversion/utils.py src/megatron/bridge/models/qwen3_asr/hf_qwen3_asr/configuration_qwen3_asr.py
  • git diff --check
  • /usr/bin/python3 -m py_compile tests/unit_tests/models/test_model_contract_matrix.py src/megatron/bridge/models/conversion/utils.py src/megatron/bridge/models/qwen3_asr/hf_qwen3_asr/configuration_qwen3_asr.py
  • Remote dependency-complete container: uv sync --group test && uv run --no-sync python -m pytest tests/unit_tests/models/test_model_contract_matrix.py -q -> 5 passed, 28 warnings
  • Remote dependency-complete container: uv sync --group dev && uv run --no-sync pre-commit run --all-files -> all hooks passed

Regression proof:

  • Qwen3-ASR reproduction probe: old non-recursive config reconstruction produces audio_config.d_model=1280; patched recursive reconstruction preserves audio_config.d_model=1024.
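
A toy reproduction of this symptom, using plain dictionaries rather than the real config classes, might look like the sketch below. The functions `rebuild_shallow` and `rebuild_recursive` and the dictionary layout are assumptions made for illustration; the actual mechanism in conversion/utils.py is not shown in this description. Only the 1280 and 1024 values come from the probe above.

```python
import copy
from typing import Any, Dict

# Assumed defaults: the nested audio section defaults to d_model=1280.
DEFAULTS = {"thinker_config": {"audio_config": {"d_model": 1280}}}

# Assumed saved config: the model actually uses d_model=1024.
SAVED = {
    "model_type": "toy_asr",
    "thinker_config": {"audio_config": {"d_model": 1024}},
}


def rebuild_shallow(saved: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Non-recursive rebuild: only top-level scalars are restored.

    Nested sections are skipped, so they silently keep their default values.
    """
    out = copy.deepcopy(defaults)
    for key, value in saved.items():
        if not isinstance(value, dict):
            out[key] = value
    return out


def rebuild_recursive(saved: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Recursive rebuild: nested sections are restored level by level."""
    out = copy.deepcopy(defaults)
    for key, value in saved.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = rebuild_recursive(value, out[key])
        else:
            out[key] = copy.deepcopy(value)
    return out


shallow = rebuild_shallow(SAVED, DEFAULTS)
recursive = rebuild_recursive(SAVED, DEFAULTS)

# The shallow path falls back to the default; the recursive path keeps 1024.
assert shallow["thinker_config"]["audio_config"]["d_model"] == 1280
assert recursive["thinker_config"]["audio_config"]["d_model"] == 1024
```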

Local workstation note:

  • Local uv run ... remains blocked before test execution because the platform cannot install the pinned nvidia-resiliency-ext==0.6.0 wheel. The focused pytest and all-files pre-commit checks above were run in the dependency-complete remote container instead.

Remaining MB-544 P0 follow-ups

  • Expand the shared contract matrix with more cheap model/config cases where support is clear: Nemotron text variants, supported Gemma MoE/tokenizer-config paths, and diffusion FLUX/WAN where config-only provider construction is representative.
  • Add the separate P0 example-smoke manifest.
  • Add the separate P0 conversion/export roundtrip matrix for runtime workflows.
  • Keep unsupported or environment-only cases out of this unit matrix; document them as release-container validation or docs-not-a-test items instead.

Linear: MB-544

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
yaoyu-33 added labels Jun 15, 2026:

  • ci: CI, automation, test queue, or workflow infrastructure work
  • area:model: Model implementations and HF bridge logic
  • needs-review: PR is ready for code review and waiting on a reviewer
  • area:diffusion: DFM module

copy-pr-bot Bot commented Jun 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

claude Bot commented Jun 15, 2026

Contributor

Light Code Review - LGTM. All referenced symbols exist. No critical bugs. Minor: test accesses bridge._model_bridge (private attr). Suggested test cases: No perf tests impacted.

claude Bot commented Jun 15, 2026

Contributor

Detailed notes: (1) bridge._model_bridge (line 261) is a private attribute - if renamed, this test breaks. Consider exposing the resolved bridge type via a public property on AutoBridge. (2) _make_mimo_v2_flash_config() and _make_nemotron_labs_diffusion_config() use bare PretrainedConfig rather than model-specific config classes. This works because AutoBridge.supports() only checks the architectures field, but does not exercise any custom config validation. Acceptable for a contract test but worth documenting in the follow-ups.
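
Suggestion (1) could take a shape like the sketch below. `ToyAutoBridge`, `ToyModelBridge`, and the property name `model_bridge_type` are hypothetical stand-ins; this is not the real AutoBridge class or an existing attribute on it.

```python
class ToyModelBridge:
    """Stand-in for a resolved, model-specific bridge (hypothetical)."""


class ToyAutoBridge:
    """Stand-in showing a read-only public view of a private attribute."""

    def __init__(self, model_bridge: object) -> None:
        self._model_bridge = model_bridge  # private, free to be renamed later

    @property
    def model_bridge_type(self) -> type:
        """Public accessor so tests need not touch `_model_bridge` directly."""
        return type(self._model_bridge)


bridge = ToyAutoBridge(ToyModelBridge())
assert bridge.model_bridge_type is ToyModelBridge
```

With an accessor of this kind, a test can assert on the resolved bridge type without depending on the private attribute's name.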

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
@yaoyu-33

Contributor Author

/ok to test 73bd639
