feat: add extract_submodel parameter to build_encoder_backbone#1838

Draft
oliverholworthy wants to merge 2 commits into main from oholworthy/extract_submodel

Conversation


@oliverholworthy oliverholworthy commented Apr 14, 2026

What does this PR do?

Add an extract_submodel parameter to build_encoder_backbone for generic VLM text-backbone extraction via a dotted attribute path, removing the need for model-specific extraction code such as _from_vlm_checkpoint.

Changelog

  • Add extract_submodel: str | None parameter to build_encoder_backbone() in the generic (non-SUPPORTED_BACKBONES) code path
  • When set, walks the dotted attribute path to extract a submodel after loading (e.g. "language_model" extracts the text backbone from a VLM)
  • Different VLM families use different attribute names (.language_model, .text_model, .model.language_model) — the dotted-path approach handles this via config rather than per-architecture code
  • Add tests/unit_tests/_transformers/test_retrieval.py — new test module for build_encoder_backbone
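The dotted-path walk described in the changelog can be sketched as follows. The helper name and the dummy objects are illustrative only, not the PR's actual code, but the mechanism (one getattr per path segment) is what the description outlines:

```python
from types import SimpleNamespace

def extract_by_path(model, path):
    """Resolve a dotted attribute path like "model.language_model"
    against a loaded checkpoint, one getattr per segment."""
    submodel = model
    for attr in path.split("."):
        submodel = getattr(submodel, attr)
    return submodel

# Dummy VLM-shaped object standing in for a loaded checkpoint.
text_backbone = SimpleNamespace(config={"hidden_size": 64})
vlm = SimpleNamespace(model=SimpleNamespace(language_model=text_backbone))

assert extract_by_path(vlm, "model.language_model") is text_backbone
```

Because the path is plain data, supporting a new VLM family with a different attribute layout is a config change, not a code change.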

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?

Additional Information

The parameter flows through the existing YAML config system via **kwargs passthrough:

model:
  _target_: nemo_automodel.NeMoAutoModelBiEncoder.from_pretrained
  pretrained_model_name_or_path: mistralai/Ministral-3-3B-Base-2512
  extract_submodel: language_model
  pooling: avg
  l2_normalize: true

extract_submodel is a named parameter on build_encoder_backbone, so it is consumed there and not forwarded to HF's from_pretrained.
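The consumption pattern can be illustrated with a small sketch. The signature and the stub loader below are hypothetical stand-ins, not the PR's actual implementation; the point is that a named keyword parameter never lands in **kwargs, so it cannot leak into the HF passthrough:

```python
from types import SimpleNamespace

def _load_pretrained(name, **hf_kwargs):
    """Stand-in for HF from_pretrained, recording the kwargs it receives."""
    text = SimpleNamespace(config={"name": name})
    return SimpleNamespace(language_model=text, received_kwargs=hf_kwargs)

def build_encoder_backbone(name, *, extract_submodel=None, **kwargs):
    # extract_submodel is a named keyword, so it is consumed here and
    # never appears in **kwargs (the passthrough to the loader).
    model = _load_pretrained(name, **kwargs)
    if extract_submodel is not None:
        for attr in extract_submodel.split("."):
            model = getattr(model, attr)
    return model

full = build_encoder_backbone("tiny", torch_dtype="bfloat16")
assert full.received_kwargs == {"torch_dtype": "bfloat16"}

backbone = build_encoder_backbone("tiny", extract_submodel="language_model")
assert backbone.config == {"name": "tiny"}
```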

Companion to #1837 (is_causal refactor). These two PRs are independent and can be reviewed/merged in any order.

@copy-pr-bot

copy-pr-bot bot commented Apr 14, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@oliverholworthy oliverholworthy force-pushed the oholworthy/extract_submodel branch 2 times, most recently from ff5428d to e965d38 Compare April 15, 2026 09:59
Add a dotted attribute path parameter for extracting a submodel from a
loaded checkpoint. This enables generic VLM text backbone extraction
without model-specific code:

  build_encoder_backbone("mistralai/Ministral-3-3B-Base-2512",
                         task="embedding",
                         extract_submodel="language_model")

Different VLM families use different attribute names (.language_model,
.text_model, .model.language_model) — the dotted-path approach handles
this via config rather than per-architecture code.

Validates that the extracted submodel has a .config attribute (i.e. is a
PreTrainedModel), raising ValueError if not. Includes round-trip tests
with both a tiny Mistral3 VLM config and the real Ministral-3-3B weights.

Signed-off-by: Oliver Holworthy <1216955+oliverholworthy@users.noreply.github.com>
@oliverholworthy oliverholworthy force-pushed the oholworthy/extract_submodel branch from e965d38 to bba8298 Compare April 15, 2026 14:53
@akoumpa

akoumpa commented Apr 19, 2026

/ok to test b44785f
