
Conversation

@yaoyu-33
Contributor

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Add specific line-by-line info on the high-level changes in this PR.

GitHub Actions CI

See the CI section in the Contributing doc for how to trigger the CI. An NVIDIA developer will need to approve and trigger the CI for external contributors.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

If you haven't finished some of the above items, you can still open a "Draft" PR.

Additional Information

  • Related to # (issue)

@copy-pr-bot

copy-pr-bot bot commented Dec 18, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@HollowMan6
Contributor

HollowMan6 commented Dec 18, 2025

Update: all of the issues below are now fixed.

Update: Only linear_proj and linear_fc2 are correctly mapped with the current version for LoRA:

[screenshot]

It looks like the fused weights (e.g., fc1, qkv) in LoRA are still not mapped to hf_name correctly; they still use the Megatron naming:

[screenshot]
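For illustration, a minimal sketch of the fused-to-canonical name splitting being discussed here; the mapping table and helper are hypothetical, not Megatron-Bridge's actual API:

FUSED_SPLITS = {
    # Hypothetical mapping: fused Megatron projections -> per-projection HF names.
    "linear_qkv": ["q_proj", "k_proj", "v_proj"],
    "linear_fc1": ["gate_proj", "up_proj"],
}

def split_fused_adapter_name(megatron_name: str) -> list[str]:
    # Map a fused Megatron LoRA adapter parameter name to one HF-style name per
    # sub-projection; unfused names (e.g. linear_proj, linear_fc2) pass through.
    for fused, parts in FUSED_SPLITS.items():
        if fused in megatron_name:
            return [megatron_name.replace(fused, part) for part in parts]
    return [megatron_name]

print(split_fused_adapter_name("decoder.layers.0.self_attention.linear_qkv.adapter.linear_in.weight"))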

Meanwhile, although Canonical LoRA produces the right names, it doesn't seem to be working correctly; I will investigate this further.

Co-authored-by: ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 <[email protected]>
Signed-off-by: Yu Yao <[email protected]>
@yaoyu-33
Contributor Author

@HollowMan6 yes, I think I messed up the name conversion for the fused base names a bit. Let me try to fix it.

HollowMan6 added a commit to HollowMan6/Megatron-Bridge that referenced this pull request Dec 21, 2025
@HollowMan6
Contributor

HollowMan6 commented Dec 21, 2025

@yaoyu-33 I've opened PR #1788, targeting the bridge/peft_bridge_1 branch, to fix the expert-layers case; feel free to merge that one or integrate it manually into this PR.

Convergence looks good on dense models for RL with verl: the gray curve is Canonical LoRA with the bridge, the blue curve is normal LoRA with the bridge, and the yellow curve is the LoRA merge.

[convergence plot]

The convergence tests for MoE (qwen3-30b-a3b):

[convergence plot]

if isinstance(adapter, ModuleDict):
    adapter_name = local_param_name.removeprefix(local_base_prefix + ".adapter.").split(".")[0]
    adapter = adapter[adapter_name]
input_is_parallel, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)
@HollowMan6
Contributor

HollowMan6 commented Dec 23, 2025


Note: This will need to be updated after #1800 is merged

Suggested change
- input_is_parallel, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)
+ input_is_parallel, _, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)
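Purely as an aside: if this call site ever needs to work both before and after #1800, and assuming the change only inserts one extra element into the returned tuple, it could unpack by position from the ends instead of hard-coding the arity. A minimal sketch with a hypothetical helper:

def unpack_adapter_attrs(attrs):
    # Hypothetical helper (not part of the bridge): return
    # (input_is_parallel, base_linear_is_parallel) from either the pre- or
    # post-#1800 tuple shape, assuming only the middle elements differ.
    return attrs[0], attrs[-1]

# Works for both tuple shapes:
assert unpack_adapter_attrs((True, 1, 2, 3, False)) == (True, False)
assert unpack_adapter_attrs((True, 1, 2, 3, 4, False)) == (True, False)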

Comment on lines +437 to +438
assert weights[0].param_name.endswith(".linear_in.weight")
assert weights[1].param_name.endswith(".linear_out.weight")

To fix the test cases:

Suggested change
- assert weights[0].param_name.endswith(".linear_in.weight")
- assert weights[1].param_name.endswith(".linear_out.weight")
+ assert weights[0].param_name.endswith("lora_A.weight")
+ assert weights[1].param_name.endswith("lora_B.weight")
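For context, the new suffixes follow the HF PEFT-style lora_A/lora_B convention instead of Megatron's linear_in/linear_out adapter names. A purely illustrative pairing (the full names below are hypothetical; only the suffixes matter to the assertions):

# Hypothetical names, for illustration only.
megatron_name = "decoder.layers.0.mlp.linear_fc2.adapter.linear_in.weight"
hf_name = "model.layers.0.mlp.down_proj.lora_A.weight"
assert hf_name.endswith("lora_A.weight")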

"materialize_adapter_weights",
lambda *_: [adapter_weight],
)


To fix the test cases:

Suggested change
+ # Provide a base HF weight name so stream_adapter_weights_megatron_to_hf can
+ # translate it into lora_A/lora_B names.
+ monkeypatch.setattr(
+     bridge,
+     "_get_base_hf_weight_names_for_adapter",
+     lambda *_args, **_kwargs: ["model.layers.0.mlp.linear_fc1.weight"],
+ )


weights = list(
    bridge.stream_adapter_weights_megatron_to_hf(
        [SimpleNamespace(config=SimpleNamespace())],

To fix the test cases:

Suggested change
- [SimpleNamespace(config=SimpleNamespace())],
+ [SimpleNamespace(config=SimpleNamespace(num_moe_experts=0))],


weights = list(
    bridge.stream_adapter_weights_megatron_to_hf(
        [SimpleNamespace(config=SimpleNamespace())],

To fix the test cases:

Suggested change
- [SimpleNamespace(config=SimpleNamespace())],
+ [SimpleNamespace(config=SimpleNamespace(num_moe_experts=0))],

HollowMan6 added a commit to HollowMan6/Megatron-Bridge that referenced this pull request Dec 29, 2025
HollowMan6 and others added 3 commits December 29, 2025 15:31
# Conflicts:
#	src/megatron/bridge/models/conversion/model_bridge.py
#	tests/unit_tests/models/test_model_bridge_lora.py
)
from megatron.bridge.peft.canonical_lora import ModuleDict
from megatron.bridge.peft.lora import LoRAMerge
from megatron.bridge.peft.utils import get_adapter_attributes_from_linear, is_expert_linear

Suggested change
- from megatron.bridge.peft.utils import get_adapter_attributes_from_linear, is_expert_linear
+ from megatron.bridge.peft.utils import ParallelLinearAdapter, get_adapter_attributes_from_linear, is_expert_linear

if isinstance(adapter, ModuleDict):
    adapter_name = local_param_name.removeprefix(local_base_prefix + ".adapter.").split(".")[0]
    adapter = adapter[adapter_name]
input_is_parallel, _, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)

For ParallelLinearAdapter, base_linear_is_parallel can be different from the base layer (e.g. for linear_kv_down_proj).

Suggested change
- input_is_parallel, _, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)
+ if isinstance(adapter, ParallelLinearAdapter):
+     input_is_parallel = adapter.input_is_parallel
+     base_linear_is_parallel = True
+ else:
+     input_is_parallel, _, _, _, _, base_linear_is_parallel = get_adapter_attributes_from_linear(to_wrap)
