mirror: feat: Add EXAONE 4.0 model bridge (LG AI Research) by ko3n1g · Pull Request #4298 · NVIDIA-NeMo/Megatron-Bridge

ko3n1g · 2026-06-11T06:36:22Z

Claude summary

Mirror of #2532 by @Bias92 — copied into the upstream repo so the full CI pipeline runs natively (cross-fork PRs cannot trigger it).

Original PR: feat: Add EXAONE 4.0 model bridge (LG AI Research) #2532
Author: @Bias92
Source: Bias92/Megatron-Bridge:feat/exaone4-bridge

Commits are copied verbatim with authorship preserved. Review and discussion remain on #2532.

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Signed-off-by: 김재우 <pewpewplay315@gmail.com>

Move duplicated TERowParallelLinearLayerNorm class into models/common/te_layers.py and update Gemma2, Gemma3, and EXAONE imports. No behavior change on the normal no-bias path; adds a defensive assertion for deferred bias.

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Drop two fields from Exaone4ModelProvider that duplicate parent defaults:
- share_embeddings_and_output_weights (parent: True)
- rotary_percent (parent: 1.0)
Per reviewer feedback on PR #2532.

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Signed-off-by: 노란토끼 <83907395+Bias92@users.noreply.github.com>

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Drop the EXAONE-specific GPTModelProvider subclasses and configure the plain GPTModelProvider returned by the base bridge instead. This follows the Qwen3 provider_bridge pattern and keeps only the custom EXAONE layer spec in the provider module.

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Signed-off-by: adityavavreNVDA <avavre@nvidia.com>

Signed-off-by: Chen Cui <chcui@nvidia.com>

copy-pr-bot · 2026-06-11T06:36:25Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

claude · 2026-06-11T06:38:19Z

    "MiniMaxM2Bridge",
    "OlMoEBridge",
    "OlMoEModelProvider",
+    "Exaone4Bridge",


Bug: Exaone4Bridge is listed twice in __all__ — once at line 197 (correct alphabetical position) and again here. The duplicate should be removed.

Suggested change

-    "Exaone4Bridge",
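A duplicate entry like this one can be caught mechanically. The following is a minimal sketch, not part of the PR: `find_duplicate_exports` is a hypothetical helper, and `example_all` is an illustrative stand-in for the package's `__all__` rather than the real list.

```python
from collections import Counter


def find_duplicate_exports(names):
    """Return every name that occurs more than once, in first-seen order."""
    counts = Counter(names)
    # dict.fromkeys keeps one copy of each name and preserves ordering.
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


# Illustrative stand-in for __all__ (assumed contents, not the real module).
example_all = [
    "Exaone4Bridge",
    "MiniMaxM2Bridge",
    "OlMoEBridge",
    "OlMoEModelProvider",
    "Exaone4Bridge",
]

print(find_duplicate_exports(example_all))
```

A check of this kind could live in a unit test that imports the package and asserts the returned list is empty.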

claude · 2026-06-11T06:38:22Z

@@ -0,0 +1,4 @@
+from megatron.bridge.models.exaone.exaone4_bridge import Exaone4Bridge


Missing NVIDIA copyright header. Per project rules, all new Python files under src/ must include the Apache 2.0 copyright header (tests are exempt).

Suggested change

-from megatron.bridge.models.exaone.exaone4_bridge import Exaone4Bridge
+# Copyright (c) 2025, NVIDIA CORPORATION.  All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from megatron.bridge.models.exaone.exaone4_bridge import Exaone4Bridge
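For illustration, a header check along these lines could flag files like this one before review. This is a rough sketch under stated assumptions: `has_license_header`, `REQUIRED_MARKERS`, and the 20-line window are hypothetical choices, and the project's actual lint rule may work differently.

```python
# Phrases assumed to identify the header; taken from the suggested text above.
REQUIRED_MARKERS = (
    "NVIDIA CORPORATION",
    "Licensed under the Apache License, Version 2.0",
)


def has_license_header(source, max_lines=20):
    """Return True if every marker appears in the first max_lines lines."""
    head = "\n".join(source.splitlines()[:max_lines])
    return all(marker in head for marker in REQUIRED_MARKERS)


without_header = (
    "from megatron.bridge.models.exaone.exaone4_bridge import Exaone4Bridge\n"
)
with_header = (
    "# Copyright (c) 2025, NVIDIA CORPORATION.  All rights reserved.\n"
    "#\n"
    '# Licensed under the Apache License, Version 2.0 (the "License");\n'
    + without_header
)

print(has_license_header(without_header), has_license_header(with_header))
```

Limiting the search to the top of the file avoids accepting a file that merely mentions the license somewhere in its body.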

claude · 2026-06-11T06:38:26Z

+            output_size,
+            config=config,
+            **kwargs,
+        )
+        self.post_layernorm = TENorm(config, output_size)


Nit: This assert is new — the original Gemma2/Gemma3 implementations did not have it. While the check is correct in practice (all current callers set add_bias_linear=False), note that assert is stripped by python -O. If this is meant as a safety invariant for future models adopting this class, consider using if bias is not None: raise ValueError(...) instead.
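The difference between the two guards can be demonstrated without Transformer Engine. The sketch below is illustrative only: `check_with_assert` and `check_with_raise` are hypothetical stand-ins for the guard, and `compile(..., optimize=1)` is used to reproduce what python -O does to assert statements.

```python
# Two hypothetical versions of the guard, kept as source so they can be
# compiled at different optimization levels.
SOURCE = '''
def check_with_assert(bias):
    assert bias is None, "deferred bias is not supported"
    return "ok"

def check_with_raise(bias):
    if bias is not None:
        raise ValueError("deferred bias is not supported")
    return "ok"
'''


def load_checks(optimize):
    """Compile SOURCE at the given level (0 keeps asserts, 1 strips them)."""
    namespace = {}
    exec(compile(SOURCE, "<bias_checks>", "exec", optimize=optimize), namespace)
    return namespace["check_with_assert"], namespace["check_with_raise"]


def outcome(check, bias):
    """Return the check's result, or the name of the exception it raised."""
    try:
        return check(bias)
    except (AssertionError, ValueError) as exc:
        return type(exc).__name__


assert_debug, raise_debug = load_checks(optimize=0)
assert_opt, raise_opt = load_checks(optimize=1)

unexpected_bias = 1.0  # any non-None value stands in for a deferred bias
print(outcome(assert_debug, unexpected_bias))  # AssertionError
print(outcome(assert_opt, unexpected_bias))    # ok: the guard was stripped
print(outcome(raise_debug, unexpected_bias))   # ValueError
print(outcome(raise_opt, unexpected_bias))     # ValueError
```

Only the explicit raise survives optimization, which is why it is the safer choice for an invariant that future callers must not violate.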

claude · 2026-06-11T06:41:08Z

Review - EXAONE 4.0 Bridge

Clean new-model bridge following the established LLM pattern (Qwen2/Gemma2). The TERowParallelLinearLayerNorm refactor into models/common/te_layers.py is a nice dedup.

Items:

(1) Bug: Duplicate Exaone4Bridge in __all__ (src/megatron/bridge/models/__init__.py): appears at line 197 (correct alphabetical spot) and again at line 226 (out of order). The second entry should be removed.

(2) Bug: Missing copyright header (src/megatron/bridge/models/exaone/__init__.py): New source files under src/ require the NVIDIA Apache 2.0 header.

(3) Observation: The refactored TERowParallelLinearLayerNorm adds a new assert bias is None guard absent from the original Gemma2/Gemma3 implementations. Safe in practice, but assert is stripped by python -O. Consider raise ValueError.

(4) Missing: No functional conversion roundtrip test (test_exaone4_conversion.py): toy-model HF-to-Megatron roundtrip GPU test missing.

(5) Missing: No unit test for the te_layers.py module, which is now shared by Gemma2, Gemma3, and EXAONE.

Suggested test cases: No perf tests impacted.

ko3n1g · 2026-06-11T06:46:15Z

/ok to test 97ab5b2

Bias92 and others added 14 commits February 26, 2026 04:17

feat: Add EXAONE 4.0 model bridge for HF-Megatron conversion

fc798ba

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

fix: Address review comments on EXAONE 4.0 bridge

f64cc70

Signed-off-by: 김재우 <pewpewplay315@gmail.com>

Merge branch 'main' into feat/exaone4-bridge

ee52a73

Merge branch 'main' into feat/exaone4-bridge

b677ed2

docs: add EXAONE 4.0 example scripts

6e29bac

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

test: add EXAONE 4.0 bridge coverage

ff5a512

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

style: clean up EXAONE review follow-up

2394d35

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Merge branch 'main' into feat/exaone4-bridge

319a6eb

Signed-off-by: 노란토끼 <83907395+Bias92@users.noreply.github.com>

style: fix EXAONE review cleanup whitespace

d5823a4

Signed-off-by: Bias92 <pewpewplay315@gmail.com>

Fixing lint issues

c496ff4

Signed-off-by: adityavavreNVDA <avavre@nvidia.com>

Merge branch 'main' into feat/exaone4-bridge

97ab5b2

Signed-off-by: Chen Cui <chcui@nvidia.com>

claude Bot reviewed Jun 11, 2026


copy-pr-bot Bot temporarily deployed to public June 11, 2026 06:46 Inactive

copy-pr-bot Bot temporarily deployed to test June 11, 2026 06:47 Inactive

copy-pr-bot Bot temporarily deployed to public June 11, 2026 06:55 Inactive

copy-pr-bot Bot temporarily deployed to public June 11, 2026 06:56 Inactive

yaoyu-33 added labels on Jun 11, 2026:
- area:model (Model implementations and HF bridge logic)
- community-request
- feature (New capabilities, enhancements, or enablement work)
- needs-review (PR is ready for code review and waiting on a reviewer)

copy-pr-bot Bot temporarily deployed to public June 11, 2026 07:17 Inactive

ko3n1g wants to merge 14 commits into main from ko3n1g/mirror/pr-2532


5 participants

