[moe] Add residual bottleneck variant for 10T gate experiments by claude[bot] · Pull Request #4061 · marin-community/marin

claude · 2026-03-23T21:57:44Z

Create experiments/grug/moe_resid_bottleneck/ variant from MoE base, adding per-layer learnable residual scaling (init 1.0, applied before each block) and per-head zero-init sigmoid attention gates (gate = 2 * sigmoid(W @ x[:12])). Both features are config-toggled via use_residual_lambdas and use_attention_gates flags. Existing grug variant contract tests auto-discover and validate the new variant.
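
For readers without the diff open, here is a minimal JAX sketch of the two mechanisms described above. It is not the code in this PR. The parameter layout, the helper names (`init_params`, `attention_gate`, `gated_block`), the tensor shapes, and the exact placement of the residual scale relative to the skip connection are all assumptions made for illustration. Only the flag names, the init values, and the gate formula come from the description.

```python
import jax
import jax.numpy as jnp

GATE_INPUT_DIM = 12  # from the description: the gate reads x[:12]


def init_params(n_layers, n_heads):
    """Hypothetical parameter layout; names are illustrative only."""
    return {
        # One learnable residual scale per layer, initialized to 1.0.
        "residual_lambdas": jnp.ones((n_layers,)),
        # Zero-initialized gate projection per layer: (n_layers, n_heads, 12).
        "gate_w": jnp.zeros((n_layers, n_heads, GATE_INPUT_DIM)),
    }


def attention_gate(gate_w, x):
    """gate = 2 * sigmoid(W @ x[:12]), one scalar per position and head.

    x: (seq, d_model), gate_w: (n_heads, 12), returns (seq, n_heads).
    """
    return 2.0 * jax.nn.sigmoid(x[..., :GATE_INPUT_DIM] @ gate_w.T)


def gated_block(params, layer, x, attn_fn,
                use_residual_lambdas=True, use_attention_gates=True):
    """Apply one block; attn_fn(x) must return (seq, n_heads, head_dim)."""
    if use_residual_lambdas:
        # Assumption: the scale multiplies the residual stream before the block.
        x = params["residual_lambdas"][layer] * x
    heads = attn_fn(x)
    if use_attention_gates:
        # Broadcast the per-head gate over head_dim.
        heads = heads * attention_gate(params["gate_w"][layer], x)[..., None]
    return x + heads.reshape(x.shape)


# Usage with a stand-in for real attention (a plain reshape).
params = init_params(n_layers=2, n_heads=4)
x = jax.random.normal(jax.random.PRNGKey(0), (8, 32))
attn_fn = lambda h: h.reshape(8, 4, 8)
gates = attention_gate(params["gate_w"][0], x)
y_on = gated_block(params, 0, x, attn_fn)
y_off = gated_block(params, 0, x, attn_fn, False, False)
```

Under these assumptions, both features are identity at initialization: the residual scale is 1.0, and a zero-initialized `W` gives `2 * sigmoid(0) = 1` for every head, so `y_on` and `y_off` match until training moves the new parameters.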

Fixes #4035

…tion gates

Create experiments/grug/moe_resid_bottleneck/ variant from MoE base, adding per-layer learnable residual scaling (init 1.0) and per-head zero-init sigmoid attention gates. Both features are config-toggled. For the Great 10T gate residual bottleneck experiments.

Fixes #4035

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-03-23T22:22:46Z

🤖 Grug variant diff report

| New Variant | Closest Existing Variant | Distance Score | Diff |
|---|---|---|---|
| `moe_resid_bottleneck` | `moe` | 123 | Open |

Artifact fallback: Download report bundle

github-actions · 2026-04-16T01:53:13Z

This pull request has been inactive for 23 days and is marked as stale.
If there is no further activity within 7 days, it will be automatically closed.
If you believe this PR should remain open, please add a comment or update the PR.

claude Bot added the agent-generated label (Created by automation/agent) Mar 23, 2026

claude Bot mentioned this pull request Mar 23, 2026

[moe] Great 10T: residual bottleneck experiments #4035

Open

This was referenced Mar 30, 2026

Modeling April 2026 #4266

Closed

MoE Scaling up to April goal #4281

Open

github-actions Bot added the stale label Apr 16, 2026

claude[bot] wants to merge 1 commit into main from agent/20260323-fix-4035