Olmoearth dynamic token skipping #511

Draft
Hgherzog wants to merge 2 commits into main from cursor/olmoearth-dynamic-token-skipping-0ec9

Conversation

@Hgherzog
Collaborator

Implement NOBLE (Nonlinear Low-Rank Branches) to accelerate transformer pretraining, based on arXiv:2603.06492.


cursoragent and others added 2 commits March 13, 2026 23:00
Implements NOBLE from arXiv:2603.06492, which augments transformer linear layers with nonlinear low-rank branches that can accelerate training by up to 1.47x.

Key changes:
- Add noble.py with CosNet activation, NobleBranch, NobleLinear, and NobleConfig
- Update attention.py to support optional NOBLE branches on Q/K/V/proj and MLP layers
- Update flexi_vit.py to pass noble_config through FlexiVitBase, Encoder, Predictor
- Add EncoderConfig.noble_config and PredictorConfig.noble_config
- Add scripts/official/noble.py experiment script for NOBLE training
- Add comprehensive unit tests in tests/unit/nn/test_noble.py

NOBLE computes: output = xW + σ(xW_down)W_up
where σ is CosNet (two-layer cosine nonlinearity with learnable frequency/phase).
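The forward pass above can be sketched in NumPy. Note this is a minimal illustration, not the code in noble.py: the exact CosNet parameterization (how frequency/phase enter the two cosine layers) and the branch initialization are assumptions, not taken from the paper or this PR.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosnet(z, w1=1.0, p1=0.0, w2=1.0, p2=0.0):
    # Hypothetical CosNet: two stacked cosines with learnable
    # frequency (w*) and phase (p*); the paper's exact form may differ.
    return np.cos(w2 * np.cos(w1 * z + p1) + p2)

class NobleLinear:
    """output = x W + sigma(x W_down) W_up, with rank r << min(d_in, d_out)."""
    def __init__(self, d_in, d_out, rank):
        self.W = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)
        self.W_down = rng.standard_normal((d_in, rank)) / np.sqrt(d_in)
        # Assumption: W_up starts at zero so the branch is a no-op at
        # init and the layer exactly matches the base linear layer.
        self.W_up = np.zeros((rank, d_out))

    def __call__(self, x):
        return x @ self.W + cosnet(x @ self.W_down) @ self.W_up

layer = NobleLinear(16, 32, rank=4)
x = rng.standard_normal((8, 16))
y = layer(x)
print(y.shape)  # (8, 32)
```

With a small rank, the branch adds roughly `rank * (d_in + d_out)` parameters per layer, consistent with the ~4% parameter overhead reported below.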

The paper reports:
- Up to 1.47x step speedup to reach baseline eval loss
- ~4% additional parameters (at scale) with 7% step time overhead
- Up to 1.22x net wallclock speedup

Note: the paper found that Mixup/CutMix interferes with NOBLE's benefits.

Co-authored-by: henryh <henryh@allenai.org>
