
Rank max learning rate #505

Draft
Hgherzog wants to merge 5 commits into main from
cursor/rank-max-learning-rate-9827

Conversation

@Hgherzog
Collaborator

Add rank_max_lr to enable per-rank learning rates for linear probe evaluation, providing a free learning rate sweep during in-loop evaluations.



- Each rank uses a different LR from a log-spaced range (2 orders of magnitude)
- After evaluation, all_reduce MAX to get best score across ranks
- Auto-disables if world_size < 2 or > 20
- New flag: DownstreamTaskConfig.rank_max_lr (default: False)

Co-authored-by: henryh <henryh@allenai.org>
@cursor

cursor bot commented Feb 25, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

cursoragent and others added 2 commits February 25, 2026 20:01
- LRs: [1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1]
- Auto-disable if world_size != 8 (must match number of sweep LRs)

Co-authored-by: henryh <henryh@allenai.org>
RANK_MAX_LRS = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1]


def get_rank_lr(rank: int, world_size: int) -> float | None:
    # Auto-disable (return None) unless world_size matches the sweep size,
    # per the commit message above.
    if world_size != len(RANK_MAX_LRS):
        return None
    return RANK_MAX_LRS[rank]
Collaborator

@gabrieltseng Mar 5, 2026
world_size is not necessary for this function

select_final_test_miou_based_on_epoch_of_max_val_miou: bool = False,
n_bootstrap: int = 0,
bootstrap_seed: int = 42,
rank_max_lr: bool = False,
Collaborator

i feel like this should be true by default?
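The config fields quoted in the diff above could be sketched as a dataclass like the following. This is a simplified illustration showing only the fields visible in the diff; the real `DownstreamTaskConfig` presumably has many more fields and may not be a plain dataclass.

```python
from dataclasses import dataclass


@dataclass
class DownstreamTaskConfig:
    # Only the fields quoted in the diff; illustrative, not the full class.
    select_final_test_miou_based_on_epoch_of_max_val_miou: bool = False
    n_bootstrap: int = 0
    bootstrap_seed: int = 42
    rank_max_lr: bool = False  # the review above asks whether True is a better default
```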

