hparams/update bootstrap version #645
Conversation
Walkthrough
Updated hyperparameters in hparams/hparams.json.
Sequence Diagram(s)
```mermaid
sequenceDiagram
    autonumber
    participant Trainer as Trainer
    participant HP as hparams/hparams.json
    participant Scheduler as AdamW Scheduler
    note right of HP #D3E4CD: Config values\n(outer_learning_rate,\ncheckpoint_init_*, flatten params)
    Trainer->>HP: load hyperparameters
    Trainer->>Scheduler: init with hyperparams
    Scheduler-->>Trainer: scheduler ready
    alt training step < flatten_start_step
        Trainer->>Scheduler: step(n) [normal lr schedule]
    else training step within flatten range
        Trainer->>Scheduler: step(n) [apply flattening: duration=1000]
    else after flatten range
        Trainer->>Scheduler: step(n) [resume normal schedule]
    end
    Note over Scheduler: checkpoint_init_version/window are\nused during checkpoint init logic (external to scheduler)
```
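To make the flattening behavior in the diagram concrete, here is a minimal sketch of what a plateau-aware LR schedule could look like. This is illustrative only, not the repository's scheduler: the cosine baseline, the function name `lr_at_step`, and the resume-with-shift behavior after the plateau are all assumptions.

```python
import math

def lr_at_step(step, base_lr, total_steps, flatten_start_step, flatten_duration):
    """Illustrative plateau-aware schedule (assumed cosine baseline).

    Before flatten_start_step the normal schedule applies; during the
    plateau the LR is frozen at its flatten_start_step value; afterwards
    the schedule resumes as if the plateau steps had not advanced it.
    """
    def cosine(s):
        return base_lr * 0.5 * (1 + math.cos(math.pi * min(s, total_steps) / total_steps))

    if flatten_start_step is None or flatten_duration <= 0:
        return cosine(step)  # flattening disabled (e.g. muon: null / 0)
    if step < flatten_start_step:
        return cosine(step)  # normal schedule before the plateau
    if step < flatten_start_step + flatten_duration:
        return cosine(flatten_start_step)  # hold LR flat during the plateau
    return cosine(step - flatten_duration)  # resume, shifted by the plateau length
```

With this PR's values (flatten_start_step=2650, flatten_duration=1000), such a schedule would hold the LR constant over steps 2650-3649 and resume decay at step 3650.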
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning) | ✅ Passed checks (1 passed)
Codecov Report
✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##              dev     #645   +/-   ##
=======================================
  Coverage   57.91%   57.91%
=======================================
  Files          27       27
  Lines        4890     4890
=======================================
  Hits         2832     2832
  Misses       2058     2058
```
Update checkpoint initialization to version 2.1.12 and window 59057 to bootstrap from the latest stable checkpoint. Configure LR flattening to start at step 2650 with a 1000-step duration for improved training stability.

- Update checkpoint_init_version from 2.1.9 to 2.1.12
- Update checkpoint_init_window from 58181 to 59057
- Set flatten_start_step to 2650
- Set flatten_duration to 1000
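For reference, the changed values would look roughly like this side by side (shown as Python dicts for comparison; the nesting of the flatten keys under the adamw scheduler section is inferred from the review comments below, not verified against the file):

```python
# hparams.json values touched by this PR (before -> after); layout assumed.
before = {
    "outer_learning_rate": 0.7,
    "checkpoint_init_version": "2.1.9",
    "checkpoint_init_window": 58181,
}
after = {
    "outer_learning_rate": 0.6,
    "checkpoint_init_version": "2.1.12",
    "checkpoint_init_window": 59057,
    "adamw": {
        "flatten_start_step": 2650,  # step where the LR plateau begins
        "flatten_duration": 1000,    # plateau length in steps
    },
}
```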
Force-pushed from f4b8c63 to e7b6d97.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
hparams/hparams.json (1)
1-114: PR description incomplete; it does not capture all hyperparameter changes.
The PR objectives state only "Update bootstrap version to 2.1.12 (hparams). Bump run version.", but the actual changes include four additional hyperparameter modifications:
- outer_learning_rate reduction
- checkpoint_init_window advancement
- Two new adamw scheduler flattening parameters
This discrepancy raises concerns about whether all changes were intentional or if the PR description was incomplete. Update the PR description to fully document the rationale for each change, particularly the learning rate adjustment and scheduler tuning.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- hparams/hparams.json (3 hunks)
- src/tplr/__init__.py (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/tplr/__init__.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
🔇 Additional comments (2)
hparams/hparams.json (2)
46-47: Verify checkpoint initialization version and window alignment.
The `checkpoint_init_version` update from "2.1.9" to "2.1.12" aligns with the PR objective, but the simultaneous advancement of `checkpoint_init_window` from 58181 to 59057 is not documented. Confirm (a hypothetical key-resolution sketch follows the list):
- Whether this window advance is a necessary migration step alongside the version bump
- Whether existing checkpoints at version 2.1.12 and window 59057 exist and are compatible with the current model state
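As context for this check, bootstrap loaders of this kind usually resolve a storage key from the version and window pair. A hypothetical sketch (the key layout and function name are assumptions, not this repository's checkpoint code):

```python
def resolve_checkpoint_key(version: str, window: int) -> str:
    # Hypothetical key layout; the real loader may organize paths differently.
    return f"checkpoints/v{version}/window-{window}/model.pt"

# With this PR's values:
print(resolve_checkpoint_key("2.1.12", 59057))
# checkpoints/v2.1.12/window-59057/model.pt
```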
66-67: Verify scheduler flattening parameters are tested and appropriate.
The adamw scheduler now includes `flatten_start_step: 2650` and `flatten_duration: 1000`. These are new parameters that fundamentally alter optimizer behavior by flattening the learning rate schedule starting at step 2650 for 1000 steps. Confirm (a test sketch follows the list):
- These values have been tuned/validated for this model configuration
- The flattening behavior is appropriate for training stability and convergence
- muon scheduler remains unaffected (flatten_start_step: null, flatten_duration: 0) for a reason
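One way to address both points is a pair of unit tests: one pinning the plateau semantics, one asserting that the muon-style `null`/`0` config is a no-op. A sketch against the illustrative `lr_at_step` helper from earlier (test names and values are assumptions, not the repo's suite):

```python
import math

def test_lr_plateau_holds_then_resumes():
    kw = dict(base_lr=0.6, total_steps=10_000,
              flatten_start_step=2650, flatten_duration=1000)
    at_start = lr_at_step(2650, **kw)
    # LR stays flat for the full 1000-step plateau (steps 2650..3649)...
    assert all(math.isclose(lr_at_step(s, **kw), at_start)
               for s in range(2650, 3650))
    # ...and the first post-plateau step picks up where the plateau began.
    assert math.isclose(lr_at_step(3650, **kw), at_start)

def test_null_start_disables_flattening():
    # muon-style config: flatten_start_step=null should disable flattening.
    for step in (0, 2650, 3000, 5000):
        disabled = lr_at_step(step, 0.6, 10_000, None, 0)
        baseline = lr_at_step(step, 0.6, 10_000, 10_001, 1000)  # plateau beyond tested steps
        assert math.isclose(disabled, baseline)
```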
| "batch_size": 192, | ||
| "inner_steps": 30, | ||
| "outer_learning_rate": 0.7, | ||
| "outer_learning_rate": 0.6, |
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Undocumented learning rate reduction—verify this is intentional.
The outer learning rate has been reduced by ~14% (0.7 → 0.6) but is not mentioned in the PR description. This is a significant hyperparameter change that affects training dynamics. Confirm this change was intentional and has been validated.
🏁 Script executed:

```bash
#!/bin/bash
# Check the current state and git diff for hparams/hparams.json
echo "=== Current hparams/hparams.json (lines 8-12) ==="
sed -n '8,12p' hparams/hparams.json
echo -e "\n=== Full git diff for hparams/hparams.json ==="
git diff HEAD^ hparams/hparams.json 2>/dev/null || git diff hparams/hparams.json 2>/dev/null || echo "Git diff unavailable"
echo -e "\n=== Commit message ==="
git log -1 --pretty=format:"%B" 2>/dev/null || echo "Commit message unavailable"
echo -e "\n=== Search for learning rate mentions in documentation ==="
find . -type f \( -name "*.md" -o -name "*.txt" -o -name "README*" \) -exec grep -l "learning.rate\|outer_learning_rate" {} \; 2>/dev/null | head -5
```

Length of output: 378
🏁 Script executed:

```bash
#!/bin/bash
# Search for any documentation about hyperparameter changes or tuning rationale
echo "=== Search for learning rate documentation in codebase ==="
rg -i "outer_learning_rate|learning.rate.*0\.6|learning.rate.*0\.7" -A 2 -B 2 --type=md --type=txt 2>/dev/null | head -30
echo -e "\n=== Check if there's a CHANGELOG or similar ==="
find . -type f \( -name "CHANGELOG*" -o -name "HISTORY*" -o -name "CHANGES*" \) -exec head -50 {} \;
```

Length of output: 390
🏁 Script executed:

```bash
#!/bin/bash
# Check git history for outer_learning_rate value changes
echo "=== Git log for hparams/hparams.json (last 10 commits) ==="
git log --oneline -10 -- hparams/hparams.json 2>/dev/null || echo "No history available"
echo -e "\n=== Check outer_learning_rate value in previous commits ==="
git show HEAD~1:hparams/hparams.json 2>/dev/null | sed -n '8,12p' || echo "Previous version unavailable"
echo -e "\n=== Check if there are any staged or unstaged changes ==="
git status hparams/hparams.json 2>/dev/null || echo "Status unavailable"
echo -e "\n=== Show diff with specific context for lines around outer_learning_rate ==="
git diff --no-index /dev/null hparams/hparams.json 2>/dev/null | grep -A 3 -B 3 "outer_learning_rate" | head -20
```

Length of output: 1181
🏁 Script executed:

```bash
#!/bin/bash
# Find the merge base and compare
echo "=== Current commit info ==="
git log -1 --oneline HEAD
echo -e "\n=== Find merge base with main/master ==="
git merge-base HEAD origin/main 2>/dev/null || git merge-base HEAD origin/master 2>/dev/null || echo "Merge base not found"
echo -e "\n=== Diff between merge base and current HEAD for hparams.json ==="
MERGE_BASE=$(git merge-base HEAD origin/main 2>/dev/null || git merge-base HEAD origin/master 2>/dev/null)
if [ ! -z "$MERGE_BASE" ]; then
  git diff $MERGE_BASE HEAD -- hparams/hparams.json | head -100
else
  echo "Cannot determine merge base"
fi
echo -e "\n=== Check all changes in current PR ==="
git log --oneline $(git merge-base HEAD origin/main 2>/dev/null || git merge-base HEAD origin/master 2>/dev/null)..HEAD 2>/dev/null | head -20
```

Length of output: 2913
Learning rate reduction is intentional but undocumented; update the commit message or PR description to explain all hyperparameter changes.
The reduction from 0.7 to 0.6 is confirmed and is part of an intentional feature (feat/plateu lr) with corresponding test coverage and scheduler updates (the new `flatten_start_step` and `flatten_duration` parameters). However, the commit message "Bump run version" documents none of these changes. Update the commit message or PR description to explicitly list all hyperparameter modifications and their rationale, including the learning rate reduction, the new scheduler flattening behavior, the checkpoint version bump, and the BMA threshold changes.
🤖 Prompt for AI Agents

```
In hparams/hparams.json around line 10, the commit message "Bump run version"
fails to document intentional hyperparameter changes; update the commit message
or the PR description to explicitly list each change and rationale: note the
outer_learning_rate reduction from 0.7 to 0.6, the addition of the scheduler
parameters flatten_start_step and flatten_duration with brief explanation of
their plateau/flattening behavior, the checkpoint version bump, and the BMA
threshold adjustments; keep the entry concise, reference affected files/lines,
and include why each change was made and any impact on tests or runtime.
```