
Conversation

@joellidin
Collaborator

  • (validator) Tighten sync score penalty curve
  • (validator) Defer penalty for negative gradient
  • (tests) Add 5th-ranked penalty skip logic tests
  • (neurons) Remove duplicate LR scaling in gradient
  • (neurons) Add inner LR flatten window feature
  • (tests) Add LR flatten support to mock fixtures
  • (tests) Add comprehensive LR flattening tests
  • (tests) Fix tests after LR scaling removal
  • (comms) Add timeouts to large file downloads
  • (validator) Add BMA threshold with warmup period
  • (hparams) Update bootstrap version to 2.1.12
  • Bump run version

Description

Related Issue(s)

  • Closes #[issue number]

Type of Change

  • Feature (adding new functionality)
  • Fix (resolving a bug or issue)
  • Docs (documentation updates)
  • Refactor (code changes that don't affect functionality)
  • Maintenance (dependency updates or other maintenance)
  • Tests (adding or improving tests)
  • Breaking change (fix or feature with incompatible API changes)
  • Other: _____

Branch Naming

  • My branch follows the project's naming convention (e.g., feature/add-new-capability)

Commit Messages

  • My commits are small, atomic, and have proper commit messages
  • Commit messages are in imperative mood with a capitalized summary under 50 chars

Code Quality

  • I've performed a self-review of my code
  • I've added appropriate docstrings following the project's conventions
  • I've added proper logging where necessary (without trailing periods)
  • I've applied linting and formatting with Ruff
  • My code generates no new warnings

Testing

  • I've added tests for new functionality or bug fixes
  • All tests pass locally with my changes
  • Test coverage has not decreased

Documentation

  • I've updated documentation to reflect my changes
  • I've updated comments in hard-to-understand areas

If this is a breaking change

Screenshots/Examples

Additional Notes

Commit Details

(validator) Tighten sync score penalty curve

Reduce the sync score tolerance from 5 steps to 3 steps behind, creating
a steeper penalty curve that encourages better miner synchronization
across the network.

- Update sync_score formula cap from 5.0 to 3.0
- Adjust sync_max_steps_behind threshold from 3 to 2 in hparams
- Update formula comment to reflect new calculation
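
For illustration, a minimal sketch of what the capped, steeper curve could look like (the function name and exact shape here are assumptions; the real formula lives in the validator):

```python
# Hypothetical sketch only -- the actual sync_score computation may differ.
def sync_score(steps_behind: float) -> float:
    """1.0 for fully synced peers, decaying to 0.0 at the (now lower) cap."""
    capped = min(steps_behind, 3.0)      # cap reduced from 5.0 to 3.0
    return max(0.0, 1.0 - capped / 3.0)  # steeper penalty per step behind
```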
(validator) Defer penalty for negative gradient

Refactor negative evaluation penalty logic to apply slashing and
exclusion AFTER all evaluations complete, rather than inline during the
evaluation loop. This ensures consistent treatment based on the full
window of evaluated UIDs.

Add should_skip_negative_penalty() to skip penalties when the 5th-ranked
UID in the current window has a negative gradient score, indicating
overall poor performance across the network.

- Add should_skip_negative_penalty() method to check 5th-ranked UID

- Refactor track_negative_evaluation() to only track history and
  consecutive counts, removing inline penalty application

- Add apply_negative_evaluation_penalties() to apply all penalties after
  evaluations complete with consistent skip logic

- Update main evaluation loop to call penalty application after all
  evaluations finish

Previously, penalties were applied as each UID was evaluated, causing
inconsistent behavior where early UIDs saw incomplete window data. Now
penalties are applied consistently when the full picture of gradient
scores is available.
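
A minimal sketch of the two-step flow described above; the method names mirror the commit message, but the signatures, the ranking details, and the slash_and_exclude helper are illustrative assumptions:

```python
def should_skip_negative_penalty(window_scores: dict[int, float]) -> bool:
    """Skip penalties when the 5th-ranked UID in the window has a negative score."""
    if len(window_scores) < 5:
        return False
    ranked = sorted(window_scores.values(), reverse=True)
    return ranked[4] < 0  # 5th-ranked negative => network-wide degradation


def slash_and_exclude(uid: int) -> None:
    """Hypothetical stand-in for the real slashing/exclusion side effects."""
    ...


def apply_negative_evaluation_penalties(window_scores, negative_uids) -> None:
    """Apply slashing/exclusion only after every UID in the window is scored."""
    if should_skip_negative_penalty(window_scores):
        return
    for uid in negative_uids:
        slash_and_exclude(uid)
```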
(tests) Add 5th-ranked penalty skip logic tests

Add comprehensive test coverage for new 5th-ranked UID penalty skipping
logic. Tests verify that slashing and exclusion penalties are correctly
skipped when the 5th-ranked UID has a negative score (indicating overall
poor network performance).

- Add 12 new test cases for should_skip_negative_penalty logic
- Test various scenarios: <5 UIDs, exactly 5, >5 UIDs
- Test edge cases: no window scores, all positive scores
- Update existing test to use two-step penalty architecture
- Verify slashing skipped when 5th-ranked is negative
- Verify exclusion skipped when 5th-ranked is negative
- All 21 tests pass
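
Illustrative pytest cases in the spirit of the new suite (test names and expected outcomes are assumptions, and should_skip_negative_penalty is assumed importable from the validator module):

```python
def test_skip_when_fifth_ranked_uid_is_negative():
    scores = {1: 0.9, 2: 0.5, 3: 0.2, 4: 0.1, 5: -0.3, 6: -0.8}
    assert should_skip_negative_penalty(scores) is True


def test_penalties_apply_with_fewer_than_five_uids():
    scores = {1: 0.9, 2: -0.5, 3: -0.2}
    assert should_skip_negative_penalty(scores) is False
```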
(neurons) Remove duplicate LR scaling in gradient

Remove learning rate scaling from error_feedback.add_() operation to
prevent applying LR twice. The learning rate is already applied in the
outer step, so scaling grad_full again here was incorrect.

- Remove unused lr variable from prepare_gradient_dict
- Change error_feedback.add_(grad_full, alpha=lr) to use default
  alpha=1.0 by removing the alpha parameter
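
In essence (simplified; the surrounding prepare_gradient_dict code is omitted):

```python
# Before: the inner LR was effectively applied twice -- here and in the outer step.
# error_feedback.add_(grad_full, alpha=lr)

# After: accumulate the raw gradient; the outer step applies the LR exactly once.
error_feedback.add_(grad_full)
```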
(neurons) Add inner LR flatten window feature

Add ability to freeze inner learning rate at its current value for a
specified number of outer steps (windows). The flatten window is
configured in hparams using flatten_start_step and flatten_duration
(both in outer steps/windows).

- Add flatten_start_step and flatten_duration to hparams config
- Add should_skip_scheduler_step() method to check flatten window
- Skip scheduler.step() during flatten window (LR stays constant)
- Track inner_scheduler_step_count to maintain position
- Apply flatten logic in all scheduler step locations:
  - Main training loop
  - Window catch-up loop
  - Validator gather loop (simulates miner inner loop)
  - Initial checkpoint catch-up replay
  - Per-window catch-up replay

The flatten window is also respected during scheduler replay, so catch-up
scenarios are handled correctly.
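
A sketch of the check and its use at each scheduler-step site; the inner_steps_per_window attribute and the exact outer-to-inner conversion are assumptions based on the commit message:

```python
class TrainerSketch:
    """Hypothetical container for the attributes named in the commit message."""

    def should_skip_scheduler_step(self) -> bool:
        """True while the current outer step falls inside the flatten window."""
        start = self.hparams.flatten_start_step
        duration = self.hparams.flatten_duration
        if start is None or not duration:  # feature disabled (None or zero duration)
            return False
        outer_step = self.inner_scheduler_step_count // self.inner_steps_per_window
        return start <= outer_step < start + duration

    def step_inner_scheduler(self) -> None:
        """Wrap every scheduler.step() site: training loop, catch-up, and replay."""
        if not self.should_skip_scheduler_step():
            self.inner_scheduler.step()       # LR follows its normal schedule
        self.inner_scheduler_step_count += 1  # position is tracked either way
```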
(tests) Add LR flatten support to mock fixtures

Update test fixtures to include the new inner_scheduler_step_count
attribute and should_skip_scheduler_step method required by the LR
flattening feature.

- Add inner_scheduler_step_count = 0 to mock_instance fixture
- Add should_skip_scheduler_step mock returning False by default
- Add inner_scheduler_step_count to test validator creation
- Ensure checkpoint save/load tests work with new state fields
(tests) Add comprehensive LR flattening tests

Add 12 test cases covering all aspects of the LR flattening feature
introduced in previous commits.

Basic functionality tests (7):
- Test disabled states (None, zero duration)
- Test window boundaries (before, during, after)
- Test optimizer compatibility (AdamW, Muon)
- Test outer-to-inner step conversion accuracy

Scheduler behavior tests (5):
- Verify scheduler.step() not called during flatten
- Verify scheduler.step() called before/after flatten
- Test partial window overlaps at flatten boundary
- Test step count persistence across windows

All tests validate the feature works correctly with the existing trainer
infrastructure.
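
For example, a boundary check in the style of these tests might look like this (the mock_trainer fixture and the inner_steps_per_window attribute are assumptions):

```python
def test_scheduler_skipped_inside_flatten_window(mock_trainer):
    mock_trainer.hparams.flatten_start_step = 2650
    mock_trainer.hparams.flatten_duration = 1000
    mock_trainer.inner_scheduler_step_count = 2700 * mock_trainer.inner_steps_per_window
    assert mock_trainer.should_skip_scheduler_step() is True


def test_scheduler_resumes_after_flatten_window(mock_trainer):
    mock_trainer.hparams.flatten_start_step = 2650
    mock_trainer.hparams.flatten_duration = 1000
    mock_trainer.inner_scheduler_step_count = 3700 * mock_trainer.inner_steps_per_window
    assert mock_trainer.should_skip_scheduler_step() is False
```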
(tests) Fix tests after LR scaling removal

Update two test cases in test_prepare_gradient_dict.py to reflect the
removal of duplicate LR scaling in commit 5a74fa3.
(comms) Add timeouts to large file downloads

Prevent indefinite hangs during large file downloads by adding
asyncio.wait_for timeouts at three critical levels:

- Overall download_large_file call uses configurable timeout parameter
  from function argument
- S3 get_object requests have 15 second timeout
- Individual stream reads have 15 second timeout

These timeouts ensure the download process will fail gracefully rather
than hanging indefinitely when network issues occur.
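
A minimal sketch of the layered timeouts, assuming an aiobotocore-style async S3 client; the real comms.py code differs in structure, and everything other than asyncio.wait_for is illustrative:

```python
import asyncio


async def download_large_file(s3_client, bucket: str, key: str, timeout: float) -> bytes:
    async def _download() -> bytes:
        # 15 s cap on the get_object request itself
        response = await asyncio.wait_for(
            s3_client.get_object(Bucket=bucket, Key=key), timeout=15
        )
        chunks = []
        while True:
            # 15 s cap on each individual stream read
            chunk = await asyncio.wait_for(
                response["Body"].read(1 << 20), timeout=15
            )
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)

    # Overall call bounded by the caller-supplied timeout argument
    return await asyncio.wait_for(_download(), timeout=timeout)
```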
(validator) Add BMA threshold with warmup period

Add a binary moving average (BMA) threshold to filter low-performing peers from
final score calculations. Peers with BMA below the threshold receive a
final score of 0.

- Extract binary_moving_average into bma variable
- Apply threshold only after warmup period completes
- Track warmup using windows_since_start calculation
- Add configurable hparams: bma_threshold (0.10) and bma_warmup_windows
  (10)

The warmup period allows new validator runs to stabilize before applying
the threshold, preventing premature peer filtering.
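
Condensed into a standalone helper for illustration (the actual change is inline in the validator's scoring code; the function and the defaults shown are assumptions drawn from the commit message):

```python
def apply_bma_threshold(
    final_score: float,
    bma: float,
    windows_since_start: int,
    bma_threshold: float = 0.10,
    bma_warmup_windows: int = 10,
) -> float:
    """Zero a peer's final score once warmup has passed and its BMA is below threshold."""
    past_warmup = windows_since_start >= bma_warmup_windows
    if past_warmup and bma < bma_threshold:
        return 0.0
    return final_score
```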
(hparams) Update bootstrap version to 2.1.12

Update checkpoint initialization to version 2.1.12 and window 59057 to
bootstrap from the latest stable checkpoint. Configure LR flattening to
start at step 2650 with a 1000-step duration for improved training
stability.

- Update checkpoint_init_version from 2.1.9 to 2.1.12
- Update checkpoint_init_window from 58181 to 59057
- Set flatten_start_step to 2650
- Set flatten_duration to 1000
@coderabbitai

coderabbitai bot commented Nov 3, 2025

Warning

Rate limit exceeded

@joellidin has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 40 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


📥 Commits

Reviewing files that changed from the base of the PR and between 7641300 and 06aff26.

📒 Files selected for processing (11)
  • hparams/hparams.json (5 hunks)
  • neurons/trainer.py (5 hunks)
  • neurons/validator.py (8 hunks)
  • src/tplr/__init__.py (1 hunks)
  • src/tplr/comms.py (3 hunks)
  • src/tplr/neurons.py (3 hunks)
  • tests/test_checkpoint_fallback.py (2 hunks)
  • tests/test_prepare_gradient_dict.py (3 hunks)
  • tests/test_state_loading.py (1 hunks)
  • tests/unit/test_lr_flattening.py (1 hunks)
  • tests/unit/test_slashing.py (2 hunks)

@codecov

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 75.00000% with 3 lines in your changes missing coverage. Please review.

Files with missing lines   Patch %   Lines
src/tplr/comms.py          50.00%    2 Missing ⚠️
src/tplr/neurons.py        85.71%    1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (75.00%) is below the target coverage (85.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project status has failed because the head coverage (57.91%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #646      +/-   ##
==========================================
- Coverage   57.92%   57.91%   -0.01%     
==========================================
  Files          27       27              
  Lines        4886     4890       +4     
==========================================
+ Hits         2830     2832       +2     
- Misses       2056     2058       +2     
Files with missing lines   Coverage Δ
src/tplr/__init__.py       100.00% <100.00%> (ø)
src/tplr/neurons.py        77.20% <85.71%> (-0.06%) ⬇️
src/tplr/comms.py          65.08% <50.00%> (-0.06%) ⬇️

@joellidin merged commit eb311a8 into main Nov 3, 2025
6 of 8 checks passed
