Commit c497bbf

Update records/track_10min_16mb/hardik_top5_run/train_gpt.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 083396e commit c497bbf

1 file changed

Lines changed: 1 addition & 1 deletion

records/track_10min_16mb/hardik_top5_run/train_gpt.py

@@ -411,7 +411,7 @@ def step(self, closure: Optional[Callable[[], float]] = None) -> float:
 # [5] MuonEq-R: normalize to unit Frobenius then re-scale
 if do_eq:
     fro = update.norm(dim=(-2, -1), keepdim=True).clamp_min(1e-8)
-    target = m['eq_scale'] ** 0.5 # sqrt(sqrt(M*N)) ≈ balanced scale
+    target = m['eq_scale'] # target Frobenius norm is sqrt(M*N)
     update = update * (target / fro)
 
 if sharded:
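
The effect of this one-line change can be checked with a small framework-free sketch. It assumes, as the diff comments indicate, that m['eq_scale'] holds sqrt(M*N) for an M x N update matrix. The helper names (frobenius_norm, rescale_to_target, rms) and the matrix shape are hypothetical and are not taken from train_gpt.py. With the old target, the rescaled update has a per-entry RMS of (M*N) ** -0.25, which shrinks as the matrix grows. With the new target, the per-entry RMS is 1 regardless of shape.

```python
import math
import random


def frobenius_norm(mat):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in mat for x in row))


def rescale_to_target(mat, target, eps=1e-8):
    """Scale mat so its Frobenius norm equals target (norm clamped at eps)."""
    fro = max(frobenius_norm(mat), eps)
    scale = target / fro
    return [[x * scale for x in row] for row in mat]


def rms(mat):
    """Root-mean-square entry: Frobenius norm divided by sqrt(entry count)."""
    count = sum(len(row) for row in mat)
    return frobenius_norm(mat) / math.sqrt(count)


random.seed(0)
M, N = 64, 256  # hypothetical shape, chosen only for illustration
update = [[random.gauss(0.0, 0.02) for _ in range(N)] for _ in range(M)]

# Assumption: this mirrors m['eq_scale'], i.e. sqrt(M*N).
eq_scale = math.sqrt(M * N)

old_update = rescale_to_target(update, eq_scale ** 0.5)  # before the commit
new_update = rescale_to_target(update, eq_scale)         # after the commit

rms_old = rms(old_update)  # equals (M*N) ** -0.25, shape dependent
rms_new = rms(new_update)  # equals 1.0 for any shape

print(f"old target: per-entry RMS = {rms_old:.6f}")
print(f"new target: per-entry RMS = {rms_new:.6f}")
```

The sketch uses plain lists rather than tensors so that it runs without PyTorch. The arithmetic matches the update.norm(dim=(-2, -1)) line in the diff for a single 2-D matrix.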