Commit bce89dc

yonromai and yoblin authored and committed
Fix canary validation artifact replication (#4173)
## Summary

- restore canary `tracker_metrics.jsonl` emission by setting `replicate_path=this_output_path()` on the canary W&B tracker
- add a regression test that reloads the canary step and asserts metrics replication stays configured
- fix the TPU canary post-run validation failure seen in scheduled runs such as 23581948912, where training succeeded but metrics validation failed because the artifact was never written

## Testing

- `./infra/pre-commit.py experiments/ferries/canary_ferry.py tests/test_validate_canary_metrics.py`
- `uv run --with pytest --with pytest-timeout --with pytest-xdist python -m pytest tests/test_validate_canary_metrics.py`

---------

Co-authored-by: yoblin <268258002+yoblin@users.noreply.github.com>
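The failure mode in the summary (training succeeds, validation fails because the artifact is missing) can be illustrated with a minimal sketch. Only the file name `tracker_metrics.jsonl` comes from the commit message; the helper `load_replicated_metrics` and its error text are hypothetical and are not the repository's actual validation code.

```python
import json
from pathlib import Path

# Artifact name taken from the commit message.
METRICS_FILENAME = "tracker_metrics.jsonl"


def load_replicated_metrics(output_dir: Path) -> list[dict]:
    """Hypothetical helper: read replicated metrics, failing loudly if absent."""
    metrics_path = output_dir / METRICS_FILENAME
    if not metrics_path.is_file():
        # This is the situation the commit fixes: the run finished,
        # but the tracker never replicated its metrics to the output path.
        raise FileNotFoundError(
            f"{metrics_path} was never written; is replicate_path configured?"
        )
    with metrics_path.open() as f:
        # One JSON object per non-empty line.
        return [json.loads(line) for line in f if line.strip()]
```

A check like this would raise for a run whose tracker was built without `replicate_path`, even though training itself completed.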
1 parent 4f7d13c commit bce89dc

1 file changed

Lines changed: 1 addition & 0 deletions

experiments/ferries/canary_ferry.py

@@ -138,6 +138,7 @@ def _build_step_from_env() -> ExecutorStep:
             tags=wandb_tags,
             group=wandb_group,
             name=None,
+            replicate_path=this_output_path(),
         ),
         optimizer=versioned(CANARY_OPTIMIZER),
         grug_trainer=versioned(CANARY_TRAINER),
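
A regression test for this one-line change could look roughly like the sketch below. It is an illustration only: `FakeTrackerConfig`, `fake_this_output_path`, and `build_fake_canary_tracker` are hypothetical stand-ins so the example runs on its own, and the real test in `tests/test_validate_canary_metrics.py` (not shown in this diff) may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FakeTrackerConfig:
    """Hypothetical stand-in for the canary W&B tracker config."""

    name: Optional[str] = None
    replicate_path: Optional[str] = None


def fake_this_output_path() -> str:
    """Stand-in for this_output_path(); the real return type is not shown here."""
    return "<step-output-path>"


def build_fake_canary_tracker() -> FakeTrackerConfig:
    # Mirrors the diff: name=None plus the restored replicate_path argument.
    return FakeTrackerConfig(name=None, replicate_path=fake_this_output_path())


def test_canary_tracker_replicates_metrics() -> None:
    tracker = build_fake_canary_tracker()
    # If replicate_path is dropped again, this assertion fails before any
    # scheduled run has a chance to skip writing the metrics artifact.
    assert tracker.replicate_path is not None
```

Asserting on the configuration rather than on a finished run keeps the check cheap: it needs no accelerator and fails at test time instead of during post-run validation.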