[nightshift] 20260421 multi-cleanup #4998
…s_loop and LoadingTimeTrackerIterator The callbacks.eval_loss_loop accumulated total_load_time and total_loss_time but never returned or logged them, so the manual while/next batching was only serving dead variables. Switch to a plain for loop. LoadingTimeTrackerIterator.this_load_time is only ever written, never read; the init-time timing around iter() was also measuring nothing useful. Drop both.
transform/fasttext/transform.py was an orphan: no callers in marin, experiments, or tests, and its "Usage: python transform.py ..." docstring was a lie (no argparse/__main__). The function also contained a redundant assignment to ``text`` (one line overwritten by the next) dating to Oct 2024. Delete the whole file and its (now-empty) directory.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ield from Map dataclass
@claude looks like the cleanup took something it shouldn't have, or forgot to update a test. 18.70s call tests/processing/classification/deduplication/test_fuzzy.py::test_fuzzy_dups_single_source_schema_and_pair pls fix
Claude finished @rjpower's task in 3m 45s
What happened
The scout grepped
Each one does:
    levanter.tracker.log({"throughput/loading_time": iterator.this_load_time}, step=step)
And
Fix
Restore both
    class LoadingTimeTrackerIterator(Iterator[T]):
        def __init__(self, items: Iterable[T]):
            self.total_time = 0.0
            self.this_load_time = 0.0
            self.items = iter(items)

        def __next__(self) -> T:
            start = time.perf_counter()
            item = next(self.items)
            self.this_load_time = time.perf_counter() - start
            self.total_time += self.this_load_time
            return item
Test & lint status
Fix pushed as
The nightshift cleanup removed this_load_time after a src-only grep, but
three experiment trainers (experiments/grug/{base,moe,modular_opt}/train.py)
log it on every step as `throughput/loading_time`. The grug-base contract
test was asserting this key is emitted, so dropping the attribute turned
into an AttributeError at train time rather than a silent regression.
Restore both the attribute and the per-step assignment; keep the total_time
expressed in terms of this_load_time so the two stay consistent.
Co-authored-by: Russell Power <rjpower@users.noreply.github.com>
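The miss described in this commit message (a grep limited to src, with readers living under experiments) can be illustrated with a small scan that takes several roots. This is only a sketch under stated assumptions: `find_attribute_reads` is a hypothetical helper, not part of the nightshift tooling, and its line-based regex is a heuristic that treats any `.attr` use other than a plain `obj.attr = value` assignment as a read.

```python
import re
from pathlib import Path


def find_attribute_reads(attr: str, roots: list[Path]) -> list[tuple[Path, int, str]]:
    """Return (file, line number, stripped line) for each likely read of `.attr`.

    Hypothetical helper for illustration only. Heuristic: a line counts as a
    read when it mentions `.attr` and is not a plain `obj.attr = value` write.
    A line that both writes and reads the attribute would be missed.
    """
    # `.attr` as a whole identifier (so `.attr_total` does not match).
    use = re.compile(rf"\.{re.escape(attr)}\b")
    # `.attr = ...` but not `.attr == ...`; augmented assignments count as reads.
    write = re.compile(rf"\.{re.escape(attr)}\s*=(?!=)")

    hits: list[tuple[Path, int, str]] = []
    for root in roots:
        for path in sorted(root.rglob("*.py")):
            lines = path.read_text(encoding="utf-8").splitlines()
            for lineno, line in enumerate(lines, start=1):
                if use.search(line) and not write.search(line):
                    hits.append((path, lineno, line.strip()))
    return hits
```

Called with only a src root, a scan like this would report the `total_time += self.this_load_time` line at most; adding the experiments and tests roots to `roots` is what would surface the per-step `throughput/loading_time` logging calls.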
Nightshift parallel cleanup run. Three scouts (levanter, marin, zephyr) produced independent dead-code removals; the iris scout found only already-committed changes and is not included here.
Combined summary
lib/levanter/src/levanter
Removed dead timing instrumentation that accumulated values nobody read. In `callbacks.eval_loss_loop`, `total_load_time` and `total_loss_time` were computed every iteration but never returned or logged, forcing a manual `while/iter_/next(None sentinel)` loop purely to measure batches vs. loss time; switched to a straightforward `for` loop and dropped the dead variables. In `LoadingTimeTrackerIterator`, the `this_load_time` attribute is only ever written and the init-time `perf_counter` wrap around `iter()` measured nothing meaningful; inlined the remaining `total_time` update and deleted both. Unit tests in `tests/test_metrics.py` (including `test_eval_loss_loop`), `tests/test_eval.py`, and `tests/test_logging.py` all pass.
lib/marin/src/marin
Removed `lib/marin/src/marin/transform/fasttext/transform.py`, a 98-line orphan module. Its top-level functions (`convert_fasttext_to_dolma_format`, `TransformFasttextToDolmaConfig`, `generate_id`) are not imported anywhere in the repo (marin source, experiments, or tests). The docstring advertised a `python transform.py ...` CLI but no `__main__`/argparse block existed. The conversion logic also contained a redundant self-overwriting assignment to `text` introduced in Oct 2024 and never corrected. Deleting the file (and its now-empty directory) removes ~100 lines of confirmed dead code.
lib/zephyr/src/zephyr
Removed the dead `requires_full_shard` boolean field from the `Map` physical-op dataclass in `plan.py`. The field was computed identically to `needs_shard_context` (both set to `any(isinstance(op, MapShardOp) for op in pending_fusible)`) but was never read anywhere in the codebase or tests; only `needs_shard_context` is consumed in `run_stage` at `plan.py:792` to decide whether to pass `ShardInfo` to the fused function. Git blame showed the field was a vestige from commit ac67ba7 where `needs_shard_context` was introduced and `requires_full_shard` was not cleaned up. Removed 5 occurrences (docstring attribute, field declaration, local computation, and the constructor argument in `FusionState.flush_pending`) while preserving semantics.
Verification
- `./infra/pre-commit.py --all-files --fix`: clean
- `lib/levanter/tests/test_metrics.py`, `test_eval.py`, `test_logging.py`: 39 passed
- `lib/zephyr/tests/test_optimization.py`, `test_dataset.py`, `test_groupby.py`: 107 passed
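For the `eval_loss_loop` change summarized under lib/levanter/src/levanter, the shape of the refactor can be sketched roughly as below. This is not the levanter implementation: `eval_loss_manual`, `eval_loss_for`, and the float "batches" are hypothetical stand-ins, chosen only to show why the manual `next(..., None)` loop existed and why it collapses to a `for` loop once the timing totals are dropped.

```python
import time
from typing import Callable, Iterable


def eval_loss_manual(loss_fn: Callable[[float], float], batches: Iterable[float]) -> float:
    """Old shape (hypothetical): manual iteration so loading can be timed separately."""
    total_loss, n = 0.0, 0
    total_load_time = 0.0  # accumulated below but never returned or logged
    total_loss_time = 0.0  # same: dead once nothing reads it
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        batch = next(it, None)  # None sentinel marks exhaustion
        total_load_time += time.perf_counter() - t0
        if batch is None:
            break
        t1 = time.perf_counter()
        total_loss += loss_fn(batch)
        total_loss_time += time.perf_counter() - t1
        n += 1
    return total_loss / n if n else 0.0


def eval_loss_for(loss_fn: Callable[[float], float], batches: Iterable[float]) -> float:
    """New shape (hypothetical): with the timers gone, a plain for loop suffices."""
    total_loss, n = 0.0, 0
    for batch in batches:
        total_loss += loss_fn(batch)
        n += 1
    return total_loss / n if n else 0.0
```

Both functions return the same mean loss for the same input; the only behavior removed is the bookkeeping that no caller could observe.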