[nightshift] 20260421 multi-cleanup by claude-nightshift[bot] · Pull Request #4998 · marin-community/marin

claude-nightshift · 2026-04-21T11:22:38Z

seed 900815dc

Unreached branches fall —
Dead timers hush in silence,
Repo breathes lighter.

Nightshift parallel cleanup run. Three scouts (levanter, marin, zephyr) produced independent dead-code removals; the iris scout found only already-committed changes and is not included here.

Combined summary

`lib/levanter/src/levanter`

Removed dead timing instrumentation that accumulated values nobody read. In callbacks.eval_loss_loop, total_load_time and total_loss_time were computed every iteration but never returned or logged, forcing a manual while/iter_/next(None sentinel) loop purely to measure batches vs. loss time — switched to a straightforward for loop and dropped the dead variables. In LoadingTimeTrackerIterator, the this_load_time attribute is only ever written and the init-time perf_counter wrap around iter() measured nothing meaningful; inlined the remaining total_time update and deleted both. Unit tests in tests/test_metrics.py (including test_eval_loss_loop), tests/test_eval.py, and tests/test_logging.py all pass.

`lib/marin/src/marin`

Removed lib/marin/src/marin/transform/fasttext/transform.py, a 98-line orphan module. Its top-level functions (convert_fasttext_to_dolma_format, TransformFasttextToDolmaConfig, generate_id) are not imported anywhere in the repo (marin source, experiments, or tests). The docstring advertised a python transform.py ... CLI but no __main__/argparse block existed. The conversion logic also contained a redundant self-overwriting assignment to text introduced in Oct 2024 and never corrected. Deleting the file (and its now-empty directory) removes ~100 lines of confirmed dead code.

`lib/zephyr/src/zephyr`

Removed the dead requires_full_shard boolean field from the Map physical-op dataclass in plan.py. The field was computed identically to needs_shard_context (both set to any(isinstance(op, MapShardOp) for op in pending_fusible)) but was never read anywhere in the codebase or tests — only needs_shard_context is consumed in run_stage at plan.py:792 to decide whether to pass ShardInfo to the fused function. Git blame showed the field was a vestige from commit ac67ba7 where needs_shard_context was introduced and requires_full_shard was not cleaned up. Removed 5 occurrences (docstring attribute, field declaration, local computation, and the constructor argument in FusionState.flush_pending) while preserving semantics.

Verification

./infra/pre-commit.py --all-files --fix — clean
lib/levanter/tests/test_metrics.py, test_eval.py, test_logging.py — 39 passed
lib/zephyr/tests/test_optimization.py, test_dataset.py, test_groupby.py — 107 passed

…s_loop and LoadingTimeTrackerIterator The callbacks.eval_loss_loop accumulated total_load_time and total_loss_time but never returned or logged them, so the manual while/next batching was only serving dead variables. Switch to a plain for loop. LoadingTimeTrackerIterator.this_load_time is only ever written, never read; the init-time timing around iter() was also measuring nothing useful. Drop both.

transform/fasttext/transform.py was an orphan: no callers in marin, experiments, or tests, and its "Usage: python transform.py ..." docstring was a lie (no argparse/__main__). The function also contained a redundant assignment to ``text`` (one line overwritten by the next) dating to Oct 2024. Delete the whole file and its (now-empty) directory. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ield from Map dataclass

rjpower · 2026-04-21T16:27:31Z

@claude looks like the cleanup took something it should have, or forgot to update a test.

18.70s call tests/processing/classification/deduplication/test_fuzzy.py::test_fuzzy_dups_single_source_schema_and_pair
=========================== short test summary info ============================
FAILED tests/test_grug_variant_contracts.py::test_grug_base_run_emits_expected_metrics_with_json_tracker - AttributeError: 'LoadingTimeTrackerIterator' object has no attribute 'this_load_time'
====== 1 failed, 512 passed, 112 skipped, 7 warnings in 345.96s (0:05:45) ======

pls fix

claude · 2026-04-21T16:28:11Z

Claude finished @rjpower's task in 3m 45s —— View job

Task list

Inspect failing test and find where this_load_time is read
Restore this_load_time on LoadingTimeTrackerIterator
Run failing test plus related metric/logging tests
Run ./infra/pre-commit.py --all-files --fix
Commit and push

What happened

The scout grepped lib/ and src/ for readers of this_load_time, found none, and dropped the attribute — but three experiment trainers still read it on every step:

experiments/grug/base/train.py:470
experiments/grug/moe/train.py:512
experiments/grug/modular_opt/train.py:470

Each one does:

levanter.tracker.log({"throughput/loading_time": iterator.this_load_time}, step=step)

And tests/test_grug_variant_contracts.py::test_grug_base_run_emits_expected_metrics_with_json_tracker asserts the throughput/loading_time key is present in the summary (line 253). So removing the attribute turned a silent "only-written field" claim into an AttributeError inside the grug training loop.

Fix

Restore both this_load_time and the per-step assignment in lib/levanter/src/levanter/utils/logging.py. Kept the cleanup of the init-time perf_counter wrapping iter() (that one really wasn't used anywhere). total_time stays consistent with this_load_time.

class LoadingTimeTrackerIterator(Iterator[T]):
    def __init__(self, items: Iterable[T]):
        self.total_time = 0.0
        self.this_load_time = 0.0
        self.items = iter(items)

    def __next__(self) -> T:
        start = time.perf_counter()
        item = next(self.items)
        self.this_load_time = time.perf_counter() - start
        self.total_time += self.this_load_time
        return item

Test & lint status

tests/test_grug_variant_contracts.py → 7 passed (incl. the previously failing test_grug_base_run_emits_expected_metrics_with_json_tracker)
lib/levanter/tests/{test_metrics,test_eval,test_logging}.py → 39 passed (same set the PR originally verified)
./infra/pre-commit.py --all-files --fix → clean (ruff, black, pyrefly, license headers, AST, etc. all OK)

Fix pushed as 5f2b50d55 on branch nightshift/cleanup-20260421.
• branch: nightshift/cleanup-20260421

The nightshift cleanup removed this_load_time after a src-only grep, but three experiment trainers (experiments/grug/{base,moe,modular_opt}/train.py) log it on every step as `throughput/loading_time`. The grug-base contract test was asserting this key is emitted, so dropping the attribute turned into an AttributeError at train time rather than a silent regression. Restore both the attribute and the per-step assignment; keep the total_time expressed in terms of this_load_time so the two stay consistent. Co-authored-by: Russell Power <rjpower@users.noreply.github.com>

Nightshift Scout and others added 3 commits April 21, 2026 11:14

[nightshift] lib/zephyr/src/zephyr: remove dead requires_full_shard f…

b71e651

…ield from Map dataclass

claude-nightshift Bot added agent-generated Created by automation/agent nightshift Automated nightshift fixes labels Apr 21, 2026

claude-nightshift Bot enabled auto-merge (squash) April 21, 2026 11:22

claude-nightshift Bot requested a review from dlwh April 21, 2026 11:22

rjpower approved these changes Apr 21, 2026

View reviewed changes

claude-nightshift Bot merged commit e59a495 into main Apr 21, 2026
37 checks passed

claude-nightshift Bot deleted the nightshift/cleanup-20260421 branch April 21, 2026 21:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[nightshift] 20260421 multi-cleanup#4998

[nightshift] 20260421 multi-cleanup#4998
claude-nightshift[bot] merged 4 commits intomainfrom
nightshift/cleanup-20260421

claude-nightshift Bot commented Apr 21, 2026

Uh oh!

rjpower commented Apr 21, 2026

Uh oh!

claude Bot commented Apr 21, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

claude-nightshift Bot commented Apr 21, 2026

Combined summary

lib/levanter/src/levanter

lib/marin/src/marin

lib/zephyr/src/zephyr

Verification

Uh oh!

rjpower commented Apr 21, 2026

Uh oh!

claude Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Task list

What happened

Fix

Test & lint status

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`lib/levanter/src/levanter`

`lib/marin/src/marin`

`lib/zephyr/src/zephyr`

claude Bot commented Apr 21, 2026 •

edited

Loading