[logbook] Delphi midtrain: clarify project goal hierarchy

ahmeda14960 · ahmeda14960 · commit d04dea5d62ab · 2026-04-23T17:06:06.000-07:00
Two overarching project goals: (1) predict midtraining loss
trajectory from smaller runs, (2) pick a good midtraining dataset.
The LR sweep tracked here is a subgoal feeding both tracks — LR has
to be solved before dataset-quality can be cleanly isolated.
diff --git a/.agents/logbooks/midtraining_delphi.md b/.agents/logbooks/midtraining_delphi.md
@@ -32,8 +32,17 @@
 
 ## Goal
 
+### Overarching project goals (two tracks)
+
+1. **Predict loss trajectory for midtraining.** Build a calibrated expectation for how math-eval loss evolves when you continue-train an existing pretrain checkpoint on a math-heavy dataset, so that larger/more-expensive midtrain runs can be scoped (token budget, LR schedule, decay shape) from smaller runs instead of guessed. This Delphi × Nemotron-CC-Math sweep is a primary data point for that predictor.
+2. **Pick a good midtraining dataset.** Compare candidate math-heavy datasets (the Nemotron-CC-Math quality tiers being our current anchor) on their effect on post-midtrain downstream evals, so that future runs don't waste compute on a poorly-curated corpus.
+
+### This logbook's subgoal (the concrete sweep tracked here)
+
 Run a small LR sweep that continue-trains the two smallest existing AdamH-trained Marin checkpoints on **10 B tokens of `nemotron_cc_math_v1/4plus`**, to de-risk a Mantis-style math-midtraining recipe before spending v4-512 / v4-1024 time on Delphi 1e22 / 1e23. We're looking for the highest peak LR that still drops the math-eval loss monotonically.
 
+This subgoal feeds both tracks: the LR-factor × final-loss pairs inform the loss-trajectory predictor (track 1), and the dataset is held fixed at `nemotron_cc_math_v1/4plus` so the signal is attributable to LR choice — a prerequisite for the dataset-selection track (2), which requires LR to be a solved variable before dataset-quality can be cleanly isolated.
+
 ## User-specified constraints
 
 - Start the midtraining peak LR at **2/3 of each base's own pretrain peak**, sweep around there.
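
Editor's sketch (not part of the commit): the selection rule in the Goal section, "the highest peak LR that still drops the math-eval loss monotonically", together with the 2/3-of-pretrain-peak starting point, could be expressed roughly as below. The function names, the default factor grid, and the tolerance parameter are all hypothetical; the logbook only fixes the 2/3 starting factor and does not say how wide the sweep is or how eval noise is handled.

```python
from typing import Dict, List, Optional, Sequence


def lr_grid(pretrain_peak_lr: float,
            factors: Sequence[float] = (1 / 2, 2 / 3, 5 / 6)) -> List[float]:
    """Candidate midtraining peak LRs, as factors of the base's pretrain peak.

    Only the 2/3 centre comes from the logbook; the neighbouring factors
    are placeholders for "sweep around there".
    """
    return sorted(f * pretrain_peak_lr for f in factors)


def is_monotone_decreasing(losses: Sequence[float], tol: float = 0.0) -> bool:
    """True if no eval step is worse than the previous one by more than tol."""
    return all(b <= a + tol for a, b in zip(losses, losses[1:]))


def highest_monotone_lr(curves: Dict[float, Sequence[float]],
                        tol: float = 0.0) -> Optional[float]:
    """Highest peak LR whose math-eval loss curve never rises.

    `curves` maps peak LR -> eval losses in step order. Returns None when
    every curve has at least one upward step.
    """
    ok = [lr for lr, losses in curves.items()
          if is_monotone_decreasing(losses, tol)]
    return max(ok) if ok else None
```

To use it, build the grid once per base checkpoint from that checkpoint's own pretrain peak, collect one eval-loss curve per run, and pass the resulting mapping to `highest_monotone_lr`. A small positive `tol` is one way to keep eval noise from disqualifying a run; whether that is appropriate here is a judgment call the logbook does not settle.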