Commit 219cb84

Update training results
1 parent 938c00b commit 219cb84

6 files changed: +950 -269 lines changed

experiments/plantcad/README.md

Lines changed: 26 additions & 4 deletions
@@ -119,14 +119,18 @@ python -m experiments.plantcad.scripts.exp_pc1_train \
 rm -rf local_store/evaluation/dna-conservation*; python -m experiments.plantcad.scripts.exp_pc1_eval --prefix local_store --force_run_failed true
 
 # Checkpoint upload
-find local_store | grep -E 'hf/step-[0-9]+$' | xargs -I {} echo "hf upload plantcad/_dev_marin_plantcad1_v2_train {} {} --repo-type model" | bash /dev/stdin
-
+# HF checkpoints
+find local_store | grep -E 'hf/step-[0-9]+$' | xargs -I {} echo "hf upload plantcad/_dev_marin_plantcad1_v3_train {} {} --repo-type model" | bash /dev/stdin
+# Levanter checkpoints
 find local_store | grep -E 'checkpoints/step-[0-9]+$' | \
 grep -E 'step-26780$|step-24102|step-21424|step-18746$' | \
-xargs -I {} echo "hf upload plantcad/_dev_marin_plantcad1_v2_train {} {} --repo-type model" | \
+xargs -I {} echo "hf upload plantcad/_dev_marin_plantcad1_v3_train {} {} --repo-type model" | \
 bash /dev/stdin
 ```
 
+### Eval results
+
+Iteration 1:
 ```bash
 > python -m experiments.plantcad.misc.agg_eval_results
 roc_auc step checkpoint_path
@@ -145,7 +149,7 @@ roc_auc step
 0.593178 21749 hf://plantcad/_dev_marin_plantcad1_v1_train/local_store/checkpoints/plantcad-train-300m-r02-432442/hf/step-21749
 ```
 
-Second iteration:
+Iteration 2:
 
 ```
 python experiments/plantcad/misc/agg_eval_results.py
@@ -163,6 +167,24 @@ Second iteration:
 0.657452 26782 hf://plantcad/_dev_marin_plantcad1_v2_train/local_store/checkpoints/plantcad-train-600m-r12-7ea0fc/hf/step-26782
 ```
 
+Iteration 3:
+
+```
+> python -m experiments.plantcad.misc.agg_eval_results
+roc_auc step checkpoint_path
+0.665902 2678 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-2678
+0.672563 5356 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-5356
+0.673937 8034 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-8034
+0.675633 10712 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-10712
+0.678089 13390 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-13390
+0.684904 16068 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-16068
+0.680056 18746 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-18746
+0.677681 21424 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-21424
+0.679077 24102 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-24102
+0.681293 26780 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-26780
+0.680195 26782 hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/checkpoints/plantcad-train-600m-r16-a1bc43/hf/step-26782
+```
+
 ## EDA
 
 Stats on kuleshov-group/Angiosperm_16_genomes:
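
The eval tables added to the README list one `roc_auc` per checkpoint step. As a minimal sketch of how the best checkpoint could be picked from such output, the snippet below parses a few rows copied from the Iteration 3 table. It assumes the aggregated output is three whitespace-separated columns exactly as shown; `EvalRow`, `parse_eval_table`, and `best_checkpoint` are hypothetical helper names, not part of `agg_eval_results`.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One row of the aggregated eval table (hypothetical helper type)."""
    roc_auc: float
    step: int
    checkpoint_path: str


def parse_eval_table(text: str) -> list[EvalRow]:
    """Parse 'roc_auc step checkpoint_path' rows, skipping anything else."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # e.g. the '> python -m ...' command line
        try:
            rows.append(EvalRow(float(parts[0]), int(parts[1]), parts[2]))
        except ValueError:
            continue  # e.g. the column header row
    return rows


def best_checkpoint(rows: list[EvalRow]) -> EvalRow:
    """Return the row with the highest ROC AUC."""
    return max(rows, key=lambda r: r.roc_auc)


# A subset of the Iteration 3 rows, copied from the table above.
PREFIX = ("hf://plantcad/_dev_marin_plantcad1_v3_train/local_store/"
          "checkpoints/plantcad-train-600m-r16-a1bc43/hf/")
sample = f"""
roc_auc step checkpoint_path
0.678089 13390 {PREFIX}step-13390
0.684904 16068 {PREFIX}step-16068
0.680056 18746 {PREFIX}step-18746
0.680195 26782 {PREFIX}step-26782
"""

rows = parse_eval_table(sample)
best = best_checkpoint(rows)
print(best.step, best.roc_auc)
```

On these rows the highest score belongs to step 16068 rather than the final step, which is why selecting by metric instead of by last checkpoint may be worth doing.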

experiments/plantcad/evaluation.py

Lines changed: 2 additions & 2 deletions
@@ -52,7 +52,7 @@
 class DnaEvalBaseConfig:
     """Base configuration for DNA evaluation with fields needed for training callbacks"""
 
-    dataset_path: str = "plantcad/evolutionary-constraint-example"
+    dataset_path: str = "plantcad/evolutionary-constraint"
     """Dataset repository path"""
 
     dataset_config: str | None = "10k"
@@ -730,7 +730,7 @@ def run_conservation_eval(config: DnaEvalConfig) -> dict[str, float]:
 # checkpoint_path="/path/to/hf/checkpoint",
 # device="cuda", # or "cpu" for CPU inference
 # num_workers=None, # defaults to number of GPUs
-# dataset_path="plantcad/evolutionary-constraint-example",
+# dataset_path="plantcad/evolutionary-constraint",
 # dataset_config="10k",
 # max_samples=1000
 # )
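
The change above only moves the default `dataset_path`. As a rough illustration of what that means for callers, the sketch below uses a hypothetical stand-in class, `EvalConfigSketch`, carrying just the two fields visible in the diff. The real `DnaEvalBaseConfig` has more fields, and whether it is a frozen dataclass is an assumption made here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EvalConfigSketch:
    """Hypothetical stand-in for DnaEvalBaseConfig (two fields only)."""

    # New default introduced by this commit.
    dataset_path: str = "plantcad/evolutionary-constraint"
    # Unchanged default from the diff context.
    dataset_config: Optional[str] = "10k"


# Callers that rely on the default now get the new dataset.
default_cfg = EvalConfigSketch()

# Callers that still want the previous dataset must request it explicitly.
example_cfg = replace(
    default_cfg, dataset_path="plantcad/evolutionary-constraint-example"
)

print(default_cfg.dataset_path)
print(example_cfg.dataset_path)
```

Any evaluation that passed `dataset_path` explicitly would be unaffected by the commit; only runs relying on the default switch datasets.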

experiments/plantcad/misc/plantcad_scaling.py

Lines changed: 0 additions & 255 deletions
This file was deleted.
