
Commit d514ccb

update README.md

1 parent 5b21f8d commit d514ccb

File tree

1 file changed: +3, -11 lines

README.md

Lines changed: 3 additions & 11 deletions
````diff
@@ -114,26 +114,18 @@ Achieves 78.6% on CUTE (vs 56.9% for Olmo 3) and 71.6% on EXECUTE benchmarks thr
 Unlike subword models, Bolmo can arbitrarily adjust the bytes-per-patch ratio to trade off speed for performance:
 
 ```python
-# Train with higher compression for faster inference
-torchrun --nproc-per-node=8 src/examples/bolmo/train_stage2.py \
-    --target-compression=8.0 # vs default ~4.4
+TODO
 ```
 
 ### 4. Zero-Cost Post-Training
 Existing post-trained checkpoints can be byteified without additional training using Task Arithmetic:
 
 ```python
-from olmo_core.nn.bolmo import byteify_checkpoint
-
-# Merge post-trained checkpoint into Bolmo
-byteified_model = byteify_checkpoint(
-    bolmo_base="allenai/Bolmo-7B",
-    posttrain_checkpoint="allenai/OLMo-3-7B-Instruct"
-)
+TODO
 ```
 
 ### 5. Efficient Training
-Total training cost: only 39.3B tokens (≈173B bytes) to byteify an existing model - orders of magnitude less than training from scratch.
+Total training cost: 9.8B tokens (≈43B bytes) for Stage 1, 39.3B tokens (≈173B bytes) for Stage 2 to byteify an existing model.
 
 ## Performance
 
````
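A note on the compression example removed above: the old command set `--target-compression=8.0` against a default of ~4.4 bytes per patch. A minimal sketch of why that trades speed for performance, assuming patch count scales inversely with the bytes-per-patch ratio; the `approx_num_patches` helper is hypothetical, not part of the repo:

```python
# Hypothetical helper: if the patcher emits roughly one patch per
# `bytes_per_patch` input bytes, a higher ratio gives a shorter sequence
# for the global transformer (faster inference), at some cost in quality.
def approx_num_patches(num_bytes: int, bytes_per_patch: float) -> int:
    return max(1, round(num_bytes / bytes_per_patch))

print(approx_num_patches(1024, 4.4))  # default ratio           -> 233 patches
print(approx_num_patches(1024, 8.0))  # removed example's value -> 128 patches
```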
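The deleted Task Arithmetic example imported `byteify_checkpoint` from `olmo_core.nn.bolmo`. Until the README restores it, here is a minimal sketch of the underlying idea, assuming plain state-dict weight maps; `task_arithmetic_merge` and its parameter names are hypothetical, not the library's API:

```python
import torch

def task_arithmetic_merge(
    bolmo_base: dict[str, torch.Tensor],
    subword_base: dict[str, torch.Tensor],
    subword_posttrained: dict[str, torch.Tensor],
    alpha: float = 1.0,
) -> dict[str, torch.Tensor]:
    """Hypothetical sketch: add the post-training task vector
    (post-trained minus pre-trained subword weights) onto Bolmo."""
    merged = {}
    for name, weight in bolmo_base.items():
        if name in subword_base and name in subword_posttrained:
            # Task vector: what post-training changed in the subword model.
            delta = subword_posttrained[name] - subword_base[name]
            merged[name] = weight + alpha * delta
        else:
            # Byte-level modules have no subword counterpart;
            # keep the Bolmo weights unchanged.
            merged[name] = weight.clone()
    return merged
```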
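As a sanity check on the updated cost line: both stage budgets imply the same bytes-per-token ratio, consistent with the ~4.4 default compression cited in the removed example:

```python
stage1_ratio = 43e9 / 9.8e9    # Stage 1: ~4.39 bytes per token
stage2_ratio = 173e9 / 39.3e9  # Stage 2: ~4.40 bytes per token
print(f"{stage1_ratio:.2f} {stage2_ratio:.2f}")  # -> 4.39 4.40
```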
