Hi,
Thank you for your work.
Currently, our team is trying to replicate allenai/Olmo-3-7B-Instruct-SFT using the SFT script for our research project. However, we have been unable to reproduce the results of Olmo-3-7B-Instruct-SFT on olmes. Specifically, we trained an SFT model by applying your script to allenai/Olmo-3-1025-7B, but our model's accuracy on ZebraLogic is lower than the number in the technical report.
The dataset we used to replicate the SFT process is Dolci-Instruct-SFT. However, we noticed that the command in the SFT README trains on a mixture that includes other datasets as well.
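In case it helps pinpoint the discrepancy, below is roughly how we assembled our training data; a minimal sketch using Hugging Face datasets, where the dataset ids are our own guesses (we only used the Dolci-Instruct-SFT component, not the full README mixture):

```python
from datasets import load_dataset, concatenate_datasets

# NOTE: the dataset ids here are assumptions on our side; we trained on
# Dolci-Instruct-SFT alone, while the README command mixes in more sources.
parts = [
    load_dataset("allenai/dolci-instruct-sft", split="train"),
    # load_dataset("<other-mixture-component>", split="train"),  # not used by us
]
sft_data = concatenate_datasets(parts)
print(len(sft_data), sft_data.column_names)
```

If the extra mixture components matter for the ZebraLogic number, that alone could explain the gap.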
Right now, we are unable to use your Beaker software, so we modified the code slightly to bypass the Beaker requirements (a sketch of the kind of change we made follows the command below). Here is the command we used to train our model:
```bash
PYTHONPATH=src torchrun \
  --standalone \
  --nnodes=1 \
  --nproc_per_node=4 \
  src/scripts/train/sft/Olmo-3-7B-SFT.py train \
    my-olmo3-7b-instruct-only-sft-full \
    "$BASE_CORE/model_and_optim" \
    ai2/jupiter \
    --dataset_path "$SFT_DATA" \
    --seq_len=32768 \
    --global_batch_size=131072 \
    --num_nodes=1 \
    --gpus_per_node=4
```
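One detail we want to confirm about this command: if --global_batch_size is counted in tokens, then 131072 / 32768 = 4 packed sequences per optimizer step, i.e. one per GPU on our 4-GPU node; please correct us if the unit is sequences instead.

For reference, our Beaker bypass amounted to stubbing out the Beaker-dependent lookups so a plain torchrun launch works. The snippet below is only a generic sketch of that pattern; the module and function names are placeholders, not olmo-core's actual API:

```python
# Generic sketch of our Beaker bypass; names are placeholders, not the
# real olmo-core internals we patched.
try:
    from beaker import Beaker  # only usable with an Ai2 Beaker account
    HAVE_BEAKER = True
except ImportError:
    HAVE_BEAKER = False

def resolve_cluster(cluster_name: str):
    """Return Beaker cluster metadata, or None for local launches."""
    if not HAVE_BEAKER:
        # Local torchrun run: skip the Beaker lookup and rely on the CLI
        # flags (--num_nodes, --gpus_per_node) instead.
        return None
    return Beaker.from_env().cluster.get(cluster_name)
```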
We have the following questions:
- Is there a way to run SFT without using Beaker, since we do not have a Beaker account? If not, how can we create one?
- Which commands and settings did you use to fine-tune the base model to produce Olmo-3-7B-Instruct-SFT?
- How did you evaluate the models? Did you use olmes or the submit_eval_jobs.sh script in open-instruct?
- What was the SFT loss of the Olmo-3-7B-Instruct-SFT model? (See the sketch below for the loss we tracked on our end.)
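To clarify the last question, this is the quantity we tracked on our side: standard next-token cross-entropy averaged over completion tokens only, as in the minimal sketch below (our own illustration, not olmo-core's code):

```python
import torch
import torch.nn.functional as F

def sft_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mean next-token cross-entropy over completion tokens only.

    logits: (batch, seq, vocab); labels: (batch, seq), with prompt and
    padding positions set to -100 so cross_entropy ignores them.
    """
    # Shift so that the logits at position t predict the token at t+1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
    )
```

Knowing roughly what final value you saw would help us tell whether our run converged comparably.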