Hi,
Thank you for your work.
Currently, our team is trying to replicate allenai/Olmo-3-7B-Instruct-SFT using the SFT script for our research project. However, we have been unable to reproduce the results of Olmo-3-7B-Instruct-SFT on olmes. Specifically, we trained an SFT model by applying your script to allenai/Olmo-3-1025-7B, but our model's accuracy on ZebraLogic is lower than the number in the technical report.
The dataset we used to replicate the SFT process is Dolci-Instruct-SFT. However, we noticed that the command in the SFT README trains on a mixture that includes other datasets as well.
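In case it helps pinpoint the discrepancy, below is roughly how we assembled our training data; a minimal sketch using Hugging Face datasets, where the dataset ids are our own guesses (we only used the Dolci-Instruct-SFT component, not the full README mixture):

```python
from datasets import load_dataset, concatenate_datasets

# NOTE: the dataset ids here are assumptions on our side; we trained on
# Dolci-Instruct-SFT alone, while the README command mixes in more sources.
parts = [
    load_dataset("allenai/dolci-instruct-sft", split="train"),
    # load_dataset("<other-mixture-component>", split="train"),  # not used by us
]
sft_data = concatenate_datasets(parts)
print(len(sft_data), sft_data.column_names)
```

If the extra mixture components matter for the ZebraLogic number, that alone could explain the gap.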
Right now, we are unable to use your Beaker software, so we modified the code slightly to bypass the Beaker requirements (a sketch of the kind of change we made follows the command below). Here is the command we used to train our model:
```bash
PYTHONPATH=src torchrun \
  --standalone \
  --nnodes=1 \
  --nproc_per_node=4 \
  src/scripts/train/sft/Olmo-3-7B-SFT.py train \
    my-olmo3-7b-instruct-only-sft-full \
    "$BASE_CORE/model_and_optim" \
    ai2/jupiter \
    --dataset_path "$SFT_DATA" \
    --seq_len=32768 \
    --global_batch_size=131072 \
    --num_nodes=1 \
    --gpus_per_node=4
```
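One detail we want to confirm about this command: if --global_batch_size is counted in tokens, then 131072 / 32768 = 4 packed sequences per optimizer step, i.e. one per GPU on our 4-GPU node; please correct us if the unit is sequences instead.

For reference, our Beaker bypass amounted to stubbing out the Beaker-dependent lookups so a plain torchrun launch works. The snippet below is only a generic sketch of that pattern; the module and function names are placeholders, not olmo-core's actual API:

```python
# Generic sketch of our Beaker bypass; names are placeholders, not the
# real olmo-core internals we patched.
try:
    from beaker import Beaker  # only usable with an Ai2 Beaker account
    HAVE_BEAKER = True
except ImportError:
    HAVE_BEAKER = False

def resolve_cluster(cluster_name: str):
    """Return Beaker cluster metadata, or None for local launches."""
    if not HAVE_BEAKER:
        # Local torchrun run: skip the Beaker lookup and rely on the CLI
        # flags (--num_nodes, --gpus_per_node) instead.
        return None
    return Beaker.from_env().cluster.get(cluster_name)
```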
We have the following questions:
- Is there a way to run SFT without using Beaker, since we do not have a Beaker account? If not, how can we create one?
- Which commands and settings did you use to fine-tune the base model to produce Olmo-3-7B-Instruct-SFT?
- How did you evaluate the models? Did you use olmes or the submit_eval_jobs.sh script in open-instruct?
- What was the SFT loss of the Olmo-3-7B-Instruct-SFT model? (See the sketch below for the loss we tracked on our end.)
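To clarify the last question, this is the quantity we tracked on our side: standard next-token cross-entropy averaged over completion tokens only, as in the minimal sketch below (our own illustration, not olmo-core's code):

```python
import torch
import torch.nn.functional as F

def sft_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Mean next-token cross-entropy over completion tokens only.

    logits: (batch, seq, vocab); labels: (batch, seq), with prompt and
    padding positions set to -100 so cross_entropy ignores them.
    """
    # Shift so that the logits at position t predict the token at t+1.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
    )
```

Knowing roughly what final value you saw would help us tell whether our run converged comparably.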