Replicating the SFT process of Olmo-3-7B-Instruct-SFT #644

@nkn002

Description

Hi,

Thank you for the work.

Currently, our team is trying to replicate allenai/Olmo-3-7B-Instruct-SFT using the SFT script for our research project. However, we were unable to replicate the results of Olmo-3-7B-Instruct-SFT on olmes. Specifically, we were able to train an SFT model by applying your script to allenai/Olmo-3-1025-7B, but our model's accuracy on ZebraLogic is lower than the number reported in the technical report.

The dataset we used to replicate the SFT process is Dolci-Instruct-SFT. However, we noticed that in the command in the SFT README, the training data includes a mix of other datasets.
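For reference, this is roughly how we would build such a mixture with the Hugging Face datasets library if we knew the full list of components. This is only a sketch: the extra dataset ID is a placeholder we made up, and the Dolci-Instruct-SFT ID is our guess at the hub name, since we do not know the actual mixture from the README.

from datasets import concatenate_datasets, load_dataset

# Load the released SFT data plus any additional mixture components.
# NOTE: "allenai/other-sft-component" is a hypothetical placeholder; we
# do not know which extra datasets the README's mixture actually uses,
# and "allenai/Dolci-Instruct-SFT" is our guess at the hub ID.
dolci = load_dataset("allenai/Dolci-Instruct-SFT", split="train")
extra = load_dataset("allenai/other-sft-component", split="train")

# Concatenate and shuffle so examples from each source are interleaved.
mixed = concatenate_datasets([dolci, extra]).shuffle(seed=42)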

Right now, we are unable to use your Beaker software, so we had to modify the code slightly to bypass the Beaker requirements. Here is the command we used to train our model:

PYTHONPATH=src torchrun \
  --standalone \
  --nnodes=1 \
  --nproc_per_node=4 \
  src/scripts/train/sft/Olmo-3-7B-SFT.py train \
  my-olmo3-7b-instruct-only-sft-full \
  "$BASE_CORE/model_and_optim" \
  ai2/jupiter \
  --dataset_path "$SFT_DATA" \
  --seq_len=32768 \
  --global_batch_size=131072 \
  --num_nodes=1 \
  --gpus_per_node=4
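For clarity, here is the batch geometry we believe these flags imply. This assumes global_batch_size is counted in tokens rather than sequences, which is our reading of OLMo-core and may be wrong:

# Sanity check of the batch geometry implied by the flags above.
# Assumption: global_batch_size is measured in tokens (our reading of
# OLMo-core), so one optimizer step covers:
seq_len = 32768
global_batch_size = 131072
gpus = 4
sequences_per_step = global_batch_size // seq_len  # 131072 / 32768 = 4 sequences
sequences_per_gpu = sequences_per_step / gpus      # 1 sequence per GPU per step
print(sequences_per_step, sequences_per_gpu)       # -> 4 1.0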

We have the following questions:

  1. Is there a way to conduct SFT without using Beaker, as we do not have a Beaker account? If not, how can we create one?
  2. Which commands and settings did you use to finetune the base model to get the SFT version for Olmo-3-7B-Instruct-SFT?
  3. How did you evaluate the models? Did you use olmes or the submit_eval_jobs.sh script in open-instruct?
  4. What was the SFT loss of the Olmo-3-7B-Instruct-SFT model?
