Differences and exact configs of allenai/OLMo-1B-0724-hf vs allenai/OLMo-1B-hf #901

@AntonioLopardo

Description

❓ The question

Hi! I'm trying to understand the differences between these two models. Looking at the HF configs and model pages, I found the following differences (a quick config diff is sketched after the list):

  • Training process: 1 stage vs. 2 stages, and different Dolma versions (v1_5 vs. v1_7)
  • Training context length: 2048 vs. 4096
  • Embedding weight tying: tied vs. untied
  • clip_qkv: null vs. 8.0
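
For context, this is the minimal sketch I've been using to diff the two HF configs (I believe transformers >= 4.40 has native OLMo support, so no trust_remote_code should be needed):

```python
from transformers import AutoConfig

old = AutoConfig.from_pretrained("allenai/OLMo-1B-hf").to_dict()
new = AutoConfig.from_pretrained("allenai/OLMo-1B-0724-hf").to_dict()

# Print every key whose value differs between the two configs.
for key in sorted(old.keys() | new.keys()):
    if old.get(key) != new.get(key):
        print(f"{key}: {old.get(key)!r} -> {new.get(key)!r}")
```

This should surface at least max_position_embeddings, tie_word_embeddings, and clip_qkv, which map to the differences listed above.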

I also looked at the config at configs/official-0724/OLMo-1B.yaml, but I can't figure out whether it is supposed to be the 0724 version or the original one. Are there training runs/exact configs for these exact models? (A heuristic check based on the config's context length is sketched below.)
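
One heuristic I tried, in case it helps: assuming the YAML follows the usual OLMo training-config layout (a model: section with a max_sequence_length key), the context length alone should disambiguate the two versions:

```python
import yaml

with open("configs/official-0724/OLMo-1B.yaml") as f:
    cfg = yaml.safe_load(f)

# 2048 would match allenai/OLMo-1B-hf; 4096 would match allenai/OLMo-1B-0724-hf.
print(cfg["model"]["max_sequence_length"])
```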

Thanks for any info!
