❓ The question
Hi! I'm trying to understand the differences between these two models. Looking at the HF configs and model pages, I found the following differences:
- Training process: 1 stage vs. 2 stages, and different Dolma versions, v1_5 vs. v1_7
- Training context length: 2048 vs. 4096
- Embedding weight tying: tied vs. untied
- clip_qkv: null vs. 8.0
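For reference, this is roughly how I compared the configs (a sketch; the model ids in the comment are my assumption about which two checkpoints are being compared, and the helper below just diffs plain config dicts):

```python
def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}

# Example with the fields mentioned above (values taken from the two HF configs):
old = {"max_position_embeddings": 2048, "tie_word_embeddings": True, "clip_qkv": None}
new = {"max_position_embeddings": 4096, "tie_word_embeddings": False, "clip_qkv": 8.0}
print(diff_configs(old, new))

# With transformers installed, the real configs could be loaded with e.g.
# AutoConfig.from_pretrained(model_id).to_dict() for each model and diffed
# the same way.
```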
I also looked at the config at configs/official-0724/OLMo-1B.yaml, but I can't figure out whether it corresponds to the 0724 version or the original one. Are there training runs/exact configs available for these exact models?
Thanks for any info!