Is context length extension training possible with Axolotl? #3306
This may be a silly question, but is context length extension training possible with Axolotl? I would like to extend the maximum supported context length for a model (OLMo 2 32B) that uses RoPE, ideally using QLoRA. Currently the max context length is only 4,096 tokens, but I would like to fine-tune it on documents consisting of 32K tokens.
Hello, you can indeed change RoPE by overriding the model config. Say you're using this model: https://huggingface.co/allenai/OLMo-2-0325-32B-SFT/blob/main/config.json

You can set `rope_scaling` similar to the Olmo 3 32B model: https://huggingface.co/allenai/Olmo-3-32B-Think/blob/main/config.json#L79-L94

```yaml
overrides_of_model_config:
  rope_scaling:
    type: ${ROPE_SCALING_TYPE}
    factor: ${ROPE_SCALING_FACTOR}
  rope_theta: ${ROPE_THETA}

sequence_len: ${NEW_SEQ}
```

Make sure that the OLMo 2 implementation in transformers supports the RoPE scaling config too.

Curious question: why not start from Olmo 3, which already has its context expanded?