Experiments

This folder contains two post-training codebases adapted for our experiments, as described in the DenseMixer blog post.

open-instruct

open-instruct provides the code and instructions for reproducing the experiments on OLMoE-1B-7B. It covers training on 6 datasets and evaluation on 7 tasks, both listed in the open-instruct directory.

llama-factory

llama-factory contains the code and instructions for reproducing the experiments on Qwen1.5-MoE-A2.7B and Qwen3-30B-A3B-Base.

  • For Qwen1.5-MoE-A2.7B, we train on 6 datasets and evaluate on 7 tasks, as detailed in the llama-factory directory.

  • For Qwen3-30B-A3B-Base, we train on two datasets: s1 (math reasoning) and nemotron-code (coding reasoning). We evaluate on challenging math and coding benchmarks that require reasoning.