Training Reasoning Sub-Skills in LLMs with Synthetic Data

Requirements

To install requirements:

conda env create -f environment.yml

conda activate reasoning-decomp

To run the scripts, you will also need to set your Hugging Face token and your WANDB key, which is used for monitoring and recording results:

export HF_TOKEN=<your token>

export WANDB_KEY=<your key>

To run the training scripts and save the adapters, set the base directory where each adapter will be saved as an environment variable:

export BASE_DIR=<your local base dir>

If you want to run the Sudoku or Countdown scripts, download the Sudoku and Countdown datasets locally, then set their local paths as environment variables:

export SUDOKU_PATH=<your local path>

export COUNTDOWN_PATH=<your local path>
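
Because a missing variable usually surfaces only once a script is already running, it can help to check the environment first. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_vars` is hypothetical, and the split into "required" and "dataset" variables is an assumption based on the descriptions above.

```python
import os

# Variable names come from this README. Which scripts actually read which
# variable is an assumption of this sketch, not taken from the repository.
REQUIRED_VARS = ["HF_TOKEN", "WANDB_KEY", "BASE_DIR"]
DATASET_VARS = ["SUDOKU_PATH", "COUNTDOWN_PATH"]


def missing_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(REQUIRED_VARS + DATASET_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All environment variables are set.")
```

If you only work with the Zebra dataset, the two dataset variables may not be needed, so a missing `SUDOKU_PATH` or `COUNTDOWN_PATH` is not necessarily an error.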

Training

To train the LoRA adapters, we provide a script for each dataset:

python scripts/train_baseline_zebra.py

python scripts/train_baseline_sudoku.py

python scripts/train_countdown_sos.py

For each script, you can specify the size of the base model with the flag --base_model=<1B or 3B>. For example:

python scripts/train_baseline_zebra.py --base_model=1B

will fine-tune an adapter on top of Llama-3.2-1B-Instruct.
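
For readers wondering how a flag like this might map to a model name, here is a minimal sketch using `argparse`. It is not the repository's code. Only the 1B name (Llama-3.2-1B-Instruct) is stated in this README; the 3B name, the `meta-llama/` prefix, the default value, and the helper names `parse_args` and `resolve_model` are all assumptions made for illustration.

```python
import argparse

# Only the 1B model name is confirmed by the README. The 3B identifier and
# the "meta-llama/" prefix are assumptions of this sketch.
BASE_MODELS = {
    "1B": "meta-llama/Llama-3.2-1B-Instruct",
    "3B": "meta-llama/Llama-3.2-3B-Instruct",
}


def parse_args(argv=None):
    """Parse a --base_model flag restricted to the two documented sizes."""
    parser = argparse.ArgumentParser(description="Sketch of the --base_model flag")
    parser.add_argument(
        "--base_model",
        choices=sorted(BASE_MODELS),
        default="1B",  # hypothetical default; the real default is not documented here
        help="Size of the base model (1B or 3B)",
    )
    return parser.parse_args(argv)


def resolve_model(size):
    """Map a size label such as '1B' to a full model identifier."""
    return BASE_MODELS[size]
```

Restricting the flag with `choices` means a typo such as `--base_model=2B` fails at argument parsing rather than later, when the model is loaded.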

Evaluation

To evaluate the base model and the trained adapters, we provide the following scripts:

python scripts/eval_base_zebra.py

python scripts/eval_base_sudoku.py

python scripts/eval_adapter_zebra.py <adapter_dir>

python scripts/eval_adapter_sudoku.py <adapter_dir>

python scripts/eval_sos_zebra.py <adapter_dir>

python scripts/eval_sos_sudoku.py <adapter_dir>

For the adapter and sos adapter scripts, you must pass the path to an adapter trained with the scripts above as a positional argument.

For each eval script, you can specify the size of the base model with the flag --base_model=<1B or 3B> and the number of few-shot examples with the flag --few_shot=<int>. The default is zero-shot. For example:

python scripts/eval_base_zebra.py --few_shot=3 --base_model=1B

will evaluate Llama-3.2-1B-Instruct on the Zebra test set with 3 few-shot examples.
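
To make the zero-shot versus few-shot distinction concrete, here is a minimal sketch of how a few-shot prompt could be assembled. The function name `build_prompt` and the "Puzzle:/Answer:" template are hypothetical; the prompt format the eval scripts actually use is not described in this README.

```python
def build_prompt(question, examples, few_shot=0):
    """Prepend the first `few_shot` solved examples to the question.

    `examples` is a list of (puzzle, answer) pairs. With few_shot=0 the
    prompt contains only the question, i.e. the zero-shot setting.
    """
    if few_shot < 0:
        raise ValueError("few_shot must be non-negative")
    # Hypothetical template: one "Puzzle/Answer" block per solved example.
    blocks = [f"Puzzle: {q}\nAnswer: {a}" for q, a in examples[:few_shot]]
    # The final block leaves the answer open for the model to complete.
    blocks.append(f"Puzzle: {question}\nAnswer:")
    return "\n\n".join(blocks)
```

With `few_shot=3`, as in the command above, the first three solved examples would precede the test puzzle in the prompt.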

Data

The Zebra dataset is downloaded directly from Hugging Face.
The Sudoku dataset can be downloaded from Kaggle.

To generate the Countdown Dataset:

Prerequisites

Clone the Repository:

git clone https://github.com/kanishkg/stream-of-search.git
cd stream-of-search

Set Up the Environment:

Install Miniconda (if not already installed):

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Create and activate a Conda environment:

conda create -n sos python=3.11
conda activate sos

Install required packages:
```
pip install -r requirements.txt
```

Generating the Countdown Dataset:
- Run the script to generate the dataset:
```
sh scripts/gen_task.sh
```
- This generates the Countdown dataset.
- Save the output as countdown.json in the appropriate output directory.
- The file contains structured examples for the Countdown problem. Each entry includes the target number, the available numbers, the optimal solution path, the whole search trajectory, and a rating field.
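
After generating the file, a quick sanity check is to look at how many entries it holds and which fields each entry carries. The sketch below assumes that countdown.json is a JSON list of objects. That assumption and the helper names `summarize_entries` and `load_entries` are not taken from the stream-of-search repository. No field names are assumed, because the exact keys are not listed here.

```python
import json
import os


def summarize_entries(entries):
    """Return (number of entries, sorted field names of the first entry)."""
    if not isinstance(entries, list):
        raise ValueError("expected a JSON list of entries")
    first = entries[0] if entries else {}
    keys = sorted(first.keys()) if isinstance(first, dict) else []
    return len(entries), keys


def load_entries(path):
    """Load the generated dataset from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    path = "countdown.json"  # adjust to wherever you saved the file
    if os.path.exists(path):
        count, keys = summarize_entries(load_entries(path))
        print(f"{count} entries with fields: {keys}")
    else:
        print(f"{path} not found")
```

The printed field names should correspond to the items described above (target number, available numbers, solution path, search trajectory, rating), whatever the keys are actually called.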
