Last Update: 2025-08-12
Goal: find a training / inference framework that uses the optimal number of loops for problems with various levels of difficulty.
```
pip install torch numpy transformers datasets tiktoken wandb tqdm
```
Dependencies:
- pytorch <3
- numpy <3
- `transformers` for huggingface transformers <3 (to load GPT-2 checkpoints)
- `datasets` for huggingface datasets <3 (if you want to download + preprocess OpenWebText)
- `tiktoken` for OpenAI's fast BPE code <3
- `wandb` for optional logging <3
- `tqdm` for progress bars <3
In this task we define a p-hop induction task with three difficulty levels, summarized in the table below. The repository layout is:
```
.
├── LICENSE
├── README.md
└── data
    └── phop
```
| Level | p | vocab_size | seq_len | num_loops (l) |
|---|---|---|---|---|
| 1 | 16 | 4 | 256 | 3 |
| 2 | 32 | 8 | 512 | 6 |
| 3 | 64 | 16 | 1024 | 12 |
First, randomly sample and generate the dataset in .txt format for all three difficulty levels. Each level contains train, test, and eval data; the eval data contains 100 p-hop steps.
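As a concrete illustration, here is a minimal sketch of how such a dataset could be generated. The p-hop rule used below (repeatedly jump to the position right after the most recent earlier occurrence of the current token), the `data/phop/level{N}/{split}.txt` layout, and the space-separated line format are assumptions for illustration, not the repo's confirmed format; the sample counts are kept tiny here (the real training set uses 4M samples per level).

```python
# Sketch only: generate p-hop data as .txt files (one sequence per line,
# final token on each line is the p-hop answer). Paths and format are assumed.
import os
import random

LEVELS = {
    1: dict(p=16, vocab_size=4,  seq_len=256,  num_loops=3),
    2: dict(p=32, vocab_size=8,  seq_len=512,  num_loops=6),
    3: dict(p=64, vocab_size=16, seq_len=1024, num_loops=12),
}

def phop_target(seq, p):
    """Follow p hops: from the last token, jump to the position right after
    the most recent earlier occurrence of the current token, p times."""
    i = len(seq) - 1
    for _ in range(p):
        j = max((k for k in range(i) if seq[k] == seq[i]), default=None)
        if j is None:
            return None  # hop chain breaks; caller should resample
        i = j + 1
    return seq[i]

def make_split(path, n_samples, p, vocab_size, seq_len, seed=0):
    rng = random.Random(seed)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        written = 0
        while written < n_samples:
            seq = [rng.randrange(vocab_size) for _ in range(seq_len - 1)]
            tgt = phop_target(seq, p)
            if tgt is None:
                continue  # resample sequences where the hop is undefined
            f.write(" ".join(map(str, seq + [tgt])) + "\n")
            written += 1

if __name__ == "__main__":
    for level, cfg in LEVELS.items():
        for split, n in [("train", 1000), ("test", 100), ("eval", 100)]:
            make_split(f"data/phop/level{level}/{split}.txt", n,
                       cfg["p"], cfg["vocab_size"], cfg["seq_len"], seed=level)
```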
For each level, I create 4M training samples. I then train a looped transformer on that data using different numbers of loops. The model's BLOCK_SIZE is set to at least 1024 for every level, so it can handle sequences up to 1024 tokens. I then evaluate the model on the eval data with different numbers of loops for each level of difficulty.
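Below is a minimal sketch of what the looped transformer forward pass could look like, assuming the common weight-tied formulation (a single decoder block reused `n_loops` times). The class and argument names (`LoopedTransformer`, `n_loops`, etc.) are illustrative, not this repo's actual implementation.

```python
# Sketch of a looped (weight-tied) decoder-only transformer.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model, n_head):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        # causal self-attention: True entries in the mask are disallowed
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + a
        x = x + self.mlp(self.ln2(x))
        return x

class LoopedTransformer(nn.Module):
    def __init__(self, vocab_size, block_size=1024, d_model=256, n_head=4):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        self.block = Block(d_model, n_head)   # ONE block, reused every loop
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, idx, n_loops):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for _ in range(n_loops):              # same weights applied n_loops times
            x = self.block(x)
        return self.head(self.ln_f(x))
```

Because the loop count is just a forward-pass argument, evaluation can use a different value than training, e.g. `model(x, n_loops=6)` on a model trained with 3 loops.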
Without curriculum learning, the model is trained on mixed p-hop sequences from all difficulty levels, and the number of training loops is randomized.
The resulting validation loss is poor.
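For reference, a sketch of this no-curriculum regime might look as follows. The per-example mixing of levels within a batch, the dedicated `PAD_ID`, and the loop-count range are assumptions, not confirmed details of the actual setup.

```python
# Sketch of no-curriculum training: batches mix difficulty levels and the
# loop count is drawn at random per step. All names here are illustrative.
import random
import torch
import torch.nn.functional as F

BLOCK_SIZE = 1024
PAD_ID = 16        # assumed dedicated pad id outside the token vocabulary
MAX_LOOPS = 12

def mixed_batch(datasets, batch_size):
    """datasets: {level: list of (seq, target)} with level-specific lengths."""
    xs, ys = [], []
    for _ in range(batch_size):
        level = random.choice(list(datasets))            # mix levels freely
        seq, tgt = random.choice(datasets[level])
        pad = [PAD_ID] * (BLOCK_SIZE - len(seq))
        xs.append(pad + seq)                             # left-pad so the answer stays last
        ys.append(tgt)
    return torch.tensor(xs), torch.tensor(ys)

def train_step(model, optimizer, datasets, batch_size=64):
    x, y = mixed_batch(datasets, batch_size)
    n_loops = random.randint(1, MAX_LOOPS)               # randomized loop count
    logits = model(x, n_loops=n_loops)
    loss = F.cross_entropy(logits[:, -1, :], y)          # predict the hop target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```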

With curriculum learning, the model first learns on simpler p-hop sequences using fewer loops. Within one batch, all problems are drawn from the same difficulty level; the difficulty levels are pre-defined by a human.
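A minimal sketch of such a curriculum sampler is shown below, assuming a simple promotion rule (a fixed step budget per level); the rule and the function names are assumptions for illustration.

```python
# Sketch of curriculum sampling: easy -> hard levels, one level per batch,
# with each level's prescribed loop count.
import random
import torch

LEVELS = {1: dict(seq_len=256,  num_loops=3),
          2: dict(seq_len=512,  num_loops=6),
          3: dict(seq_len=1024, num_loops=12)}

def curriculum_batches(datasets, batch_size, steps_per_level):
    """Yield (x, y, n_loops) batches, moving from easy to hard levels."""
    for level in sorted(LEVELS):                          # easy -> hard
        cfg = LEVELS[level]
        for _ in range(steps_per_level):
            samples = random.sample(datasets[level], batch_size)  # one level per batch
            x = torch.tensor([seq for seq, _ in samples])
            y = torch.tensor([tgt for _, tgt in samples])
            yield x, y, cfg["num_loops"]                  # fewer loops on easier levels
```

Each yielded batch can be fed to the same training step as in the no-curriculum sketch, but with the level's fixed `n_loops` instead of a random one; since all sequences in a batch come from one level, they share a length and need no padding.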