This repo aims to teach LLMs to implement common mathematical algorithms using synthetic examples.
It focuses on algorithms that use O(n) space, where n is the input size, since the LLM can then run the entire algorithm within a context window that grows at most linearly: if algorithm M uses O(n) space, then the LLM only needs a context window of size cn (where c is a constant, possibly large but hopefully small) in order to run M.
Multiplication is one such algorithm -- my hypothesis is that arbitrary multiplication should actually be easy for LLMs, as long as they are given the right scaffolding.
The first algorithm I want to implement is multiplication via polynomial convolution. The dataset will be generated in Python.
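As a rough illustration of what multiplication via polynomial convolution could look like, here is a minimal sketch. It treats the decimal digits of each number as polynomial coefficients, convolves them, then propagates carries. The function name `multiply_via_convolution` and the restriction to non-negative base-10 integers are assumptions made for this example, not the repo's actual dataset generator.

```python
def multiply_via_convolution(a: int, b: int) -> int:
    """Multiply two non-negative integers by convolving their decimal digits.

    Hypothetical helper for illustration only.
    """
    # Digits as polynomial coefficients, least significant digit first.
    xs = [int(d) for d in reversed(str(a))]
    ys = [int(d) for d in reversed(str(b))]

    # Polynomial convolution: coeffs[k] = sum of xs[i] * ys[j] over i + j == k.
    coeffs = [0] * (len(xs) + len(ys) - 1)
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            coeffs[i + j] += x * y

    # Carry propagation turns the raw coefficients back into base-10 digits.
    digits = []
    carry = 0
    for c in coeffs:
        carry, digit = divmod(c + carry, 10)
        digits.append(digit)
    while carry:
        carry, digit = divmod(carry, 10)
        digits.append(digit)

    # Digits were built least significant first, so reverse before parsing.
    return int("".join(str(d) for d in reversed(digits)))


if __name__ == "__main__":
    print(multiply_via_convolution(123, 456))
```

The intermediate `coeffs` list and the carry steps are the kind of state a synthetic reasoning chain could write out explicitly, one line per step.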
After that, I want to generate synthetic reasoning chains for other O(n) algorithms; the hope is that an LLM that learns enough of these will become a strong mathematical reasoner.
Another hypothesis that I want to test is that LLMs benefit from a least-significant-digit-first (LSD-first) representation of numbers, which relates to the left-to-right (LtR) nature of their attention.
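To make the LSD-first idea concrete, here is a small sketch using addition as a stand-in example. One plausible reading of the hypothesis (an assumption here, not something tested yet) is that with the least significant digit written first, each output digit depends only on digits and carries that appear earlier in the sequence, so the result can be emitted in a single left-to-right pass. The helper names `to_lsd_first`, `from_lsd_first`, and `add_lsd_first` are hypothetical.

```python
def to_lsd_first(n: int) -> str:
    """Write a non-negative integer with its least significant digit first."""
    return str(n)[::-1]


def from_lsd_first(s: str) -> int:
    """Parse an LSD-first digit string back into an integer."""
    return int(s[::-1])


def add_lsd_first(a: str, b: str) -> str:
    """Add two LSD-first digit strings in one left-to-right pass."""
    out = []
    carry = 0
    for i in range(max(len(a), len(b))):
        da = int(a[i]) if i < len(a) else 0
        db = int(b[i]) if i < len(b) else 0
        # Each output digit needs only the current digits and the running carry.
        carry, digit = divmod(da + db + carry, 10)
        out.append(str(digit))
    if carry:
        out.append(str(carry))
    return "".join(out)


if __name__ == "__main__":
    total = add_lsd_first(to_lsd_first(999), to_lsd_first(1))
    print(total, from_lsd_first(total))
```

In the usual most-significant-digit-first order, the first output digit of 999 + 1 depends on a carry that originates at the last input digit, which is the situation the LSD-first representation is meant to avoid.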