Uses reinforcement learning
to encourage roneneldan/TinyStories-33M
to generate stories with alliteration.
Docs are here
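To make "stories with alliteration" concrete, here is a minimal sketch of one way such a reward could be scored. It is an illustration only: the function name alliteration_reward and the scoring rule (the share of words that start with the most common initial letter) are assumptions, not necessarily what this repo's training code does.

```python
import re
from collections import Counter


def alliteration_reward(text: str) -> float:
    """Hypothetical reward: share of words starting with the most common initial letter."""
    # Lowercase the text and keep only words that start with a letter.
    words = re.findall(r"[a-z][a-z']*", text.lower())
    if not words:
        return 0.0
    # Count how often each initial letter occurs.
    initials = Counter(word[0] for word in words)
    most_common_count = initials.most_common(1)[0][1]
    return most_common_count / len(words)


if __name__ == "__main__":
    print(alliteration_reward("Peter Piper picked peppers"))
    print(alliteration_reward("The cat sat on a mat"))
```

A score like this is cheap to compute per generated story, which is what an RL loop needs, but it is also easy to game (for example by repeating one word), which is one reason a KL penalty is usually paired with it.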
If you install uv, it'll get the dependencies.
Backup plan: ./build.sh and ./run.sh will build and run a Docker container
that has uv, in case your system is weird (like my NixOS laptop)
and doesn't work with uv.
Once you're in the container,
you can run the commands in the following sections.
uv run src/tiny_stories_rl/train.py
The KL penalty coefficient is configurable via --kl-coefficient;
see here for more.
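For readers unfamiliar with what a KL penalty coefficient does, here is a rough sketch of a common formulation in RL fine-tuning of language models. The task reward is reduced by the coefficient times an estimate of how far the trained policy has drifted from a frozen reference model. The function name, the per-token log-probability inputs, and the choice of estimator are all assumptions for illustration; the actual train.py may compute this differently.

```python
from typing import Sequence


def kl_penalized_reward(
    reward: float,
    policy_logprobs: Sequence[float],
    reference_logprobs: Sequence[float],
    kl_coefficient: float,
) -> float:
    """Hypothetical helper: subtract a scaled KL estimate from the task reward."""
    if len(policy_logprobs) != len(reference_logprobs):
        raise ValueError("log-probability sequences must have the same length")
    # Sample-based KL estimate: sum over generated tokens of
    # log p_policy(token) - log p_reference(token).
    kl_estimate = sum(p - r for p, r in zip(policy_logprobs, reference_logprobs))
    return reward - kl_coefficient * kl_estimate


if __name__ == "__main__":
    # The policy assigns higher probability to its own tokens than the reference does,
    # so the estimate is positive and the reward is reduced.
    print(kl_penalized_reward(1.0, [-1.0, -2.0], [-1.5, -2.5], kl_coefficient=0.1))
```

Under this formulation, a larger coefficient keeps the model closer to the reference model's behaviour, while a smaller one lets it chase the alliteration reward more aggressively at the risk of degenerate text.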
uv run pytest tests