- Can we support distributed consumer-grade GPUs?
- https://github.com/EleutherAI/gpt-neox?tab=readme-ov-file#advanced-custom-launching
- https://app.primeintellect.ai/intelligence could allow for distributed training
- Custom data loading
  - https://github.com/EleutherAI/gpt-neox?tab=readme-ov-file#using-custom-data
- Should we be using GPTNeoXTokenizerFast?
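If our tokenization matches NeoX's, loading the fast tokenizer from an existing checkpoint is one way to test it. A minimal sketch, assuming `transformers` is installed and `EleutherAI/gpt-neox-20b` as an example repo id (guarded so it degrades gracefully offline):

```python
from transformers import GPTNeoXTokenizerFast

try:
    # "EleutherAI/gpt-neox-20b" is an assumed example checkpoint.
    tokenizer = GPTNeoXTokenizerFast.from_pretrained("EleutherAI/gpt-neox-20b")
    ids = tokenizer("To be, or not to be")["input_ids"]
    print(tokenizer.decode(ids))
except Exception:
    # No network or cache available; the class itself is still importable.
    tokenizer = None
```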
- Possible next UpWork task
- Review of experiment setup
- Can TPU v2-8 demonstrate multi-GPU setups in Colab?
- Run experiment in parallel
  - Separate processes in Colab, with different directories for results, etc.
- Reduce startup time in Colab for experiments
- A Colab-compatible version of gpt-neox would be great
- Free TPU for research https://sites.research.google/trc/about/
- Get running on TPU and benchmark
- Shakespeare HF model should also save the tokenizer
  - Unclear how to do that because we're not using an HF tokenizer - maybe we can?
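If the custom tokenizer is (or can be rebuilt as) a `tokenizers`-library tokenizer, wrapping it in `PreTrainedTokenizerFast` would give us `save_pretrained` for free. A sketch, assuming `tokenizers` and `transformers` are installed; the tiny word-level vocab and the `shakespeare_tokenizer` output dir are stand-ins:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from transformers import PreTrainedTokenizerFast

# Stand-in for the project's custom tokenizer: a tiny word-level vocab.
vocab = {"[UNK]": 0, "to": 1, "be": 2, "or": 3, "not": 4}
raw = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
raw.pre_tokenizer = Whitespace()

# Wrapping it makes it a regular HF tokenizer, so it can be saved
# next to the model with save_pretrained.
hf_tok = PreTrainedTokenizerFast(tokenizer_object=raw, unk_token="[UNK]")
hf_tok.save_pretrained("shakespeare_tokenizer")  # assumed output dir
```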
- Get training running with CPU (for offline dev)
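A device-agnostic setup that falls back to CPU would let the same script run offline. A minimal sketch assuming PyTorch (the toy model and dummy step are placeholders):

```python
import torch

# Fall back to CPU when no GPU is available (e.g. an offline dev laptop).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(8, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One dummy training step to confirm the loop runs on the chosen device.
x = torch.randn(4, 8, device=device)
y = torch.randint(0, 2, (4,), device=device)
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
print(device.type)
```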
- Move from virtual env to pdm