ideas.md
  • Can we support distributed consumer-grade GPUs?
  • Custom data loading
    • https://github.com/EleutherAI/gpt-neox?tab=readme-ov-file#using-custom-data
  • Should we be using GPTNeoXTokenizerFast?
  • Possible next UpWork task
    • Review of experiment setup
    • Can TPU v2-8 demonstrate multi-GPU setups in Colab?
    • Run experiment in parallel
      • Separate processes in Colab, different dirs for results, etc.
    • Reduce startup time in Colab for experiments
    • A Colab-compatible version of gpt-neox would be great
    • Free TPU for research https://sites.research.google/trc/about/
    • Get running on TPU and benchmark
    • Shakespeare HF model should also save the tokenizer
      • Unclear how to do that because we are not using an HF tokenizer - maybe we can?
  • Get training running with CPU (for offline dev)
    • Move from virtualenv to pdm
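The "run experiments in parallel" idea above could be sketched with separate processes, each writing to its own results directory so outputs never collide. This is a minimal sketch: `run_experiment`, the experiment names, and the placeholder metrics are hypothetical stand-ins, not code from this repo.

```python
# Sketch: launch experiments as separate processes, one results dir per run.
# `run_experiment` and the metrics written here are placeholders.
import json
import multiprocessing as mp
import tempfile
from pathlib import Path


def run_experiment(name: str, out_root: str) -> None:
    # Each process only ever writes inside its own subdirectory,
    # so parallel runs cannot clobber each other's results.
    out_dir = Path(out_root) / name
    out_dir.mkdir(parents=True, exist_ok=True)
    result = {"experiment": name, "loss": 0.0}  # placeholder metrics
    (out_dir / "results.json").write_text(json.dumps(result))


if __name__ == "__main__":
    out_root = tempfile.mkdtemp()  # stand-in for a Colab results folder
    procs = [
        mp.Process(target=run_experiment, args=(name, out_root))
        for name in ("exp_a", "exp_b")
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Each run ends up with <out_root>/<name>/results.json
    print(sorted(d.name for d in Path(out_root).iterdir()))
```

In a Colab notebook the same pattern works with `%run` or `subprocess` per experiment, as long as each invocation is handed a distinct output directory.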
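For the CPU-only (offline dev) idea, one common approach is to hide all GPUs via `CUDA_VISIBLE_DEVICES` before the ML framework is imported, so the same training script runs unmodified on a machine without CUDA. This is a general sketch of that technique, not a change to this repo's scripts.

```python
# Sketch: force CPU-only execution by hiding CUDA devices.
# Must run before the framework (e.g. PyTorch) is imported, since
# CUDA_VISIBLE_DEVICES is read at CUDA initialisation time.
import os

# An empty value means "no visible CUDA devices".
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# import torch  # imported after the env var is set; CUDA is then unavailable
```

The same effect can be had from the shell (`CUDA_VISIBLE_DEVICES="" python train.py`), which avoids touching the training code at all.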