This repository was archived by the owner on Feb 25, 2022. It is now read-only.

Commit 67c4079

Author: sid
Commit message: cleanup + add pretrained models
1 parent: 0299de6

32 files changed: +1026 −1305 lines

GPTNeo_example_notebook.ipynb

Lines changed: 265 additions & 36 deletions (large diff not rendered)

README.md

Lines changed: 13 additions & 6 deletions

````diff
@@ -2,7 +2,7 @@
 
 🎉 1T or bust my dudes 🎉
 
-An implementation of model & data parallel [GPT2](https://openai.com/blog/better-language-models/) & [GPT3](https://arxiv.org/abs/2005.14165)-like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the [mesh-tensorflow](https://github.com/tensorflow/mesh) library.
+An implementation of model & data parallel [GPT2](https://openai.com/blog/better-language-models/) & [GPT3](https://arxiv.org/abs/2005.14165) -like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the [mesh-tensorflow](https://github.com/tensorflow/mesh) library.
 
 Training and inference supported on both TPUs and GPUs.
 
@@ -14,8 +14,19 @@ Also included are alternative model architectures and linear attention implement
 * [Axial Positional embedding](https://arxiv.org/abs/1912.12180)
 * Masked Language Modelling
 
-Pretrained models will be released as they are finished training.
+# Pretrained Models
 
+**21/03/2021:**
+
+We're proud to release two pretrained GPT-Neo models trained on The Pile, the weights and configs can be freely downloaded from [the-eye.eu](https://the-eye.eu/eleuther_staging/gptneo-release/).
+
+1.3B: https://the-eye.eu/eleuther_staging/gptneo-release/GPT3_XL/
+
+2.7B: https://the-eye.eu/eleuther_staging/gptneo-release/GPT3_2-7B/
+
+For more information on how to get these set up, see the colab notebook, or read through the rest of the readme.
+
+This repository will be (mostly) archived as we move focus to our GPU training repo, [GPT-Neox](https://github.com/EleutherAI/gpt-neox/)
 # Setup
 
 ```bash
@@ -44,10 +55,6 @@ You can also choose to train GPTNeo locally on your GPUs. To do so, you can omit
 Google colab provides tpu-v8s for free, which should be enough to finetune our models up to GPT3XL (1.5B parameter) sizes.
 Click the above button to run through our example colab notebook.
 
-# Downloading Pretrained Models
-
-TODO
-
 # Generating Text
 
 Once you have a trained model, or you've downloaded one of our pre-trained models (coming soon), generating text is as simple as running the main.py script with the `--predict` flag on. You can pass a path to your prompt txt file with the `--prompt` flag, like so:
````
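The release links in the README diff above are plain directory listings on the-eye.eu; a minimal sketch of one way to fetch the weights, assuming `wget` is available (the exact checkpoint filenames and directory layout are not given in the diff):

```bash
# Recursively mirror the 2.7B release directory (weights + configs).
# -m mirrors, -np avoids ascending to parent paths, -c resumes partial files.
wget -m -np -c "https://the-eye.eu/eleuther_staging/gptneo-release/GPT3_2-7B/"
```

The 1.3B model can be fetched the same way from its `GPT3_XL` directory.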
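The example command after "like so:" is cut off by the diff context. As a hedged sketch of the invocation shape: `--predict` and `--prompt` come from the README text above, while `--model` and the placeholder names are assumptions, not taken from the diff:

```bash
# Hypothetical invocation: generate text from prompt.txt using a model
# config; <model_config> is an assumed placeholder for a config name.
python3 main.py --predict --prompt prompt.txt --model <model_config>
```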

Deleted dataset configs (Lines changed: 0 additions & 9 deletions each):

* configs/dataset_configs/SmallPileAblation_small_CC100_newinput.json
* configs/dataset_configs/SmallPileAblation_small_CC_raw_newinput.json
* configs/dataset_configs/SmallPileAblation_small_Pile_newinput.json
* configs/dataset_configs/SmallPileAblation_small_owt_newinput.json
* configs/dataset_configs/cc100en_40G_ablation.json
* configs/dataset_configs/cc_raw_40G_ablation.json
* configs/dataset_configs/openwebtext-documents.json
* configs/dataset_configs/owt_40G_ablation.json
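For context on what the removed files contained, here is a representative sketch of a GPT-Neo dataset config; the field names follow the dataset-config schema documented in the repo's README, but the values are illustrative placeholders, not the deleted files' contents:

```bash
# Write a sketch of a dataset config (all values are placeholders).
cat > configs/dataset_configs/example_dataset.json <<'EOF'
{
  "n_vocab": 50257,
  "path": "gs://<your-bucket>/example/*.tfrecords",
  "eval_path": "",
  "tokenizer_is_pretrained": true,
  "tokenizer_path": "gpt2"
}
EOF
```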
