🎉 1T or bust my dudes 🎉
An implementation of model & data parallel [GPT2](https://openai.com/blog/better-language-models/) & [GPT3](https://arxiv.org/abs/2005.14165)-like models, with the ability to scale up to full GPT3 sizes (and possibly more!), using the [mesh-tensorflow](https://github.com/tensorflow/mesh) library.
Training and inference supported on both TPUs and GPUs.
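To give a feel for what the mesh-tensorflow layer buys us, here is a minimal, self-contained sketch (not this repo's model code; the dimension names and sizes are arbitrary) of how named tensor dimensions get mapped onto a mesh of processors — splitting `batch` gives data parallelism, splitting `d_ff` gives model parallelism:

```python
import mesh_tensorflow as mtf
import tensorflow.compat.v1 as tf

# Build a toy feed-forward block with named dimensions.
graph = mtf.Graph()
mesh = mtf.Mesh(graph, "example_mesh")

batch = mtf.Dimension("batch", 8)        # will be split -> data parallelism
d_model = mtf.Dimension("d_model", 256)
d_ff = mtf.Dimension("d_ff", 1024)       # will be split -> model parallelism

x = mtf.import_tf_tensor(mesh, tf.zeros([8, 256]), mtf.Shape([batch, d_model]))
w_in = mtf.get_variable(mesh, "w_in", mtf.Shape([d_model, d_ff]))
w_out = mtf.get_variable(mesh, "w_out", mtf.Shape([d_ff, d_model]))
h = mtf.relu(mtf.einsum([x, w_in], output_shape=mtf.Shape([batch, d_ff])))
y = mtf.einsum([h, w_out], output_shape=mtf.Shape([batch, d_model]))

# Map the logical dimensions onto a 2x2 grid of devices: rows shard the
# batch (data parallel), columns shard the hidden layer (model parallel).
mesh_shape = mtf.convert_to_shape("rows:2;cols:2")
layout_rules = mtf.convert_to_layout_rules("batch:rows;d_ff:cols")
devices = ["/cpu:0"] * 4  # stand-ins; real runs place these on TPU/GPU devices
mesh_impl = mtf.placement_mesh_impl.PlacementMeshImpl(
    mesh_shape, layout_rules, devices)
lowering = mtf.Lowering(graph, {mesh: mesh_impl})
y_tf = lowering.export_to_tf_tensor(y)  # ordinary TF tensor, runnable in a session
```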
Also included are alternative model architectures and linear attention implementations.
Pretrained models will be released as they are finished training.
# Pretrained Models
**21/03/2021:**
We're proud to release two pretrained GPT-Neo models trained on The Pile; the weights and configs can be freely downloaded from [the-eye.eu](https://the-eye.eu/eleuther_staging/gptneo-release/). For more information on how to get these set up, see the Colab notebook, or read through the rest of the readme.
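As a rough sketch, a model directory can be mirrored with standard wget flags (the `GPT3_XL` subdirectory below is illustrative, not a confirmed path; check the index at the link above for the actual directory names):

```bash
# Mirror one model directory; adjust the subdirectory to the model you want.
wget -m -np -c -w 2 -R "index.html*" "https://the-eye.eu/eleuther_staging/gptneo-release/GPT3_XL/"
```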
This repository will be (mostly) archived as we move focus to our GPU training repo, [GPT-NeoX](https://github.com/EleutherAI/gpt-neox/).
# Setup
```bash
git clone https://github.com/EleutherAI/GPTNeo
cd GPTNeo
pip3 install -r requirements.txt
```
You can also choose to train GPTNeo locally on your GPUs. To do so, you can omit the Google Cloud setup steps above and git clone the repo locally; run through the training guide below, then when running main.py, simply omit the `tpu` flag and pass in GPU ids instead.
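For illustration, a local GPU run might then look like the following; the `--gpu_ids` flag and the placeholder values are assumptions to check against main.py's actual flags:

```bash
# Hypothetical local run: config name, checkpoint interval, and GPU ids are placeholders.
python3 main.py --model <your_config_name> --steps_per_checkpoint <n> --gpu_ids <device:GPU:0 device:GPU:1>
```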
Google Colab provides TPU-v8s for free, which should be enough to finetune our models up to GPT3XL (1.5B parameter) sizes.
Click the above button to run through our example Colab notebook.
# Generating Text
Once you have a trained model, or you've downloaded one of our pretrained models, generating text is as simple as running the main.py script with the `--predict` flag. You can pass a path to your prompt txt file with the `--prompt` flag, like so:
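A representative invocation follows; the prompt file, TPU name, and config name are placeholders, and you would omit `--tpu` for a local GPU run:

```bash
# Generate text from a prompt file; replace the placeholders with your own values.
python3 main.py --predict --prompt <example_prompt.txt> --tpu <tpu_name> --model <config_name>
```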