#highlevel #model

https://en.wikipedia.org/wiki/GPT-3

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by [[OpenAI]] in 2020.

Like its predecessor GPT-2, it is a decoder-only[2] transformer deep neural network, which supersedes recurrence- and convolution-based architectures with a technique known as "attention".[3] This attention mechanism allows the model to focus selectively on the segments of input text it predicts to be most relevant.[4] GPT-3 has 175 billion parameters, each stored at 16-bit precision; since each parameter occupies 2 bytes, the weights require 350 GB of storage. It has a context window of 2048 tokens, and has demonstrated strong "zero-shot" and "few-shot" learning abilities on many tasks.[2]
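The storage figure follows directly from the parameter count and precision quoted above; a minimal back-of-the-envelope sketch (using decimal gigabytes, as the figure implies):

```python
# Rough storage estimate for GPT-3's weights, from the figures above:
# 175 billion parameters at 16-bit precision (2 bytes each).
PARAMS = 175_000_000_000   # 175 billion parameters
BYTES_PER_PARAM = 2        # 16-bit precision -> 2 bytes per parameter

total_bytes = PARAMS * BYTES_PER_PARAM
total_gb = total_bytes / 1e9  # decimal GB

print(f"{total_gb:.0f} GB")   # -> 350 GB
```

Note this covers only the raw weights; serving the model needs additional memory for activations and the key-value cache over the 2048-token context window.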

On September 22, 2020, Microsoft announced that it had acquired an exclusive license to GPT-3. Others can still receive output from its public API, but only Microsoft has access to the underlying model.[5]