#highlevel
- BERT (Bidirectional Encoder Representations from Transformers): BERT-Base, BERT-Large
- T5 (Text-to-Text Transfer Transformer): A model that treats all NLP tasks as text-to-text tasks, unifying various NLP problems into a single framework.
- mT5: A multilingual version of T5.
- Switch Transformer: A model that uses sparsely activated expert layers, greatly increasing parameter count while keeping compute per token roughly constant.
- Bidirectional Training: BERT introduced bidirectional training, allowing the model to condition on both left and right context when encoding a token, greatly improving contextual understanding.
- Unified Framework for NLP: T5's text-to-text framework simplified the approach to NLP tasks, enabling a unified model to handle various tasks by simply converting them into a text format.
- Multilingual Models: Google's mT5 and other models expanded LLM capabilities to multiple languages, facilitating global applications.
- Efficient Large-Scale Models: The Switch Transformer introduced a way to scale models to extremely large sizes using sparse activation, making it more computationally efficient.
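The bidirectional training point above comes down to the attention mask: an encoder like BERT lets every token attend to every other token, while a left-to-right decoder masks out future positions. A minimal sketch (illustrative, not BERT's actual code):

```python
# Contrast the attention masks behind bidirectional encoders (BERT-style)
# and causal decoders (GPT-style). mask[i][j] == 1 means position i may
# attend to position j.

def bidirectional_mask(n):
    """Every token sees every other token (BERT-style encoder)."""
    return [[1] * n for _ in range(n)]

def causal_mask(n):
    """Each token sees only itself and earlier tokens (causal decoder)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    for row in bidirectional_mask(4):
        print(row)  # all ones: token 0 already "sees" tokens 1..3
    for row in causal_mask(4):
        print(row)  # lower-triangular: no peeking at future tokens
```

This is why BERT is trained with masked-token prediction rather than next-token prediction: with a bidirectional mask, predicting the next token would be trivial.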
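T5's unified framework above works by prepending a task prefix to the input so that one seq2seq model handles every task as string-to-string. A small sketch, with prefixes in the style described for T5 (the helper function and task names are illustrative):

```python
# Sketch of T5's text-to-text framing: every task is cast as
# "prefixed input text -> output text" for a single seq2seq model.

def to_text_to_text(task, text):
    """Hypothetical helper: attach a T5-style task prefix to raw input."""
    prefixes = {
        "translate_en_de": "translate English to German: ",
        "summarize": "summarize: ",
        "cola": "cola sentence: ",  # grammatical-acceptability classification
    }
    return prefixes[task] + text

if __name__ == "__main__":
    print(to_text_to_text("translate_en_de", "That is good."))
    print(to_text_to_text("cola", "The course is jumping well."))
```

Even classification is handled this way: the model literally generates a label string such as "acceptable" or "not acceptable", so no task-specific output heads are needed.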
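The sparse activation behind the Switch Transformer is top-1 routing: a learned router scores each token against every expert, and only the single highest-scoring expert's parameters run for that token. A toy sketch of the routing decision (illustrative only; the expert functions and scores are made up):

```python
# Top-1 expert routing in the Switch Transformer style: total parameters
# grow with the number of experts, but each token activates only one,
# so compute per token stays roughly constant.

def route_top1(router_logits):
    """Return the index of the single expert chosen for one token."""
    return max(range(len(router_logits)), key=lambda e: router_logits[e])

if __name__ == "__main__":
    experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # toy "experts"
    token = 5
    logits = [0.1, 2.3, -0.4]      # hypothetical router scores for this token
    chosen = route_top1(logits)     # expert 1 wins
    print(experts[chosen](token))   # only that expert's parameters are used
```

In the real model the router is a learned linear layer and an auxiliary load-balancing loss keeps tokens spread across experts; this sketch only shows the argmax dispatch.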
| Model | Developer | Release Date | Availability | Parameter Sizes |
| --- | --- | --- | --- | --- |
| Gemma | Google DeepMind | February 21, 2024 | Open-Source | 2 billion, 7 billion |