#highlevel
- BERT (Bidirectional Encoder Representations from Transformers): BERT-Base, BERT-Large
- T5 (Text-to-Text Transfer Transformer): A model that treats all NLP tasks as text-to-text tasks, unifying various NLP problems into a single framework.
- mT5: A multilingual version of T5.
- Switch Transformer: A model that uses sparsely activated expert layers, greatly increasing parameter count while keeping compute per token roughly constant.
- Bidirectional Training: BERT introduced bidirectional training, allowing the model to condition on both left and right context when encoding a token, greatly improving contextual understanding.
- Unified Framework for NLP: T5's text-to-text framework simplified the approach to NLP tasks, enabling a unified model to handle various tasks by simply converting them into a text format.
- Multilingual Models: Google's mT5 and other models expanded LLM capabilities to multiple languages, facilitating global applications.
- Efficient Large-Scale Models: The Switch Transformer introduced a way to scale models to extremely large sizes using sparse activation, making it more computationally efficient.
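The bidirectional training point above comes down to the attention mask: an encoder like BERT lets every token attend to every other token, while a left-to-right decoder masks out future positions. A minimal sketch (illustrative, not BERT's actual code):

```python
# Contrast the attention masks behind bidirectional encoders (BERT-style)
# and causal decoders (GPT-style). mask[i][j] == 1 means position i may
# attend to position j.

def bidirectional_mask(n):
    """Every token sees every other token (BERT-style encoder)."""
    return [[1] * n for _ in range(n)]

def causal_mask(n):
    """Each token sees only itself and earlier tokens (causal decoder)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    for row in bidirectional_mask(4):
        print(row)  # all ones: token 0 already "sees" tokens 1..3
    for row in causal_mask(4):
        print(row)  # lower-triangular: no peeking at future tokens
```

This is why BERT is trained with masked-token prediction rather than next-token prediction: with a bidirectional mask, predicting the next token would be trivial.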
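T5's unified framework above works by prepending a task prefix to the input so that one seq2seq model handles every task as string-to-string. A small sketch, with prefixes in the style described for T5 (the helper function and task names are illustrative):

```python
# Sketch of T5's text-to-text framing: every task is cast as
# "prefixed input text -> output text" for a single seq2seq model.

def to_text_to_text(task, text):
    """Hypothetical helper: attach a T5-style task prefix to raw input."""
    prefixes = {
        "translate_en_de": "translate English to German: ",
        "summarize": "summarize: ",
        "cola": "cola sentence: ",  # grammatical-acceptability classification
    }
    return prefixes[task] + text

if __name__ == "__main__":
    print(to_text_to_text("translate_en_de", "That is good."))
    print(to_text_to_text("cola", "The course is jumping well."))
```

Even classification is handled this way: the model literally generates a label string such as "acceptable" or "not acceptable", so no task-specific output heads are needed.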
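The sparse activation behind the Switch Transformer is top-1 routing: a learned router scores each token against every expert, and only the single highest-scoring expert's parameters run for that token. A toy sketch of the routing decision (illustrative only; the expert functions and scores are made up):

```python
# Top-1 expert routing in the Switch Transformer style: total parameters
# grow with the number of experts, but each token activates only one,
# so compute per token stays roughly constant.

def route_top1(router_logits):
    """Return the index of the single expert chosen for one token."""
    return max(range(len(router_logits)), key=lambda e: router_logits[e])

if __name__ == "__main__":
    experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # toy "experts"
    token = 5
    logits = [0.1, 2.3, -0.4]      # hypothetical router scores for this token
    chosen = route_top1(logits)     # expert 1 wins
    print(experts[chosen](token))   # only that expert's parameters are used
```

In the real model the router is a learned linear layer and an auxiliary load-balancing loss keeps tokens spread across experts; this sketch only shows the argmax dispatch.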
| Model | Developer | Release Date | Availability | Parameter Sizes |
| --- | --- | --- | --- | --- |
| Gemma | Google DeepMind | February 21, 2024 | Open-Source | 2 billion, 7 billion |