Bidirectional Transformers for Language Understanding (BERT)

Bidirectional Encoder Representations from Transformers (BERT) is an encoder-only transformer-based model designed for natural language understanding. This directory contains implementations of the BERT model. It uses a stack of transformer blocks, each consisting of multi-head attention followed by a multi-layer perceptron (MLP) feed-forward network. We support removing the next-sentence-prediction (NSP) loss from BERT pre-training and training with only the masked-language-modeling (MLM) loss.
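The block structure described above (multi-head attention followed by an MLP feed-forward network, with residual connections) can be sketched roughly as follows. This is a minimal NumPy illustration, not the implementation in this directory: the weights are random stand-ins for learned parameters, layer normalization is omitted, and ReLU stands in for BERT's GELU activation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads, rng):
    # x: (seq_len, d_model). Random weights stand in for learned projections.
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    wq, wk, wv, wo = (rng.standard_normal((d_model, d_model)) * 0.02
                      for _ in range(4))
    q, k, v = x @ wq, x @ wk, x @ wv

    # Split into heads: (num_heads, seq_len, d_head)
    def split(t):
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)

    # Scaled dot-product attention per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    out = softmax(scores) @ v
    # Merge heads back to (seq_len, d_model) and apply output projection
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ wo

def encoder_block(x, num_heads=4, rng=None):
    rng = rng if rng is not None else np.random.default_rng(0)
    d_model = x.shape[-1]
    # Attention sub-layer with residual connection (layer norm omitted)
    x = x + multi_head_attention(x, num_heads, rng)
    # MLP feed-forward sub-layer: expand 4x, nonlinearity, project back
    w1 = rng.standard_normal((d_model, 4 * d_model)) * 0.02
    w2 = rng.standard_normal((4 * d_model, d_model)) * 0.02
    h = np.maximum(x @ w1, 0.0)  # ReLU stands in for BERT's GELU
    return x + h @ w2

x = np.random.default_rng(1).standard_normal((8, 32))  # 8 tokens, d_model=32
y = encoder_block(x)
print(y.shape)  # (8, 32)
```

A full BERT model stacks many such blocks on top of token, position, and segment embeddings; the MLM head then predicts the masked input tokens from the final hidden states.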

For more information on using our BERT implementation, visit its model page in our documentation.