Hugging Face Transformers


Checkpoint

You may often want to save the state of training and resume it later from a checkpoint. Doing so requires saving and loading the model, the optimizer, the RNG states, and the GradScaler (when using mixed precision). The tokenizer and model should always come from the same checkpoint! A checkpoint is itself a model: a modified version of a base model.
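A minimal PyTorch sketch of saving and resuming such a state; the tiny model, the optimizer choice, and the file name checkpoint.pt are illustrative stand-ins, not a prescribed recipe:

```python
import torch
from torch import nn

model = nn.Linear(10, 2)  # stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

# ... training steps ...

# Save everything needed to resume exactly where training stopped.
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "scaler": scaler.state_dict(),
    "rng_state": torch.get_rng_state(),  # CPU RNG state
}, "checkpoint.pt")

# Resume later:
ckpt = torch.load("checkpoint.pt")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])
scaler.load_state_dict(ckpt["scaler"])
torch.set_rng_state(ckpt["rng_state"])
```

On GPU you would also save and restore the CUDA RNG state (torch.cuda.get_rng_state_all / torch.cuda.set_rng_state_all) so that dropout and other stochastic ops resume deterministically.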

Heads

An additional component, usually made up of one or a few layers, that converts the transformer's output (hidden states) into a task-specific output.
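For instance, loading the same checkpoint with and without a task head shows that the head is just a small extra module on top of the shared body (bert-base-uncased and num_labels=2 are used purely as examples):

```python
from transformers import AutoModel, AutoModelForSequenceClassification

# Body only: outputs hidden states, no task-specific head.
body = AutoModel.from_pretrained("bert-base-uncased")

# Same body plus a classification head; the head's weights are
# randomly initialised and need fine-tuning on the target task.
clf = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
print(clf.classifier)  # the head: a linear layer mapping hidden size -> 2 labels
```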

AutoModel

A class whose from_pretrained() method instantiates the correct architecture based on the checkpoint.
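A short example; distilbert-base-uncased is just an example checkpoint, and the same code would return a different class for a different checkpoint. Note that the tokenizer is loaded from the same checkpoint as the model:

```python
from transformers import AutoModel, AutoTokenizer

# The Auto* classes inspect the checkpoint's config and return the
# matching architecture, so the same code works across model families.
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)
print(type(model).__name__)  # DistilBertModel
```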

Sigmoid vs SoftMax

Sigmoid is used for binary classification, where there are only 2 classes, while SoftMax applies to multiclass problems. In fact, the SoftMax function is a generalization of the Sigmoid function.
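A quick numerical check of that relationship: softmax over the two logits [0, z] gives e^z / (1 + e^z) for the second class, which is exactly sigmoid(z). The logit value 1.7 below is arbitrary:

```python
import torch

z = torch.tensor([1.7])

# Sigmoid on a single logit for binary classification...
p_sigmoid = torch.sigmoid(z)

# ...equals softmax over the two logits [0, z]:
#   softmax([0, z])[1] = e^z / (1 + e^z) = sigmoid(z)
two_logits = torch.stack([torch.zeros_like(z), z], dim=-1)
p_softmax = torch.softmax(two_logits, dim=-1)[..., 1]

print(p_sigmoid, p_softmax)  # both ~= 0.8455
```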

Techniques to be aware of when batching sequences of different lengths together

Truncation, Padding, and Attention Masking.
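A minimal sketch of all three with a Hugging Face tokenizer (distilbert-base-uncased is just an example checkpoint):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

batch = tokenizer(
    ["A short sentence.",
     "A noticeably longer sentence that needs more tokens."],
    padding=True,      # pad shorter sequences to the longest in the batch
    truncation=True,   # cut sequences that exceed the model's max length
    return_tensors="pt",
)
# 0s in the attention mask tell the model to ignore the padding tokens.
print(batch["input_ids"].shape)
print(batch["attention_mask"])
```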
