Chapter 10, Page 313 #81

@absognety

Hi there. In Chapter 10, "Creating Text Embedding Models" (Part III), in the section "Fine-Tuning an Embedding Model" on page 313, I think there is a typo in this sentence:

After training our cross-encoder, we use the remaining 400,000 sentence pairs (from
our original dataset of 50,000 sentence pairs) as our silver dataset (step 2):

After taking a subset of 10,000 sentence pairs, 40,000 would remain from the original dataset of 50,000 sentence pairs, so "400,000" should read "40,000".
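
To make the arithmetic concrete, here is a minimal sketch of the split. It assumes the 10,000-pair subset is the gold dataset and the rest is the silver dataset; the variable names and the placeholder data are hypothetical and are not taken from the book.

```python
# Hypothetical stand-in for the dataset: integer indices instead of real sentence pairs.
TOTAL_PAIRS = 50_000  # size of the original dataset, as stated in the book's sentence
GOLD_SIZE = 10_000    # size of the subset taken before training the cross-encoder

pairs = list(range(TOTAL_PAIRS))

# Assumption: the first 10,000 pairs form the gold set, the remainder the silver set.
gold = pairs[:GOLD_SIZE]
silver = pairs[GOLD_SIZE:]

print(len(gold), len(silver))  # 10000 40000
```

The remainder is 40,000, not 400,000.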

The sentence appears right after the following code sample:

# Train a cross-encoder on the gold dataset
cross_encoder = CrossEncoder("bert-base-uncased", num_labels=2)
cross_encoder.fit(
    train_dataloader=gold_dataloader,
    epochs=1,
    show_progress_bar=True,
    warmup_steps=100,
    use_amp=False
)

Thanks!
