Description
setfit==1.1.3
I have a train dataset and an eval dataset:
Dataset({
    features: ['sents', 'type'],
    num_rows: 350
})
Dataset({
    features: ['sents', 'type'],
    num_rows: 217
})
Then I run the trainer with evaluation every 10 steps:
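from setfit import Trainer, TrainingArguments

# model, train_dataset and eval_dataset are created earlier (not shown)
args = TrainingArguments(
    batch_size=2,
    num_epochs=1,
    sampling_strategy="unique",
    report_to="tensorboard",
    logging_dir="./logs",
    eval_strategy="steps",  # evaluate during training...
    eval_steps=10,          # ...every 10 steps
    end_to_end=False,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    # map my column names to the names SetFit expects
    column_mapping={"sents": "text", "type": "label"},
    metric="accuracy",
)
trainer.train()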
Output:
***** Running training *****
Num unique pairs = 61425
Batch size = 2
Num epochs = 1
[ 11/30713 00:03 < 2:54:45, 2.93 it/s, Epoch 0.00/1]
[ 74/11827 00:03 < 10:06, 19.36 it/s]
Why does the evaluator process 11827 rows instead of the 217 rows in the eval dataset?
Also, if I pass a custom evaluator function through the metric param, the trainer ignores it during these evaluation steps.
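For reference, here is a quick arithmetic check of the numbers in the log. It rests on an assumption drawn only from the output above, not from the SetFit source: that with sampling_strategy="unique" the number of pairs built from n rows is n * (n + 1) / 2, and that each progress bar counts batches of pairs rather than dataset rows. The helper name pair_steps is hypothetical and only used for this illustration.

```python
import math


def pair_steps(num_rows: int, batch_size: int) -> tuple[int, int]:
    """Return (number of pairs, number of batches) for a dataset.

    Assumption (inferred from the log, not from SetFit itself):
    n rows yield n * (n + 1) / 2 unique pairs.
    """
    num_pairs = num_rows * (num_rows + 1) // 2
    num_batches = math.ceil(num_pairs / batch_size)
    return num_pairs, num_batches


# Train dataset: 350 rows, batch_size=2
print(pair_steps(350, 2))  # (61425, 30713)

# Eval dataset: 217 rows, batch_size=2
print(pair_steps(217, 2))  # (23653, 11827)
```

Under that assumption the train numbers reproduce "Num unique pairs = 61425" and the 30713 steps exactly, and the eval dataset gives 11827 batches. So the 11827 in the second progress bar looks like batches of sentence pairs built from the 217 eval rows, not 11827 rows.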