
Question about rerank training and evaluation #93

@zengwb-lx

Hi, I looked at your training code, and it loads the data like this:
class RerankTrainDataset(Dataset):
...
dataset = datasets.load_dataset(data_name_or_path)
dataset = dataset[dataset_split]

while eval_retrieval2.py loads the data like this:
dataset = load_dataset("C-MTEB/T2Reranking", split="dev")
ds = dataset.train_test_split(test_size=0.1, seed=42)

Doesn't this amount to taking 10% of the training data and using it for testing?
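To make the concern concrete, here is a minimal sketch using only the standard library. It mimics a seeded 90/10 split on toy data and counts how many held-out queries also appear in the data the model trained on. `seeded_split` and `overlap` are hypothetical helpers written for this illustration; they are not the `datasets` implementation of `train_test_split`, and the toy rows stand in for the real dataset. Whether the actual evaluation leaks depends on which rows the training run used, which the snippets above do not show.

```python
import random


def seeded_split(rows, test_size=0.1, seed=42):
    """Shuffle row indices with a fixed seed and hold out the first `test_size` fraction.

    Illustrative stand-in for a seeded train/test split, not the `datasets` code.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_size))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


def overlap(train_rows, eval_rows, key="query"):
    """Return the eval keys that also occur in the training rows."""
    train_keys = {row[key] for row in train_rows}
    return [row[key] for row in eval_rows if row[key] in train_keys]


# Toy data standing in for the split that both scripts load.
rows = [{"query": f"q{i}"} for i in range(100)]
train_part, test_part = seeded_split(rows, test_size=0.1, seed=42)

# Case 1: training used every row of the split, so all held-out rows were seen.
leak_if_trained_on_all = overlap(rows, test_part)

# Case 2: training used only the 90% produced by the same seeded split.
leak_if_trained_on_90 = overlap(train_part, test_part)

print(len(test_part), len(leak_if_trained_on_all), len(leak_if_trained_on_90))
# prints: 10 10 0
```

If training consumed the whole split, every evaluation row was already seen during training. If training used only the 90% side of the same seeded split, the held-out 10% stays disjoint from it.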
