Description
Currently, only a fixed set of `batch_sampler` values is accepted.
There seems to be a need to let users create custom batch samplers.
Examples from issues:
- Batch sampler #3123
- Using DataLoader in v3? #2707
- MNRL with Multiple hard negatives per query and NoDuplicatesBatchSampler #2954
I believe the current solution suggested in those issues (which I found the hard way) is to subclass `SentenceTransformerTrainer` and override `get_batch_sampler`. This is not documented well enough and isn't straightforward.
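For reference, here is a minimal sketch of what that workaround might look like. `ShuffledBatchSampler` and `CustomSamplerTrainer` are hypothetical names made up for illustration, and the exact `get_batch_sampler` signature is an assumption: it may differ between versions, so extra arguments are absorbed with `*args, **kwargs`. The sampler is plain Python to keep the example self-contained. A real one would more likely subclass `DefaultBatchSampler` or `torch.utils.data.BatchSampler`.

```python
import random


class ShuffledBatchSampler:
    """Hypothetical custom sampler: yields lists of dataset indices,
    reshuffled each epoch. Replace __iter__ with your own batching logic."""

    def __init__(self, num_samples, batch_size, drop_last=False, seed=0):
        self.num_samples = num_samples
        self.batch_size = batch_size
        self.drop_last = drop_last
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        # Lets the caller vary the shuffle order from epoch to epoch.
        self.epoch = epoch

    def __iter__(self):
        rng = random.Random(self.seed + self.epoch)
        indices = list(range(self.num_samples))
        rng.shuffle(indices)
        for start in range(0, len(indices), self.batch_size):
            batch = indices[start:start + self.batch_size]
            if len(batch) < self.batch_size and self.drop_last:
                break
            yield batch

    def __len__(self):
        if self.drop_last:
            return self.num_samples // self.batch_size
        return -(-self.num_samples // self.batch_size)  # ceiling division


try:
    from sentence_transformers import SentenceTransformerTrainer
except ImportError:  # keeps the sampler above usable without the library
    SentenceTransformerTrainer = None

if SentenceTransformerTrainer is not None:

    class CustomSamplerTrainer(SentenceTransformerTrainer):
        # Assumed signature; *args/**kwargs absorb version-specific
        # parameters such as label columns or a generator.
        def get_batch_sampler(self, dataset, batch_size, drop_last, *args, **kwargs):
            return ShuffledBatchSampler(len(dataset), batch_size, drop_last)
```

`CustomSamplerTrainer` would then be used in place of `SentenceTransformerTrainer`, which is quite a bit of ceremony just to swap in a sampler.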
Is there any reason not to just accept anything that inherits from `DefaultBatchSampler` (for example) as a parameter?