v0.1.1
What's new
Added 🎉
- `tokenizer_kwargs` optional argument to the `wsd_torch_models.bem.BEM.predict` method. This allows users to define keyword arguments that are passed to the subword tokenizer downloaded from HuggingFace through `transformers.AutoTokenizer.from_pretrained`.
- Added a `ValueError` that is raised within `wsd_torch_models.bem.BEM.predict` when the number of predicted sense labels does not equal the number of given tokens that should have a predicted sense label.
- Added the `add_prefix_space=True` argument to the `AutoTokenizer.from_pretrained` method for all examples in `README.md`, `scripts/convert_and_upload_bem_model.py`, and `model_readmes/pymusas_bem.md`. This is required because the pre-trained `BEM` models expect it.
- The devcontainers, found in `.devcontainer`, have been improved so that they use the cached uv packages installed at Docker build time.
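
The first two items can be illustrated with a minimal, self-contained sketch. It is not the code in `wsd_torch_models`: the helper names `build_tokenizer_kwargs` and `check_prediction_count` are hypothetical, and using `add_prefix_space=True` as a default inside the helper is an assumption made for illustration (the release only states that the examples pass it to `AutoTokenizer.from_pretrained`).

```python
from typing import Any, Optional, Sequence


def build_tokenizer_kwargs(
    tokenizer_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: merge caller-supplied tokenizer keyword arguments.

    Assumption for illustration only: start from add_prefix_space=True,
    which the release notes say the pre-trained BEM models expect, and let
    the caller's values override or extend it.
    """
    merged: dict[str, Any] = {"add_prefix_space": True}
    if tokenizer_kwargs:
        merged.update(tokenizer_kwargs)
    return merged


def check_prediction_count(
    predicted_labels: Sequence[str],
    target_token_indices: Sequence[int],
) -> None:
    """Hypothetical helper: fail loudly on a label/token count mismatch.

    Mirrors the idea of the new ValueError: one predicted sense label is
    expected for every token that was marked as needing one.
    """
    if len(predicted_labels) != len(target_token_indices):
        raise ValueError(
            f"Expected {len(target_token_indices)} predicted sense labels, "
            f"got {len(predicted_labels)}."
        )


if __name__ == "__main__":
    # The merged dictionary is what a caller might unpack into
    # transformers.AutoTokenizer.from_pretrained(model_name, **kwargs).
    kwargs = build_tokenizer_kwargs({"use_fast": True})
    print(kwargs)

    # Two labels for two target tokens: passes silently.
    check_prediction_count(["sense_a", "sense_b"], [0, 3])

    # One label for two target tokens: raises ValueError.
    try:
        check_prediction_count(["sense_a"], [0, 3])
    except ValueError as error:
        print(f"ValueError: {error}")
```

Raising on a count mismatch, rather than silently truncating or padding, keeps each predicted label aligned with the token it belongs to, which is presumably why the check was added.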
Commits
0bdd512 Correct file path to model checkpoint for English Small BEM
846ab0a Example using the larger model
86cb8b1 Correct spelling error of Engish on the model names