v0.1.1

@apmoore1 released this 03 Dec 21:08
· 17 commits to main since this release
019f237

What's new

Added 🎉

  • An optional tokenizer_kwargs argument for the wsd_torch_models.bem.BEM.predict method. It lets users define keyword arguments that are passed to the subword tokenizer, which is downloaded from HuggingFace through transformers.AutoTokenizer.from_pretrained.
  • A ValueError that is raised within the wsd_torch_models.bem.BEM.predict method when the number of predicted sense labels does not equal the number of tokens that should have received a predicted sense label.
  • The add_prefix_space=True argument to the AutoTokenizer.from_pretrained call in all examples in README.md, scripts/convert_and_upload_bem_model.py, and model_readmes/pymusas_bem.md. This is required because the pre-trained BEM models expect it.
  • The devcontainers, found in .devcontainer, have been improved so that they use the cached uv packages that were installed at Docker build time.
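
The ValueError item above describes a consistency check between predicted sense labels and the tokens that were meant to be labelled. The sketch below illustrates that kind of check in isolation. It is not the library's code: the function name check_label_count, its parameters, and the error message wording are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def check_label_count(predicted_labels: Sequence[str], expected_count: int) -> None:
    """Hypothetical helper, not part of wsd_torch_models.

    Raises ValueError when the number of predicted sense labels differs
    from the number of tokens that should have received a label.
    """
    actual_count = len(predicted_labels)
    if actual_count != expected_count:
        raise ValueError(
            f"Expected {expected_count} predicted sense labels "
            f"but got {actual_count}."
        )


# Two tokens were marked for labelling and two labels came back: no error.
check_label_count(["label_a", "label_b"], expected_count=2)
```

Failing loudly at this point means a mismatch, for example one caused by unexpected subword tokenization, surfaces as an error rather than as labels silently attached to the wrong tokens.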

Commits

0bdd512 Correct file path to model checkpoint for English Small BEM
846ab0a Example using the larger model
86cb8b1 Correct spelling error of Engish on the model names