Release v0.1.1 · UCREL/WSD-Torch-Models

What's new

Added 🎉

Added an optional tokenizer_kwargs argument to the wsd_torch_models.bem.BEM.predict method. This lets users define keyword arguments that are passed to the sub-word tokenizer, which is downloaded from HuggingFace through transformers.AutoTokenizer.from_pretrained.
Added a ValueError that is raised within the wsd_torch_models.bem.BEM.predict method when the number of predicted sense labels does not equal the number of tokens that should have received a predicted sense label.
Added the add_prefix_space=True argument to the AutoTokenizer.from_pretrained calls in all examples in README.md, scripts/convert_and_upload_bem_model.py, and model_readmes/pymusas_bem.md. This is required because the pre-trained BEM models expect it.
Improved the devcontainers, found in .devcontainer, so that they use the cached uv packages that were installed at Docker build time.
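
The tokenizer and label-count items above can be illustrated with a minimal sketch. It is not the library's implementation: load_tokenizer and check_label_count are hypothetical helper names, the model name is a placeholder supplied by the caller, and the error message wording is an assumption. Only AutoTokenizer.from_pretrained and its add_prefix_space keyword come from the release notes.

```python
from typing import Any, Sequence


def load_tokenizer(model_name: str, **tokenizer_kwargs: Any):
    """Hypothetical helper: load a HuggingFace tokenizer with add_prefix_space=True.

    model_name is a placeholder chosen by the caller. Extra keyword
    arguments are forwarded unchanged to AutoTokenizer.from_pretrained.
    """
    # Imported lazily so the label check below runs without transformers installed.
    from transformers import AutoTokenizer

    # Default to add_prefix_space=True, but let the caller override it explicitly.
    kwargs = {"add_prefix_space": True, **tokenizer_kwargs}
    return AutoTokenizer.from_pretrained(model_name, **kwargs)


def check_label_count(
    predicted_labels: Sequence[str], tokens_to_label: Sequence[str]
) -> None:
    """Hypothetical helper: raise ValueError when the counts do not match.

    Mirrors the kind of check described for BEM.predict; the real
    method's internals and message may differ.
    """
    if len(predicted_labels) != len(tokens_to_label):
        raise ValueError(
            f"Got {len(predicted_labels)} predicted sense labels for "
            f"{len(tokens_to_label)} tokens that require a label."
        )
```

Calling check_label_count with one label and two tokens raises the ValueError; equal-length inputs pass silently.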

Commits

0bdd512 Correct file path to model checkpoint for English Small BEM
846ab0a Example using the larger model
86cb8b1 Correct spelling error of Engish on the model names
