Releases: UCREL/WSD-Torch-Models

v0.1.3

29 Apr 18:04
v0.1.3
2b0bd7b

What's new

Added 🎉

  • Added the arXiv paper to the PyMUSAS BEM model readme, model_readmes/pymusas_bem.md. This required the BibTeX text to be a Python string variable in the convert and upload script, scripts/convert_and_upload_bem_model.py.
  • Created ./benchmarks/ to benchmark speed and memory performance.
  • Created ./tests/functional_tests/ for end-to-end tests of the whole package. These mainly ensure that changes to the code base do not affect the accuracy of the existing models.
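
As a rough illustration of what a speed and memory benchmark can look like, here is a minimal sketch using only the Python standard library. It is not the code in ./benchmarks/; the `benchmark` helper and the `fake_inference` stand-in are hypothetical names. Note that `tracemalloc` only traces allocations made through Python's memory allocator, so it would not capture GPU memory, and tracing adds overhead to the timings.

```python
import time
import tracemalloc
from typing import Callable


def benchmark(func: Callable[[], object], repeats: int = 5) -> dict[str, float]:
    """Return the best wall-clock time and the peak traced memory for `func`."""
    timings = []
    tracemalloc.start()
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # get_traced_memory() returns (current, peak) in bytes since start().
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "best_seconds": min(timings),
        "peak_mib": peak_bytes / (1024 * 1024),
    }


def fake_inference() -> list[int]:
    # Hypothetical stand-in for a model prediction call.
    return [i * i for i in range(10_000)]


result = benchmark(fake_inference)
print(result)
```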

Changed ⚠️

  • Changed the ./pyproject.toml so that local developers can easily install different versions of torch, e.g. CPU or different CUDA versions.
  • Updated the ./.github/workflows to use specific GitHub action versions, which should make the workflows more secure.
  • Updated the ./.devcontainer files so that they use the correct version of torch.
  • Changed the developer tools from isort, flake8, mypy to ruff and ty.

Commits

3c09f93 Prepare for release v0.1.3
11be7ed Upgraded to tranformers >=4.54.0,<6.0
37ec57a Results for CPU
e71b992 Changed to RUFF and TY
13911bd Benchmark dependencies are now in their own optional group
f4fe491 Reset model cache
a2794b8 Benchmarks for end to end testing
60ad96b Fixed linting issue
85814b4 Fixed torch development versions
9736910 Added arXiv paper citation to the PyMUSAS BEM models
dcb92e6 Bash script to automate uploading model checkpoints as branch to HF repo
f08b007 Added link to v0.1.2 in CHANGELOG.md

v0.1.2

10 Dec 18:33
v0.1.2
3730218

What's new

Changed ⚠️

  • The version of numpy has been relaxed from numpy>=2.0.0,<3.0 to numpy>=1.19.0,<3.0 so that the GPU can be used within a spaCy pipeline, given spaCy's dependency on cupy version cupy-cuda12x>=11.5.0,<13.0.0.

Commits

8d8b133 Relaxed the version of numpy
3150049 Added link to release in CHANGELOG.md

v0.1.1

03 Dec 21:08
v0.1.1
019f237

What's new

Added 🎉

  • Added an optional tokenizer_kwargs argument to the wsd_torch_models.bem.BEM.predict method. This allows users to define keyword arguments that are passed to the subword tokenizer that is downloaded from HuggingFace through transformers.AutoTokenizer.from_pretrained.
  • Added a ValueError that is raised within wsd_torch_models.bem.BEM.predict when the number of predicted sense labels does not equal the number of tokens that should have a predicted sense label.
  • Added the add_prefix_space=True argument to the AutoTokenizer.from_pretrained method for all examples in the README.md, scripts/convert_and_upload_bem_model.py, and model_readmes/pymusas_bem.md. This is required because it is what the pre-trained BEM models expect.
  • The devcontainers, found in .devcontainer, have been improved so that they use the cached uv packages that have been installed at docker build time.
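
The pattern behind the first two items above (forwarding optional keyword arguments to a tokenizer, then checking that one label was predicted per target token) can be sketched in plain Python. This is an illustration only, not the signature or implementation of wsd_torch_models.bem.BEM.predict; `predict`, `toy_tokenizer`, `toy_classify`, and the `lowercase` option are all hypothetical.

```python
from typing import Any, Callable, Optional


def predict(
    tokens: list[str],
    target_indices: list[int],
    tokenizer: Callable[..., list[str]],
    classify: Callable[[list[str], list[int]], list[str]],
    tokenizer_kwargs: Optional[dict[str, Any]] = None,
) -> list[str]:
    """Hypothetical predict function showing kwargs forwarding and validation."""
    # Treat a missing tokenizer_kwargs as "no extra arguments".
    kwargs = tokenizer_kwargs or {}
    encoded = tokenizer(tokens, **kwargs)
    labels = classify(encoded, target_indices)
    # Fail loudly if the model did not return one label per target token.
    if len(labels) != len(target_indices):
        raise ValueError(
            f"Predicted {len(labels)} sense labels but expected "
            f"{len(target_indices)}, one per target token."
        )
    return labels


def toy_tokenizer(tokens: list[str], lowercase: bool = False) -> list[str]:
    # Stand-in for a real subword tokenizer; `lowercase` is a made-up option.
    return [t.lower() if lowercase else t for t in tokens]


def toy_classify(encoded: list[str], target_indices: list[int]) -> list[str]:
    # Stand-in for the model: one fake sense label per target token.
    return [f"sense:{encoded[i]}" for i in target_indices]


labels = predict(
    ["The", "Bank"], [1], toy_tokenizer, toy_classify, {"lowercase": True}
)
print(labels)
```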

Commits

0bdd512 Correct file path to model checkpoint for English Small BEM
846ab0a Example using the larger model
86cb8b1 Correct spelling error of Engish on the model names

v0.1.0

02 Dec 11:43
v0.1.0
5126326

What's new

Added 🎉

  • First release.
  • The Bi-Encoder Model (BEM) from the paper Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders. This model can be found at wsd_torch_models.bem.BEM.
  • The wsd_torch_models.bem.BEM class is a potential blueprint for an abstract parent class that other Word Sense Disambiguation methods could inherit from in the future.
  • Created a script, scripts/convert_and_upload_bem_model.py, that converts the PyTorch Lightning models from which the wsd_torch_models.bem.BEM class was created into the PyTorch and PyTorchModelHubMixin form that the wsd_torch_models.bem.BEM class represents, removing the need for the PyTorch Lightning dependency. The script only requires the checkpoint from the saved PyTorch Lightning model; it converts the model and uploads it to the relevant HuggingFace Hub repository.
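
One common step when converting a PyTorch Lightning checkpoint to a plain PyTorch model is renaming the saved parameter keys. Lightning checkpoints store the weights under a "state_dict" entry, and the keys carry the attribute name under which the network was held in the Lightning module. The sketch below assumes that attribute was called `model`; that prefix, the `strip_prefix` helper, and the toy checkpoint are assumptions for illustration and are not taken from scripts/convert_and_upload_bem_model.py.

```python
def strip_prefix(state_dict: dict[str, object], prefix: str) -> dict[str, object]:
    """Return a copy of `state_dict` with `prefix` removed from matching keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for a loaded Lightning checkpoint; real values would be tensors.
lightning_checkpoint = {
    "epoch": 3,
    "optimizer_states": [],
    "state_dict": {
        "model.encoder.weight": [0.1, 0.2],
        "model.encoder.bias": [0.0],
    },
}

# Keep only the weights and drop the assumed "model." prefix, so the keys
# would match a plain torch.nn.Module that owns `encoder` directly.
plain_state_dict = strip_prefix(lightning_checkpoint["state_dict"], "model.")
print(sorted(plain_state_dict))
```

With real checkpoints, the resulting dictionary would then be passed to the target module's load_state_dict method.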

Commits

5126326 Prepare for release v0.1.0
17ac105 Creation of release notes and scripts
68e8358 Preparing for first release
9de65a2 Adding all models to HuggingFace Hub
2bbfe29 more relevant naming
4d3a44f Added arguments to only update parts of the model
33800e8 Added arguments to only update parts of the model
855d634 Update README.md
99e085f Model inference code
56034cf End of day
dbd663a Installation
dae15cf USAS mapper
1777a47 Testing attention masking on forward pass
ce53edc Small test for the BEM model
ee956f8 Added doc strings
5a56b0c Number of parameters test
30c7354 Added pytests to the CI pipeline
1be0b9d Debugging saved files
141b5a0 Debugging saved files
842c234 Debugging saved files
b6625a6 Debugging saved files
ac4281a Debugging saved files
56d0076 Debugging saved files
dc06b7f Debugging saved files
af10ef5 Debugging saved files
19e78fe Debugging saved files
b96b36c Debugging saved files
06bcbf1 Debugging saved files
ce38168 Changed caching
82ec8db Model caching for testing
a84dce7 list files removed as it does not work on Windows
dd713df Removed scripts path from flake8 and isort
335bc74 Syntax error
e7d509c HF model caching
0c8dedd vscode
4682e73 CPU and GPU containers
743f19e end of day
df93fe5 CI
3327e7e Documentation
d38b5d0 Initital code
ff68f0f Project setup
312ca60 Dev Container setup