Releases · MinishLab/model2vec · GitHub

05 Oct 06:30

Pringled

v0.7.0 Latest

What's Changed

add support for passing weight to the loss functions by @volker48 in #260
fix: padding token not recognized, update transformers by @stephantul in #265
Fix tag train documentation by @Lhemamou in #269
chore: Added python 3.13 to pyproject and CI by @Pringled in #270
feat: add classifier freezing by @stephantul in #274
fix: remove windows tests by @stephantul in #277
feat: add configurable pad token by @stephantul in #276
feat: faster loading if model already cached by @stephantul in #278
feat: add vocabulary quantization by @stephantul in #271
fix: load faster, make quantization better by @stephantul in #279
fix: F rule, A rule, update ruff by @stephantul in #281
feat: Added embedding_dtype and vocabulary_quantization to config by @Pringled in #280
fix: Disable MPS for Torch versions >=2.8.0 by @Pringled in #287
feat: Add configurable pooling for distillation by @Pringled in #288
chore: Deprecate apply_zipf and use_subword parameters by @Pringled in #289
chore: Rename PoolingType to PoolingMode by @Pringled in #290
docs: Update main docs by @Pringled in #291
chore: Bump version by @Pringled in #292

New Contributors

@volker48 made their first contribution in #260
@Lhemamou made their first contribution in #269

Deprecation warnings ⚠️

The apply_zipf and use_subword parameters of distill are now officially deprecated.
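To check whether your own code still passes the deprecated parameters, one option is to record deprecation warnings around the call. The sketch below is illustrative only: `distill_sketch` is a hypothetical stand-in, not the real model2vec `distill`, and it assumes the library reports the deprecation through Python's standard `DeprecationWarning`. The helper `find_deprecated_calls` is also a made-up name.

```python
import warnings


def distill_sketch(model_name, *, apply_zipf=None, use_subword=None):
    """Hypothetical stand-in for a function with deprecated parameters."""
    for name, value in (("apply_zipf", apply_zipf), ("use_subword", use_subword)):
        if value is not None:
            # Assumption: the deprecation is surfaced as a DeprecationWarning.
            warnings.warn(
                f"`{name}` is deprecated.",
                DeprecationWarning,
                stacklevel=2,
            )
    return model_name


def find_deprecated_calls(func, *args, **kwargs):
    """Run func and return the messages of any DeprecationWarning it emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones Python hides by default.
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


if __name__ == "__main__":
    for message in find_deprecated_calls(
        distill_sketch, "some-model", apply_zipf=True
    ):
        print(message)
```

In a test suite, a similar effect can usually be had by turning `DeprecationWarning` into an error with `warnings.simplefilter("error", DeprecationWarning)`, so that any remaining use of a deprecated parameter fails loudly.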

Full Changelog: v0.6.0...v0.7.0

Contributors

volker48, stephantul, and 2 other contributors

03 Jun 10:01

stephantul

0.6.0

What's Changed

docs: update chonkie link on tutorial readme by @iaurg in #235
Fix dates in README.md by @Pringled in #238
fix: add default arg for push_to_hub by @stephantul in #240
fix: remove direct dependency on specific hf utils by @stephantul in #244
feat: smaller tokenizers by @stephantul in #243
feat: update lock by @stephantul in #246
feat: allow passing validation set explicitly by @JarbasAl in #245
docs: Added multilingual results by @Pringled in #247
fix: distillation for models without card by @JarbasAl in #248
feat: add supertokenizers by @stephantul in #236
clean-up print statement by @stephantul in #249
fix: small typing issue by @stephantul in #250
docs: Added new logo by @Pringled in #252
fix: missing unk, fix bug by @stephantul in #251
bump version by @stephantul in #258
feat: make normalization dependent on spacing by @stephantul in #259

New Contributors

@iaurg made their first contribution in #235
@JarbasAl made their first contribution in #245

Full Changelog: v0.5.0...v0.6.0

Contributors

stephantul, iaurg, and 2 other contributors

30 Apr 18:01

stephantul

v0.5.0

What's Changed

fix: Updated semantic chunking tutorial by @bhavnicksm in #205
rewrite backend by @stephantul in #207
fix bibtex by @stephantul in #208
feat: Added py.typed file by @Pringled in #214
fix: pretokenize tokens before checking vocabulary by @stephantul in #215
feat: add dimensionality during loading by @stephantul in #216
feat: add quantization by @stephantul in #217
feat: save load subfolder by @stephantul in #218
feat: Added quantization for from_sentence_transformers by @Pringled in #219
feat: faster inference for large vocab by @stephantul in #221
feat: track token provenance by @stephantul in #222
fix: typing issues, bug in inference by @stephantul in #224
fix: issues with unk and pad by @stephantul in #225
bug: fix 0 score in evaluate by @stephantul in #226
fix: precision during training by @stephantul in #228
fix: issue with unk in unigram by @stephantul in #227
docs: add info about quantization and dimensionality reduction by @stephantul in #231
increment version by @stephantul in #232

New Contributors

@bhavnicksm made their first contribution in #205

Full Changelog: 0.4.1...v0.5.0

Contributors

stephantul, bhavnicksm, and Pringled

28 Feb 07:26

Pringled

0.4.1

What's Changed

docs: Added training plot, added more training results by @Pringled in #189
feat: Added min and max epochs to fit by @Pringled in #190
docs: Update model card template by @Pringled in #192
feat: Add multilabel classification for training by @Pringled in #191
feat: Add evaluate function for classifiers by @Pringled in #195
docs: Added discord badge by @Pringled in #193
fix: only allows named args in pretrain by @stephantul in #200
Bump version by @Pringled in #204

Full Changelog: 0.4.0...0.4.1

Contributors

stephantul and Pringled

12 Feb 19:48

Pringled

0.4.0

What's Changed

Add fittable by @stephantul in #140
fix scores in readme by @stephantul in #179
docs: Refactored main docs, added separate docs directory, added training docs by @Pringled in #181
docs: Update README.md by @Pringled in #183
Update README.md by @Pringled in #184
feat: replace 8m by 32m for training by @stephantul in #182
docs: update scores in README by @stephantul in #186
docs: Moved training results to results directory, updated docs and description by @Pringled in #187
Bump version by @Pringled in #188

Full Changelog: v0.3.9...0.4.0

Contributors

stephantul and Pringled

06 Feb 07:09

stephantul

v0.3.9

What's Changed

docs: Added new model results by @Pringled in #167
docs: Update plot by @Pringled in #169
feat: add trust-remote-code option by @stephantul in #173
feat: Add SIF-like coef by @stephantul in #174
increase version by @stephantul in #176

Full Changelog: v0.3.8...v0.3.9

Contributors

stephantul and Pringled

27 Jan 18:39

stephantul

v0.3.8

What's Changed

docs: fix docstrings in distill by @stephantul in #157
remove unnecessary import by @stephantul in #161
remove deduplication tutorial by @stephantul in #159
fix: issue with modernbert tokenizer, add token pattern to _distill by @stephantul in #158
fix: fix typing issue by @stephantul in #162
feat: float pca dims by @stephantul in #163
feat: Add optional embedding normalization to StaticModel loading by @davidberenstein1957 in #164
feat: Improve distill for modernBERT by @stephantul in #165
increase version by @stephantul in #166

New Contributors

@davidberenstein1957 made their first contribution in #164

Full Changelog: v0.3.7...v0.3.8

Contributors

stephantul and davidberenstein1957

21 Jan 20:01

Pringled

v0.3.7

What's Changed

feat: Updated save_pretrained to save sentence-transformers compatible models by @Pringled in #154
Bump version by @Pringled in #155

Full Changelog: v0.3.6...v0.3.7

Contributors

Pringled

15 Jan 19:54

Pringled

v0.3.6

What's Changed

Add loading from st by @stephantul in #151
Bump version by @Pringled in #152

Full Changelog: v0.3.5...v0.3.6

Contributors

stephantul and Pringled

11 Jan 10:37

Pringled

v0.3.5

What's Changed

fix: Fixed local distillation by @Pringled in #149
Bump version by @Pringled in #150

Full Changelog: v0.3.4...v0.3.5

Contributors

Pringled
