Releases: mosaicml/llm-foundry
v0.19.0
What's New
1. Python 3.12 Bump (#1755)
We've added support for Python 3.12 and deprecated Python 3.9 support.
What's Changed
- Use llmfoundry image instead of pytorch image for gpu tests by @rithwik-db in #1752
- bump dev version to 0.19.0.dev0 by @rithwik-db in #1753
- Bump mcli yaml examples to use 0.18.0 and torch 2.6 by @rithwik-db in #1754
- Fix meta initialization for FSDP training with HF models and TE Layers by @jjuvonen-amd in #1745
- Fix bugs in `llmfoundry/data/text_data.py` by @gsganden in #1760
- Update setuptools requirement from <76.0.0 to <78.0.0 by @dependabot in #1758
- Update README.md by @gsganden in #1721
- Add error handling for general table download errors by @dakinggg in #1761
- modified the packing slightly to enable inheritance by @abaheti95 in #1762
- Remove registration fallback by @dakinggg in #1764
- Move save/load planner creation to after config logging by @dakinggg in #1769
- Bump Python 3.12 by @KuuCi in #1755
- Fix GPU Tests 3.10 by @KuuCi in #1770
- Remove a bunch of repeated calls to HF in the tests by @dakinggg in #1768
- Bump coverage[toml] from 7.6.10 to 7.8.0 by @dependabot in #1767
- Update mlflow requirement from <2.19,>=2.14.1 to >=2.14.1,<2.22 by @dependabot in #1766
- Bump Composer 0.30.0 by @KuuCi in #1772
- Bump streaming 0.12.0 by @KuuCi in #1777
New Contributors
- @jjuvonen-amd made their first contribution in #1745
- @gsganden made their first contribution in #1760
- @abaheti95 made their first contribution in #1762
Full Changelog: v0.18.0...v0.19.0
v0.18.0
What's Changed
- Torch has been bumped to `2.6.0` (in #1740).
  - Sparse support has been disabled in the latest megablocks version (as part of the latest torch upgrade), and we cascaded those disables to llm-foundry as well (for more details, view the megablocks release).
- `TransformerEngine` has been removed from the `all` dependency group due to version compatibility issues (in #1742). We expect to add this back in a future release.
- Transformers has been bumped to `v4.49.0` (in #1735), which results in the master weights being `torch.bfloat16` (view huggingface/transformers#36567 for more context). `llm-foundry` doesn't support master weights in lower precision, so we manually hardcoded this to `torch.float32` when loading in #1734.
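As a rough illustration of the dtype handling described above (not the exact change in #1734), loading a Hugging Face model with master weights forced to `torch.float32` looks something like the following; the model name here is just a placeholder.

```python
# Minimal sketch of forcing float32 master weights at load time, regardless
# of the dtype recorded in the checkpoint's config. This is an illustration,
# not the actual patch from #1734.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',          # placeholder model name
    torch_dtype=torch.float32,  # keep master weights in full precision
    trust_remote_code=True,
)
assert next(model.parameters()).dtype == torch.float32
```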
Detailed Changes
- remove deprecated param by @bigning in #1727
- Bump TE for FA 2.7.1.post1 bump by @KuuCi in #1730
- Fix dtype issue in transformers by @dakinggg in #1734
- Bump composer to 0.29.0 by @rithwik-db in #1733
- Bump Transformer v4.49.0 by @KuuCi in #1735
- Bump FA2 to 2.7.4.post1 by @KuuCi in #1728
- Comment GHCR Image Upload by @KuuCi in #1739
- Remove TE from all dependency group by @dakinggg in #1742
- Bump torch to 2.6 by @rithwik-db in #1740
- Update Makefile to use WORLD_SIZE by @irenedea in #1751
New Contributors
- @rithwik-db made their first contribution in #1733
Full Changelog: v0.17.1...v0.18.0
v0.17.1
What's New
Datasets version upgrade (#1724)
We've upgraded the Hugging Face datasets library to a version that includes a fix for a common issue where the multiprocessing pool hangs after tokenization or filtering.
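For context, the code path affected by the fix is the usual multi-process `map`/`filter` flow; here is a minimal sketch (the dataset and tokenizer choices are placeholders).

```python
# Sketch of the tokenize-then-filter flow that exercised the hang: with
# num_proc > 1, datasets spins up a multiprocessing pool, which could
# previously fail to shut down cleanly after map()/filter() returned.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')  # placeholder tokenizer
ds = load_dataset('ag_news', split='train')        # placeholder dataset

ds = ds.map(
    lambda batch: tokenizer(batch['text'], truncation=True, max_length=512),
    batched=True,
    num_proc=8,  # worker pool; the upgraded datasets version tears this down correctly
)
ds = ds.filter(lambda example: len(example['input_ids']) > 1, num_proc=8)
```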
What's Changed
- Update accelerate requirement from <1.2,>=0.25 to >=0.25,<1.4 by @dependabot in #1714
- Bump datasets version by @dakinggg in #1724
Full Changelog: v0.17.0...v0.17.1
v0.17.0
What's Changed
- Update mcli examples to use 0.16.0 by @irenedea in #1713
- Refactor HF checkpointer by @milocress in #1690
  Previously, MLflow required PEFT models to be specified as a special "flavor" distinct from Transformers models. This workaround is no longer necessary, allowing us to simplify the codepath and cleanly abstract uploading the Hugging Face checkpoints from registering trained models.
- Bump version to 0.18.0.dev by @milocress in #1717
- Removes the deprecated `sample_weighing_factor` argument from `mpt` loss calculations.
Full Changelog: v0.16.0...v0.17.0
v0.16.0
What's New
Streaming 0.11.0 🚀 (#1711)
We've upgraded streaming to 0.11.0. StreamingDataset can now be used with custom Stream implementations via a registry. See the documentation page for example usage.
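As a minimal sketch of using custom `Stream` objects with `StreamingDataset` (the subclass name, paths, and proportions below are made up; the new registry-based selection added in 0.11.0 is covered in the streaming documentation):

```python
# Passing custom Stream instances to StreamingDataset. Streaming 0.11.0 also
# allows custom Stream classes to be registered and selected via a registry;
# see the streaming docs for that API. Paths and proportions are placeholders.
from streaming import Stream, StreamingDataset


class LabeledStream(Stream):
    """Hypothetical Stream subclass that carries an extra label for bookkeeping."""

    def __init__(self, *, label: str, **kwargs):
        super().__init__(**kwargs)
        self.label = label


streams = [
    LabeledStream(label='web', remote='s3://my-bucket/web-mds', local='/tmp/web', proportion=0.8),
    LabeledStream(label='code', remote='s3://my-bucket/code-mds', local='/tmp/code', proportion=0.2),
]
dataset = StreamingDataset(streams=streams, batch_size=8, shuffle=True)
```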
What's Changed
- Fix llama3 example yamls by @j316chuck in #1688
- Update example yamls to use newest foundry version by @snarayan21 in #1689
- Update datasets requirement from <2.21,>=2.20.0 to >=2.20.0,<3.2 by @dependabot in #1670
- Catch multiple slashes in source dataset into one slash by @KuuCi in #1697
- Make loaded peft adapters optionally trainable by @snarayan21 in #1701
- Adding preprocessors for QA and messages datasets by @ShashankMosaicML in #1700
- Update pycln by @b-chu in #1704
- Add permission error by @b-chu in #1703
- Update datasets requirement from <3.2,>=2.20.0 to >=2.20.0,<3.3 by @dependabot in #1698
- Bump coverage[toml] from 7.6.4 to 7.6.10 by @dependabot in #1702
- Update mosaicml-streaming to 0.11.0 by @es94129 in #1711
- Bump version to 0.17.0.dev0 by @irenedea in #1712
Full Changelog: v0.15.1...v0.16.0
v0.15.1
What's Changed
- Bump version 0.16.0.dev0 by @j316chuck in #1667
- Update mlflow requirement from <2.18,>=2.14.1 to >=2.14.1,<2.19 by @dependabot in #1673
- Speed up embedding tests by @dakinggg in #1668
- Add mcli yaml version bump by @j316chuck in #1674
- Bump Openai version by @snarayan21 in #1684
- Bump Streaming to v0.10.0 by @snarayan21 in #1685
- Bugfix auto packing with streams + no remote path by @mattyding in #1679
- Bump Composer to v0.28.0 by @snarayan21 in #1687
- Expose `DistributedSampler` RNG seed argument by @janEbert in #1677
- Add llama3 ft example yamls by @j316chuck in #1686
New Contributors
- @janEbert made their first contribution in #1677
Full Changelog: v0.15.0...v0.15.1
v0.15.0
New Features
Open Source Embedding + Contrastive Code (#1615)
LLM Foundry now supports finetuning embedding models with a contrastive loss. Negative passages for the contrastive loss can be either randomly selected or pre-defined. For more information, please view the readme.
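To make the setup concrete, below is a generic InfoNCE-style contrastive loss over a query, one positive passage, and several negative passages, written in plain PyTorch. It is only an illustration of the idea, not the llm-foundry implementation; see the readme for the supported configurations.

```python
# Generic InfoNCE-style contrastive loss: score the query against its positive
# passage and a set of negatives (random or pre-defined), then treat it as a
# classification problem where the positive is the correct class.
import torch
import torch.nn.functional as F


def contrastive_loss(
    query_emb: torch.Tensor,     # (batch, dim)
    positive_emb: torch.Tensor,  # (batch, dim)
    negative_emb: torch.Tensor,  # (batch, num_negatives, dim)
    temperature: float = 0.05,
) -> torch.Tensor:
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(positive_emb, dim=-1)
    n = F.normalize(negative_emb, dim=-1)

    pos_scores = (q * p).sum(dim=-1, keepdim=True)   # (batch, 1)
    neg_scores = torch.einsum('bd,bkd->bk', q, n)    # (batch, num_negatives)
    logits = torch.cat([pos_scores, neg_scores], dim=1) / temperature

    # The positive passage always sits at index 0 of the logits.
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```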
PyTorch 2.5.1 (#1665)
This release updates LLM Foundry to PyTorch 2.5.1, bringing with it the new features and optimizations in that release.
Improved error messages (#1657, #1660, #1623, #1625)
Various error messages have been improved, making it easier to debug user errors.
What's Changed
- Update mcli examples to use 0.14.0 by @irenedea in #1624
- Open Source Embedding + Contrastive Code by @KuuCi in #1615
- Catch delta table not found error by @milocress in #1625
- Add Mlflow 403 PL UserError by @mattyding in #1623
- Catches when data prep cluster fails to start by @milocress in #1628
- Bump mlflow max version by @dakinggg in #1629
- add another cluster connection failure wrapper by @milocress in #1630
- Add MLflow `log_model` option by @nancyhung in #1544
- Move loss generating token counting to the dataloader by @dakinggg in #1632
- Bump databricks-connect from 14.1.0 to 15.4.3 by @dependabot in #1636
- Fix dataset download location by @dakinggg in #1639
- Revert "Bump databricks-connect from 14.1.0 to 15.4.3" by @XiaohanZhangCMU in #1640
- Bump transformers version by @dakinggg in #1631
- Fix gpu tests test_tp_train and test_huggingface_conversion_callback_interval by @irenedea in #1642
- Update datasets requirement from <2.20,>=2.19 to >=2.20.0,<2.21 by @dependabot in #1330
- Add max shard size to transformers save_pretrained by @b-chu in #1648
- Update huggingface-hub requirement from <0.25,>=0.19.0 to >=0.19.0,<0.27 by @dependabot in #1652
- Update accelerate requirement from <0.34,>=0.25 to >=0.25,<1.2 by @dependabot in #1633
- Catch Delta Table Not Found by @KuuCi in #1653
- Add Exception for missing UC column by @milocress in #1654
- Infer step size for Embeddings by @KuuCi in #1647
- Pin FAv2 by @mvpatel2000 in #1656
- Retry catching BlockingIOError by @KuuCi in #1657
- Catch bad data prep by @milocress in #1644
- Update pytest-cov requirement from <6,>=4 to >=4,<7 by @dependabot in #1663
- Bump coverage[toml] from 7.6.1 to 7.6.4 by @dependabot in #1650
- Move transform_model_pre_registration in hf_checkpointer by @irenedea in #1664
- Catch Cluster Permissions Error by @KuuCi in #1660
- Mosaicml version bump by @j316chuck in #1661
- Changes for removing unused terms in CE loss fn by @gupta-abhay in #1643
- Update setuptools requirement from <68.0.0 to <76.0.0 by @dependabot in #1662
- Bump docker version to torch 2.5.1 by @j316chuck in #1665
- Bump ubuntu 22.04 + torch 2.5.1 by @KuuCi in #1666
New Contributors
- @mattyding made their first contribution in #1623
Full Changelog: v0.14.5...v0.15.0