v0.18.0
What's Changed
- Torch has been bumped to 2.6.0 (in #1740)
- Sparse support has been disabled in the latest megablocks version (as part of the latest torch upgrade) and we cascaded those disables to llm-foundry as well (for more details, view the megablocks release)
- TransformerEngine has been removed from the all dependency group due to version compatibility issues (in #1742). We expect to add this back in a future release.
- Transformers has been bumped to v4.49.0 (in #1735) and this would result in the master weights being torch.bfloat16 (view huggingface/transformers#36567 for more context). llm-foundry doesn't support master weights in lower precision, so we manually hardcoded this to torch.float32 when loading in #1734.
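To confirm that an environment picked up the bumps listed above, a small check along these lines may help. This is a minimal sketch and not part of llm-foundry: the helper names (`release_tuple`, `meets_minimum`, `check_environment`) are hypothetical, and the versions are the ones named in these notes treated as minimums, whereas the actual pins in the package may be ranges.

```python
import re
from importlib import metadata

# Versions named in these release notes. Assumption: treated as minimums here;
# the real dependency pins in llm-foundry may be expressed as ranges.
EXPECTED = {"torch": "2.6.0", "transformers": "4.49.0"}


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, padded to three parts.

    Local or pre-release suffixes such as '+cu124' or '.dev0' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # so '2.6' compares equal to '2.6.0'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric release components only."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_environment(expected: dict[str, str] = EXPECTED) -> dict[str, str]:
    """Map each package name to 'ok', 'too old (<version>)', or 'not installed'."""
    report = {}
    for name, minimum in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

The comparison deliberately looks only at the numeric release segment, so a CUDA-tagged torch build is treated the same as the plain release. For strict PEP 440 semantics, a dedicated version parser would be the more robust choice.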
Detailed Changes
- remove deprecated param by @bigning in #1727
- Bump TE for FA 2.7.1.post1 bump by @KuuCi in #1730
- Fix dtype issue in transformers by @dakinggg in #1734
- Bump composer to 0.29.0 by @rithwik-db in #1733
- Bump Transformer v4.49.0 by @KuuCi in #1735
- Bump FA2 to 2.7.4.post1 by @KuuCi in #1728
- Comment GHCR Image Upload by @KuuCi in #1739
- Remove TE from all dependency group by @dakinggg in #1742
- Bump torch to 2.6 by @rithwik-db in #1740
- Update Makefile to use WORLD_SIZE by @irenedea in #1751
New Contributors
- @rithwik-db made their first contribution in #1733
Full Changelog: v0.17.1...v0.18.0