Releases: TransformerLensOrg/TransformerLens
v2.15.0
Nice little update! This one improves compatibility with LLaMA 3.3 70B, adds a new Mistral model, and introduces a number of utilities for BERT (a short sketch follows the change list).
What's Changed
- Fixes compatibility with Llama 3.3 70B by @thisnick in #856
- Extend Bert support by @degenfabian in #829
- fixed bert indenting by @bryce13950 in #875
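As a minimal, hedged sketch of the extended BERT support above: TransformerLens exposes BERT-style models through `HookedEncoder`. The checkpoint name is illustrative, not a claim about what this release added.

```python
from transformer_lens import HookedEncoder

# Minimal sketch: load a BERT-style checkpoint and cache its activations.
# "bert-base-cased" is an illustrative choice of supported checkpoint.
bert = HookedEncoder.from_pretrained("bert-base-cased")
tokens = bert.to_tokens("TransformerLens now ships more BERT utilities.")
output, cache = bert.run_with_cache(tokens)
```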
Full Changelog: v2.14.1...v2.15.0
v2.14.1
Re-enables support for the most recent version of PyTorch.
What's Changed
- removed torch ceiling by @bryce13950 in #865
Full Changelog: v2.14.0...v2.14.1
v2.14.0
Much more robust, though still experimental, multi-GPU support! A short usage sketch follows the change list.
What's Changed
- moved setup python by @bryce13950 in #855
- Refactor device selection by @bryce13950 in #864
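A hedged sketch of what the device-selection refactor enables: `from_pretrained` accepts an `n_devices` argument that splits the model's layers across multiple GPUs. The checkpoint name is illustrative.

```python
from transformer_lens import HookedTransformer

# Minimal sketch, assuming two CUDA devices are available.
# The checkpoint name is illustrative; any large supported model works.
model = HookedTransformer.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct",
    n_devices=2,  # split layers across two GPUs
)
```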
Full Changelog: v2.13.0...v2.14.0
v2.13.0
A nice little maintenance release, plus a large expansion of the generate function to support vision models! A sketch of the generation API follows the change list.
What's Changed
- Upstream update by @bryce13950 in #840
- Manually create repr for partial hooks by @danbraunai in #845
- updated artifacts version by @bryce13950 in #850
- Upgrade transformers by @bryce13950 in #849
- Ci hf token empty by @bryce13950 in #853
- Add LLaVA support, modify generate function by @zazamrykh in #820
- Ci hf secret by @bryce13950 in #854
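A text-only sketch of the `generate` API touched by the LLaVA work in #820; the multimodal path additionally feeds image embeddings, which is beyond this snippet.

```python
from transformer_lens import HookedTransformer

# Minimal text-only sketch; "gpt2" is a stand-in model.
model = HookedTransformer.from_pretrained("gpt2")
completion = model.generate(
    "The quick brown fox",
    max_new_tokens=8,
    do_sample=False,  # greedy decoding for reproducibility
)
print(completion)
```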
New Contributors
- @danbraunai made their first contribution in #845
- @zazamrykh made their first contribution in #820
Full Changelog: v2.12.0...v2.13.0
v2.12.0
What's Changed
- updated lock command by @bryce13950 in #831
- Extend support for T5 models by @degenfabian in #832
- Added model Phi 4 by @jonasrohw in #833
- Phi 4 docs fix by @bryce13950 in #839
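As a hedged sketch of the extended T5 support above: encoder-decoder models load through `HookedEncoderDecoder`. The checkpoint name and the pad-token-as-decoder-start convention are assumptions based on standard T5 usage.

```python
import torch
from transformer_lens import HookedEncoderDecoder

# Minimal sketch, assuming the "t5-small" checkpoint.
model = HookedEncoderDecoder.from_pretrained("t5-small")
input_tokens = model.to_tokens("translate English to German: Hello!")
# T5 convention: decoding starts from the pad token.
decoder_tokens = torch.tensor([[model.tokenizer.pad_token_id]]).to(model.cfg.device)
logits = model(input_tokens, decoder_input=decoder_tokens)
```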
Full Changelog: v2.11.0...v2.12.0
v2.11.0
LLaMA 3.3 support! This release also includes a handful of usability improvements.
What's Changed
- Set prepend_bos to false by default for Qwen models by @degenfabian in #815
- Throw error when using attn_in with grouped query attention by @degenfabian in #810
- Feature llama 33 by @bryce13950 in #826
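A minimal sketch of the new Qwen `prepend_bos` default from #815; the checkpoint name is illustrative.

```python
from transformer_lens import HookedTransformer

# Illustrative Qwen checkpoint; any supported Qwen model behaves the same.
model = HookedTransformer.from_pretrained("Qwen/Qwen2-1.5B")

tokens = model.to_tokens("Hello world")  # no BOS prepended by default now
tokens_bos = model.to_tokens("Hello world", prepend_bos=True)  # opt back in per call
```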
Full Changelog: v2.10.0...v2.11.0
v2.10.0
Huge update, and likely the last big one of the 2.x series! It greatly improves model implementation accuracy and adds some of the newer Qwen models.
What's Changed
- Remove einsum in forward pass in AbstractAttention by @degenfabian in #783
- Colab compatibility bug fixes by @degenfabian in #794
- Remove einsum usage from create_alibi_bias function by @degenfabian in #781
- Actions token access by @bryce13950 in #797
- Remove einsum in apply_causal_mask in abstract_attention.py by @degenfabian in #782
- clarified arguments a bit for hook_points by @bryce13950 in #799
- Remove einsum in logit_attrs in ActivationCache by @degenfabian in #788
- Remove einsum in compute_head_results in ActivationCache by @degenfabian in #789
- Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer by @degenfabian in #791
- Remove einsum usage in _get_w_in_matrix in SVDInterpreter by @degenfabian in #792
- Remove einsum usage in forward function of BertMLMHead by @degenfabian in #793
- Set default_prepend_bos to False in Bloom model configuration by @degenfabian in #806
- Remove einsum in complex_attn_linear by @degenfabian in #790
- Add a demo of collecting activations from a single location in the model. by @adamkarvonen in #807
- Add support for Qwen_with_Questions by @degenfabian in #811
- Added support for Qwen2.5 by @israel-adewuyi in #809
- Updated devcontainers to use python3.11 by @jonasrohw in #812
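Most of the einsum removals above follow one mechanical pattern: permute the tensors so the contracted dimension lines up, then use a plain matmul. A toy check of the idea (shapes are illustrative, not the library's internals):

```python
import torch

# q, k: [batch, pos, head, d_head], as in attention-score computation.
q = torch.randn(2, 5, 4, 8)
k = torch.randn(2, 5, 4, 8)

scores_einsum = torch.einsum("bqhd,bkhd->bhqk", q, k)
# Einsum-free equivalent: move heads forward, then matmul over d_head.
scores_matmul = q.permute(0, 2, 1, 3) @ k.permute(0, 2, 3, 1)

assert torch.allclose(scores_einsum, scores_matmul, atol=1e-6)
```

And in the spirit of the new single-location activation demo (#807): `run_with_cache` takes a `names_filter`, so only one hook point is stored. The hook name below is one real example.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")
_, cache = model.run_with_cache("Hello world", names_filter="blocks.0.attn.hook_z")
print(cache["blocks.0.attn.hook_z"].shape)  # [batch, pos, head, d_head]
```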
New Contributors
- @israel-adewuyi made their first contribution in #809
- @jonasrohw made their first contribution in #812
Full Changelog: v2.9.1...v2.10.0
v2.9.1
Minor dependency update to address a change in an external dependency.
What's Changed
- added typeguard dependency by @bryce13950 in #786
Full Changelog: v2.9.0...v2.9.1
v2.9.0
Lots of accuracy improvements! A number of models now behave closer to how they behave in Transformers, and a new internal configuration option has been added for more ease of use! A sketch of the KV-cache fix follows the change list.
What's Changed
- fix the bug that attention_mask and past_kv_cache cannot work together by @yzhhr in #772
- Set prepend_bos to false by default for Bloom model family by @degenfabian in #775
- Fix that if use_past_kv_cache is set to True models from the Bloom family produce weird outputs. by @degenfabian in #777
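A hedged sketch of the combination fixed in #772, passing an explicit attention mask alongside a KV cache; the model choice and prompt are illustrative.

```python
import torch
from transformer_lens import HookedTransformer
from transformer_lens.past_key_value_caching import HookedTransformerKeyValueCache

model = HookedTransformer.from_pretrained("gpt2")  # stand-in model
tokens = model.to_tokens("The capital of France is")

cache = HookedTransformerKeyValueCache.init_cache(
    model.cfg, model.cfg.device, batch_size=tokens.shape[0]
)
attention_mask = torch.ones_like(tokens)

# Before this fix, supplying both arguments together could give wrong results.
logits = model(tokens, attention_mask=attention_mask, past_kv_cache=cache)
```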
New Contributors
- @yzhhr made their first contribution in #772
- @degenfabian made their first contribution in #775
Full Changelog: v2.8.1...v2.9.0
v2.8.1
New notebook for comparing models, and a bug fix for handling newer LLaMA models!
What's Changed
- Logit comparator tool by @curt-tigges in #765
- Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series by @Hzfinfdu in #764
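For a sense of what NTK-by-parts rotary scaling does (the Llama-3.1-style scheme from #764), here is a hedged sketch, not TransformerLens's exact code; the constants are Llama 3.1's published rope-scaling parameters.

```python
import math
import torch

def ntk_by_parts_freqs(freqs, factor=8.0, low_freq_factor=1.0,
                       high_freq_factor=4.0, old_context_len=8192):
    """Rescale rotary frequencies: slow the low-frequency (long-wavelength)
    components by `factor`, keep high-frequency ones, interpolate between."""
    low_wavelen = old_context_len / low_freq_factor
    high_wavelen = old_context_len / high_freq_factor
    wavelen = 2 * math.pi / freqs
    smooth = (old_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    scaled = torch.where(wavelen > low_wavelen, freqs / factor, freqs)
    medium = (wavelen <= low_wavelen) & (wavelen >= high_wavelen)
    return torch.where(medium, (1 - smooth) * freqs / factor + smooth * freqs, scaled)
```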
Full Changelog: v2.8.0...v2.8.1