Releases · TransformerLensOrg/TransformerLens

07 Sep 22:51

bryce13950

v3.0.0a8

ee9b44b

v3.0.0a8 Pre-release

Another update that rounds out the API for our new module

What's Changed

created new base config class by @bryce13950 in #1042
made sure to check for nested hooks by @bryce13950 in #1035
Fix warning for aliases when compatibility mode is turned off by @degenfabian in #1041
Feature kv cache by @bryce13950 in #1045
Split weights instead of logits for models with joint QKV matrix by @degenfabian in #1043

Full Changelog: v3.0.0a7...v3.0.0a8

Contributors

bryce13950 and degenfabian

28 Aug 22:00

bryce13950

v3.0.0a7

446b9d0

v3.0.0a7 Pre-release

What's Changed

map hook_pos_embed to rotary_emb, allow hook_aliases to be a list by @hijohnnylin in #1034

Full Changelog: v3.0.0a6...v3.0.0a7

Contributors

hijohnnylin

26 Aug 20:16

bryce13950

v3.0.0a6

285c039

v3.0.0a6 Pre-release

Big Release! A whole bunch of optimizations, and second passes on certain parts of the TransformerBridge to get us closer to beta.

What's Changed

added setters and hook utils to bridge by @bryce13950 in #1009
updated property access by @bryce13950 in #1026
feat: Bridge.boot should allow using alias model names, but show a deprecation warning by @hijohnnylin in #1028
Move QKV separation into bridge that wraps QKV matrix by @degenfabian in #1027
removed unnecessary import by @bryce13950 in #1030
Attn pattern shape by @bryce13950 in #1029
added cache layer for hook collection by @bryce13950 in #1032
Bridge unit test compatibility coverage by @bryce13950 in #1031
updated loading in interactive neuroscope demo to use transformer bridge by @degenfabian in #1017

New Contributors

@hijohnnylin made their first contribution in #1028

Full Changelog: v3.0.0a5...v3.0.0a6

Contributors

hijohnnylin, bryce13950, and degenfabian

16 Aug 09:29

bryce13950

v3.0.0a5

4ece3c4

v3.0.0a5 Pre-release

First new architecture for the TransformerBridge, and a whole lot closer to beta!

What's Changed

Weight conversion renaming by @bryce13950 in #996
Attention shape normalization by @bryce13950 in #997
Joint hook handling by @bryce13950 in #1001
Add compatibility_mode feature by @degenfabian in #998
Add support for GPT-OSS by @degenfabian in #1004
Fix GPT-OSS initialization error by @degenfabian in #1007

Full Changelog: v3.0.0a4...v3.0.0a5

Contributors

bryce13950 and degenfabian

05 Aug 15:37

bryce13950

v3.0.0a4

5699d73

v3.0.0a4 Pre-release

Big update that brings us a lot closer to beta! This update adds a compatibility layer for a lot of legacy properties of the old hooked root modules.

What's Changed

Unified aliases by @bryce13950 in #991
fixed hook alias positions by @bryce13950 in #992
Create bridge for every module in Mixtral by @degenfabian in #984
removed numpy ceiling by @bryce13950 in #994
Ensure hook and property backwards compatibility with HookedTransformer by @degenfabian in #990
Create bridge for every module in neox by @degenfabian in #995
Create bridges for every module in neo by @degenfabian in #987

Full Changelog: v3.0.0a3...v3.0.0a4

Contributors

bryce13950 and degenfabian

27 Jul 21:04

bryce13950

v3.0.0a3

b161fd5

v3.0.0a3 Pre-release

New alpha release! Many changes have been added. More HookedTransformer functionality has been brought over, and many architectures have been improved to give more options in our new module. These changes have noticeably improved compatibility with code written for the old HookedTransformer.

What's Changed

Setup deprecated hook aliases and got the majority of the main demo running properly by @bryce13950 in #976
Linear test coverage by @bryce13950 in #977
Create Bridge for every Gemma 3 module by @degenfabian in #966
Add Bridges for every module in GPT2 by @degenfabian in #967
Cache hook aliases & stop at layer by @bryce13950 in #978
Create Bridges for every module in Bloom models by @degenfabian in #970
Create Bridges for every module in Gemma 2 by @degenfabian in #971
Create bridges for every module in Gemma 1 by @degenfabian in #972
Create bridges for every module in Mistral by @degenfabian in #979
Remove that output_attention flag defaults to true in boot function by @degenfabian in #982
Create bridge for every module in GPT-J by @degenfabian in #974
Create bridge for every module in Llama by @degenfabian in #975

Full Changelog: v3.0.0a2...v3.0.0a3

Contributors

bryce13950 and degenfabian

21 Jul 22:04

bryce13950

v3.0.0a2

bfb8626

v3.0.0a2 Pre-release

This release has no functional changes. The first alpha release showed that the CI could not publish to PyPI with PEP 440-style alpha tags; this release makes that possible. Please consult the release notes for v3.0.0a1 for full information on the 3.x alpha.
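For readers unfamiliar with the tag format: PEP 440 writes an alpha pre-release as a suffix such as `a1` on the version number and orders it before the corresponding final release. The sketch below illustrates that ordering for tags that appear on this page. `parse_tag` and `is_prerelease` are hypothetical helpers written for this illustration and handle only the tag shapes seen here; they are not part of TransformerLens or its CI, and a real project would more likely rely on a full PEP 440 parser.

```python
import re

# Hypothetical helper pattern: matches only tags shaped like "v2.16.1" or
# "v3.0.0a8". This is an illustration, not a full PEP 440 parser.
_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:a(\d+))?$")


def parse_tag(tag: str) -> tuple:
    """Turn a release tag into a tuple that sorts in release order."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported tag: {tag!r}")
    major, minor, patch, alpha = match.groups()
    # A final release sorts after every alpha of the same version:
    # alpha N becomes (0, N), a final release becomes (1, 0).
    stage = (1, 0) if alpha is None else (0, int(alpha))
    return (int(major), int(minor), int(patch), stage)


def is_prerelease(tag: str) -> bool:
    """True when the tag carries an alpha suffix."""
    return parse_tag(tag)[3][0] == 0


tags = ["v3.0.0a2", "v2.16.1", "v3.0.0a8", "v2.16.0", "v3.0.0a1"]
ordered = sorted(tags, key=parse_tag)
latest_stable = max((t for t in tags if not is_prerelease(t)), key=parse_tag)

print(ordered)
print(latest_stable)
```

With these inputs, the stable 2.16.x tags sort before every 3.0.0 alpha, and the newest non-alpha tag is v2.16.1, which matches the release marked "Latest" on this page. Note that pip skips pre-releases by default, so installing an alpha generally requires the `--pre` flag or an explicit version pin.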

What's Changed

Pre release version publishing by @bryce13950 in #973

Full Changelog: v3.0.0a1...v3.0.0a2

Contributors

bryce13950

18 Jul 17:29

bryce13950

v3.0.0a1

a84ce55

v3.0.0a1 Pre-release

Big release coming up! This release adds a new module named TransformerBridge, which will greatly increase the flexibility and extensibility of TransformerLens. The module is still very experimental, but we are looking for people who are ready to test it. This version already supports more models than any of the existing HookedModules, and we are working through a number of scripts to ensure full compatibility with any existing code that uses those HookedModules.

If you are interested in helping us test some of this, let us know on the Slack channel! If you want to use any models not currently supported in HookedModules, please feel free to submit any scripts currently running with an existing HookedModule to https://github.com/TransformerLensOrg/BridgeComaptibilityScripts. All scripts in this repo will be confirmed to run, and to match the current HookedTransformer output, before the final 3.0.0 release is published.

What's Changed

Refactor the utilities file into utilities folder by @starship006 in #628
Raise exception when BERT is loaded with HookedTransformer instead of… by @degenfabian in #795
Circular dependency resolution by @bryce13950 in #803
fixed corner param by @bryce13950 in #817
bumped python min version by @bryce13950 in #802
Updates torch to use the most recent version by @bryce13950 in #822
updated python requirements by @bryce13950 in #821
Recent releases by @bryce13950 in #841
updated mypy limit by @bryce13950 in #880
Activation utils cleanup by @bryce13950 in #879
Restore consistency of hook_normalized between LayerNorm and RMSNorm by @degenfabian in #770
Fix that padding_side always defaults to "right" when no value is explicitly passed by @degenfabian in #814
Unified conversions by @bryce13950 in #881
Flatten state dictionary for proper weight loading by @degenfabian in #860
enabled actions on action pr by @bryce13950 in #882
Add weight conversion for Phi model by @degenfabian in #863
Add weight conversion for T5 models by @degenfabian in #859
Visualize weight conversions by @degenfabian in #852
Fixed test for ensuring weight conversions are provided by @bryce13950 in #883
Drop python 3.9 by @bryce13950 in #885
Conversion improved test coverage by @bryce13950 in #886
Component test coverage by @bryce13950 in #890
Bug new loading by @bryce13950 in #891
Weight conversion llama by @bryce13950 in #892
Refactor supported models module by @bryce13950 in #893
Bug neox by @bryce13950 in #895
added conditional check for hugging face by @bryce13950 in #919
created a seperate list of models to test for public PRs by @bryce13950 in #920
added alternative when hf token is not included by @bryce13950 in #921
shrunk loss test by @bryce13950 in #922
Fix broken test, per issue #913 by @JasonBenn in #914
Fix loading on specific device by @mntss in #906
Feature model adapter by @bryce13950 in #928
added test for making sure formatting works well by @bryce13950 in #932
Refactor final issues by @bryce13950 in #933
restored tokenizer content by @bryce13950 in #935
Refactor weight conversion by @bryce13950 in #931
Add qwen3 by @mntss in #937
Improve ActivationCache docs by @BorisTheBrave in #901
Feature: Get the value for rotary base from the hugging face config, only for Qwen for now. by @Gusanidas in #887
added python 3.13 to CI by @bryce13950 in #843
updated mypy by @bryce13950 in #940
updated numpy dependency by @bryce13950 in #943
upated torch by @bryce13950 in #942
updated transformers by @bryce13950 in #939
Fixed Qwen 3 docs issues by @bryce13950 in #946
upstream fixes from dev by @bryce13950 in #941
Flexible component mapping by @bryce13950 in #938
updated sphinx by @bryce13950 in #948
removed dependency by @bryce13950 in #951
Move flatten dictionary to architecture_conversion by @degenfabian in #936
made new transformer bridge extend nn module properly by @bryce13950 in #955
brought in remaining hooked transformer functions by @bryce13950 in #954
Setup tokenizer in boot function by @degenfabian in #959
Bridged Robust Model Structure by @bryce13950 in #960
Remove transformers dependency from bridge tokenization by @degenfabian in #963
Dynamically add boot function to bridge by @degenfabian in #964

New Contributors

@JasonBenn made their first contribution in #914
@BorisTheBrave made their first contribution in #901
@Gusanidas made their first contribution in #887

Full Changelog: v2.15.4...v3.0.0a1

Contributors

BorisTheBrave, JasonBenn, and 5 other contributors

19 Jun 13:20

bryce13950

v2.16.1

a634e57

v2.16.1 Latest

Minor release with security dependency bumps

What's Changed

updated sphinx by @bryce13950 in #948
removed dependency by @bryce13950 in #951

Full Changelog: v2.16.0...v2.16.1

Contributors

bryce13950

12 Jun 18:19

bryce13950

v2.16.0

e1c7506

v2.16.0

What's Changed

added conditional check for hugging face by @bryce13950 in #919
created a seperate list of models to test for public PRs by @bryce13950 in #920
added alternative when hf token is not included by @bryce13950 in #921
shrunk loss test by @bryce13950 in #922
Fix broken test, per issue #913 by @JasonBenn in #914
Fix loading on specific device by @mntss in #906
Add qwen3 by @mntss in #937
Improve ActivationCache docs by @BorisTheBrave in #901
Feature: Get the value for rotary base from the hugging face config, only for Qwen for now. by @Gusanidas in #887
updated mypy by @bryce13950 in #940
updated numpy dependency by @bryce13950 in #943
upated torch by @bryce13950 in #942
updated transformers by @bryce13950 in #939
Fixed Qwen 3 docs issues by @bryce13950 in #946

New Contributors

@JasonBenn made their first contribution in #914
@BorisTheBrave made their first contribution in #901
@Gusanidas made their first contribution in #887

Full Changelog: v2.15.4...v2.16.0

Contributors

BorisTheBrave, JasonBenn, and 3 other contributors
