Releases: decoderesearch/SAELens
v6.1.0
v6.1.0 (2025-07-16)
Chore
- chore: add more test coverage (#504)
- chore: more tests for pretrained_sae_loaders
- adding some more tests for llm_sae_training_runner evals
- more load_model tests
- adding more tests
- adding more tests and removing unused methods
- more minor tests
- deleting more unused code
- deleting more unused code and adding more tests
- tweaking equivalence test (f4ba87a)
Feature
- feat: BatchTopK SAE training (#505)
- feat: BatchTopK SAE training
- Update sae_lens/saes/batchtopk_sae.py
Co-authored-by: Copilot <[email protected]>
- removing unused post-act fn from topk/batchtopk
- removing mention of 'noise' from comments
- adding batchtopk to ALL_ARCHITECTURES
- fixing tests / simplifying config munging
- more docs tweaks
Co-authored-by: Copilot <[email protected]> (7b7c283)
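For readers unfamiliar with the BatchTopK architecture added in #505: the general idea is to keep the `k * batch_size` largest activations across the whole batch, rather than the top `k` for each sample. The sketch below is a pure-Python illustration of that selection rule only. It is not the code in `sae_lens/saes/batchtopk_sae.py`, and the function name `batch_topk` and the nested-list input format are assumptions made for the example.

```python
def batch_topk(pre_acts, k):
    """Keep the k * batch_size largest activations over the whole batch.

    pre_acts is a list of rows (one row per sample). Everything that is not
    selected is set to 0.0. Hypothetical helper, for illustration only.
    """
    batch_size = len(pre_acts)
    # Apply ReLU and remember where each value came from.
    entries = [
        (max(value, 0.0), row_idx, col_idx)
        for row_idx, row in enumerate(pre_acts)
        for col_idx, value in enumerate(row)
    ]
    # Rank all activations in the batch together, largest first.
    entries.sort(key=lambda entry: entry[0], reverse=True)

    out = [[0.0] * len(row) for row in pre_acts]
    for value, row_idx, col_idx in entries[: k * batch_size]:
        out[row_idx][col_idx] = value
    return out


# With k=1 and two samples, the budget is 2 activations for the whole batch.
# Both winners come from the first sample, which a per-sample top-k would not allow.
print(batch_topk([[5.0, 4.0, 3.0], [0.5, 0.2, -1.0]], k=1))
# → [[5.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
```

The point of the example is that the sparsity budget is shared: some samples may keep more than `k` activations and others fewer, while the batch average stays at `k`.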
Unknown
v6.0.0
v6.0.0 (2025-07-14)
Breaking
- feat: v6.0.0 release
BREAKING CHANGE: v6 release (6dbb447)
Unknown
- SAELens v6 (#500)
- Reorganized folders
- fixing toolkit paths
- First draft of base and standard SAEs
- Working implementation of separated standard SAE
- working, tested equivalents to old sae types
- training SAE wrapper now passing tests
- SAEs passing most tests
- All tests passing with restructured SAE classes
- HookedSAETransformer now passing tests
- formatting fixes
- importing changes to topk SAE aux loss from #432
- extracts logging fields to class (#444)
- extracts logging fields to class
- applies make format
- fixes comment
- changes default value for wandb_project
- fixing equivalence tests
- getting tests passing again
- removing proxy classes
- fixing type errors
- move tests around
- allow running deploy on alpha and beta branches
BREAKING CHANGE: incompatible config structure
- updating workflow to run on pushes to alpha and beta
- updating semantic release config to hopefully run alpha/beta builds
- feat: refactor config
BREAKING CHANGE: refactor config options
- 6.0.0-rc.1
Automatically generated by python-semantic-release
- Refactor arch configs (#468)
- wip: refactoring architecture configs
- move to separate config classes per architecture
- wip: fixing tests
- working on getting more tests passing
- fixing more tests / typing issues
- adding support for converting from old config formats
- fixing more tests
- fixing more tests
- fixing more tests and typings
- fixing more tests
- fixing linting
- ensuring runner config is also uploaded if available
- fix init to match old heuristic init
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/refactor_compatibility/test_jumprelu_sae_equivalence.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/refactor_compatibility/test_jumprelu_sae_equivalence.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing formatting
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing registry.py naming
- Update sae_lens/saes/sae.py
Co-authored-by: Anthony Duong <[email protected]>
- rename meta -> metadata
- fixing tests
- updating docs
- fixing docs
- fixing tests
Co-authored-by: Anthony Duong <[email protected]>
- chore: versioned docs (#485)
- feat: arch configs
- 6.0.0-rc.2
Automatically generated by python-semantic-release
- chore: improving mkdocs setup with mike (#487)
- updating docs to install pre-release version of saelens
- feat: Decouple training from LLM activation store / LLM evals (#496)
- Refactor activations store to be an iterator
- fixing some activations store tests
- wip: removing hook_layer param
- adding early stopping support to the HF model proxy wrapper
- WIP removing dependency of act store and model from trainer
- fixing tests
- fixing activations cache runner
- fixing linting and types
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner_multiple_devices.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner_multiple_devices.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_mixing_buffer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/load_model.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- fixes from CR
- fixing type checking - thanks Anthony!
- fixing formatting
- fixing notebook
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
Co-authored-by: Anthony Duong <[email protected]>
- 6.0.0-rc.3
Automatically generated by python-semantic-release
- feat: Final 6.0.0 changes (#499)
- fixing metadata issues in SAELens
- changing from_pretrained() to directly return the SAE instead of a tuple of objects
- fixing call error
- fixing test_sae_fold_w_dec_norm_all_architectures
- removing unused file
- fixing bugs around serializing and unserializing configs
- deleting layer_norm normalization - this is never used
- removing mse_normalization, tanh-relu, and noise_scale, as these are basically never used
- fixing test
- fixing eval_all_loadable_saes benchmark
- letting claude write some tests for neuronpedia integration
- claude continuing to go ham on neuronpedia tests
- fixing LLM CLI runner
- fixing linting
- turning SAEMetadata into more of a dict-like class to decouple it from LLM-trained SAEs
- cleaning up tutorials
- adding migration docs
- further doc updates
- Update docs/migrating.md
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/saes/sae.py
Co-authored-by: Anthony Duong <[email protected]>
- Update docs/training_saes.md
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/test_llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing formatting
- fixing typing in test
- fixing connor_rob_hook_z SAE loading
- fixing hook_z reshaping for pretrained saes more robustly
Co-authored-by: Anthony Duong <[email protected]>
- 6.0.0-rc.4
Automatically generated by python-semantic-release
- try bumping cache to solve CI disk space issue
- allow old usage of SAE.from_pretrained() to continue to work (#501)
- feat: make from_pretrained() gracefully fall back to original behavior
- 6.0.0-rc.5
Automatically generated by python-semantic-release
Co-authored-by: Curt Tigges <[email protected]>
Co-authored-by: Curt Tigges <[email protected]>
Co-authored-by: Anthony Duong <[email protected]>
Co-authored-by: github-actions <[email protected]> (b866149)
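Two entries in this release concern `from_pretrained()`: #499 changes it to return the SAE directly instead of a tuple, and #501 lets the old tuple-unpacking usage keep working. The changelog does not say how the fallback is implemented. The sketch below shows one generic way such a compatibility shim can be built in Python, by making the returned object iterable and emitting a deprecation warning when it is unpacked. The class `LoadedSAE`, the function `from_pretrained` defined here, and the field names are all hypothetical stand-ins, not the SAELens API.

```python
import warnings


class LoadedSAE:
    """Hypothetical stand-in for a loaded SAE; not the SAELens class."""

    def __init__(self, name, cfg_dict, sparsity=None):
        self.name = name
        self.cfg_dict = cfg_dict
        self.sparsity = sparsity

    def __iter__(self):
        # Old call sites unpack three values; keep that working, but warn.
        warnings.warn(
            "Unpacking the result of from_pretrained() is deprecated; "
            "use the returned object directly.",
            DeprecationWarning,
            stacklevel=2,
        )
        return iter((self, self.cfg_dict, self.sparsity))


def from_pretrained(name):
    """Hypothetical loader that returns a single object."""
    return LoadedSAE(name, {"d_in": 768})


# New style: the call returns the object itself.
sae = from_pretrained("example")

# Old style: tuple unpacking still works, with a deprecation warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_sae, cfg_dict, sparsity = from_pretrained("example")
```

For the actual upgrade path, the migration docs added in #499 (`docs/migrating.md`) are the place to look.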
v6.0.0-rc.5
v6.0.0-rc.4
v6.0.0-rc.4 (2025-07-14)
Feature
- feat: Final 6.0.0 changes (#499)
- fixing metadata issues in SAELens
- changing from_pretrained() to directly return the SAE instead of a tuple of objects
- fixing call error
- fixing test_sae_fold_w_dec_norm_all_architectures
- removing unused file
- fixing bugs around serializing and unserializing configs
- deleting layer_norm normalization - this is never used
- removing mse_normalization, tanh-relu, and noise_scale, as these are basically never used
- fixing test
- fixing eval_all_loadable_saes benchmark
- letting claude write some tests for neuronpedia integration
- claude continuing to go ham on neuronpedia tests
- fixing LLM CLI runner
- fixing linting
- turning SAEMetadata into more of a dict-like class to decouple it from LLM-trained SAEs
- cleaning up tutorials
- adding migration docs
- further doc updates
- Update docs/migrating.md
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/saes/sae.py
Co-authored-by: Anthony Duong <[email protected]>
- Update docs/training_saes.md
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/test_llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing formatting
- fixing typing in test
- fixing connor_rob_hook_z SAE loading
- fixing hook_z reshaping for pretrained saes more robustly
Co-authored-by: Anthony Duong <[email protected]> (62dca28)
v5.11.0
v6.0.0-rc.3
v6.0.0-rc.3 (2025-07-11)
Chore
Feature
- feat: Decouple training from LLM activation store / LLM evals (#496)
- Refactor activations store to be an iterator
- fixing some activations store tests
- wip: removing hook_layer param
- adding early stopping support to the HF model proxy wrapper
- WIP removing dependency of act store and model from trainer
- fixing tests
- fixing activations cache runner
- fixing linting and types
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner_multiple_devices.py
Co-authored-by: Anthony Duong <[email protected]>
- Update benchmark/test_language_model_sae_runner_multiple_devices.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_mixing_buffer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/llm_sae_training_runner.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/load_model.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_activation_scaler.py
Co-authored-by: Anthony Duong <[email protected]>
- fixes from CR
- fixing type checking - thanks Anthony!
- fixing formatting
- fixing notebook
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/training/test_sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
Co-authored-by: Anthony Duong <[email protected]> (d9921e2)
Unknown
- updating docs to install pre-release version of saelens (f1c0586)
v5.10.7
v5.10.6
v6.0.0-rc.2
v6.0.0-rc.2 (2025-05-28)
Chore
Feature
- feat: arch configs (b6b77f3)
Unknown
- Refactor arch configs (#468)
- wip: refactoring architecture configs
- move to separate config classes per architecture
- wip: fixing tests
- working on getting more tests passing
- fixing more tests / typing issues
- adding support for converting from old config formats
- fixing more tests
- fixing more tests
- fixing more tests and typings
- fixing more tests
- fixing linting
- ensuring runner config is also uploaded if available
- fix init to match old heuristic init
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/training/sae_trainer.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/refactor_compatibility/test_jumprelu_sae_equivalence.py
Co-authored-by: Anthony Duong <[email protected]>
- Update tests/refactor_compatibility/test_jumprelu_sae_equivalence.py
Co-authored-by: Anthony Duong <[email protected]>
- Update sae_lens/util.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing formatting
- Update sae_lens/config.py
Co-authored-by: Anthony Duong <[email protected]>
- changes from CR
- fixing registry.py naming
- Update sae_lens/saes/sae.py
Co-authored-by: Anthony Duong <[email protected]>
- rename meta -> metadata
- fixing tests
- updating docs
- fixing docs
- fixing tests
Co-authored-by: Anthony Duong <[email protected]> (5063a29)