Releases: decoderesearch/SAELens

v1.7.0

08 May 15:47

v1.7.0 (2024-05-08)

Feature

  • feat: Add torch compile (#129)

  • Surface # of eval batches and # of eval sequences

  • fix formatting

  • config changes

  • add compilation to lm_runner.py

  • remove accidental print statement

  • formatting fix (5c41336)

  • feat: Change eval batch size (#128)

  • Surface # of eval batches and # of eval sequences

  • fix formatting

  • fix print statement accidentally left in (758a50b)

v1.6.1

07 May 16:45

v1.6.1 (2024-05-07)

Fix

  • fix: Revert "feat: Add kl eval (#124)" (#127)

This reverts commit c1d9cbe. (1a0619c)

v1.6.0

07 May 14:11

v1.6.0 (2024-05-07)

Feature

  • feat: Add bf16 autocast (#126)

  • add bf16 autocast and gradient scaling

  • simplify autocast setup

  • remove completed TODO

  • add autocast dtype selection (generally keep bf16)

  • formatting fix

  • remove autocast dtype (8e28bfb)

v1.5.0

07 May 12:00

v1.5.0 (2024-05-07)

Feature

  • feat: Add kl eval (#124)

  • add kl divergence to evals.py

  • fix linter (c1d9cbe)
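As a rough illustration of the metric this entry names, the sketch below computes the KL divergence between two next-token distributions given as logits. It is a minimal pure-Python sketch, not the code in evals.py: the function names, the example logits, and the idea of comparing the original model's logits against logits from a run with the SAE reconstruction substituted in are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def kl_divergence(original_logits, patched_logits):
    """KL(P || Q) with P = softmax(original_logits), Q = softmax(patched_logits)."""
    log_p = log_softmax(original_logits)
    log_q = log_softmax(patched_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical logits over a 3-token vocabulary.
original = [2.0, 0.5, -1.0]
patched = [1.5, 0.7, -0.8]

kl_same = kl_divergence(original, original)  # identical distributions
kl_diff = kl_divergence(original, patched)   # distributions that differ
```

A value near zero means the substituted activations barely change the model's predictions; larger values mean the predictions have drifted further from the original.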

Unknown

  • major: How we train saes replication (#123)

  • l1 scheduler, clip grad norm

  • add provisional ability to normalize activations

  • notebook

  • change heuristic norm init to constant, report b_e and W_dec norms (fix tests later)

  • fix mse calculation

  • add benchmark test

  • update heuristic init to 0.1

  • make tests pass device issue

  • continue rebase

  • use better args in benchmark

  • remove stack in get activations

  • broken! improve CA runner

  • get cache activation runner working and add some tests

  • add training steps to path

  • avoid ghost grad tensor casting

  • enable download of full dataset if desired

  • add benchmark for cache activation runner

  • add updated tutorial

  • format


Co-authored-by: Johnny Lin <[email protected]> (5f46329)

v1.4.0

05 May 00:47

v1.4.0 (2024-05-05)

Feature

  • feat: Store state to allow resuming a run (#106)

  • first pass of saving

  • added runner resume code

  • added auto detect most recent checkpoint code

  • make linter happy (and one small bug)

  • black code formatting

  • isort

  • help pyright

  • black reformatting:

  • activations store flake

  • pyright typing

  • black code formatting

  • added test for saving and loading

  • bigger training set

  • black code

  • move to pickle

  • use pickle because safetensors doesn't support all the stuff needed for optimizer and scheduler state

  • added resume test

  • added wandb_id for resuming

  • use wandb id for checkpoint

  • moved loaded to device and minor fixes to resuming


Co-authored-by: David Chanin <[email protected]> (4d12e7a)
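The resume flow described in this entry (pickle the training state, then auto-detect the most recent checkpoint) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the SAELens implementation: the directory layout (one folder per step), the file name training_state.pkl, and the helper names are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

STATE_FILE = "training_state.pkl"  # hypothetical file name


def save_checkpoint(run_dir, step, state):
    """Pickle an arbitrary state dict (e.g. optimizer and scheduler state)."""
    ckpt_dir = Path(run_dir) / str(step)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    path = ckpt_dir / STATE_FILE
    with open(path, "wb") as f:
        pickle.dump(state, f)
    return path


def latest_checkpoint(run_dir):
    """Return the state file with the highest step number, or None."""
    steps = [
        int(p.name)
        for p in Path(run_dir).iterdir()
        if p.is_dir() and p.name.isdigit()
    ]
    if not steps:
        return None
    # Compare as integers: as strings, "900" would sort after "1000".
    return Path(run_dir) / str(max(steps)) / STATE_FILE


def load_checkpoint(path):
    """Load a pickled state dict. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as run_dir:
    save_checkpoint(run_dir, 900, {"n_training_steps": 900, "lr": 1e-3})
    save_checkpoint(run_dir, 1000, {"n_training_steps": 1000, "lr": 5e-4})
    resumed = load_checkpoint(latest_checkpoint(run_dir))
```

Pickle accepts nearly any Python object, which fits the reason given above for moving away from safetensors; the trade-off is that unpickling can execute arbitrary code, so checkpoints should only be loaded from trusted sources.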

Unknown

  • Fix: sparsity norm calculated at incorrect dimension. (#119)

  • Fix: sparsity norm calculated at incorrect dimension.

For L1 this does not affect anything, as it essentially computes abs() and averages everything. For L2 this is problematic, as L2 involves a sum and a sqrt. Unexpected behavior occurs when x has shape (batch, sen_length, hidden_dim).

  • Added tests.

  • Changed sparsity calculation to handle 3d inputs. (ce95fb2)
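To make the dimension issue concrete, the sketch below takes an Lp norm along one axis of a 3D input and then averages the results. It is a pure-Python illustration on nested lists, not the SAELens code; the helper name norm_then_mean and the toy input are hypothetical, and the "norm along one axis, then mean" structure is an assumption about how the sparsity term is computed.

```python
from itertools import product


def norm_then_mean(x, axis, p):
    """Lp norm along `axis` of a (batch, seq, hidden) nested list, then mean."""
    shape = (len(x), len(x[0]), len(x[0][0]))
    a, b = [d for d in range(3) if d != axis]  # the two axes that are kept
    norms = []
    for i, j in product(range(shape[a]), range(shape[b])):
        total = 0.0
        for k in range(shape[axis]):
            idx = [0, 0, 0]
            idx[a], idx[b], idx[axis] = i, j, k
            total += abs(x[idx[0]][idx[1]][idx[2]]) ** p
        norms.append(total ** (1.0 / p))
    return sum(norms) / len(norms)


# Toy input of shape (batch=1, seq=2, hidden=2).
x = [[[3.0, 4.0], [0.0, 0.0]]]

l2_hidden = norm_then_mean(x, axis=2, p=2)  # norm over hidden_dim (intended)
l2_seq = norm_then_mean(x, axis=1, p=2)     # norm over the sequence axis
l1_hidden = norm_then_mean(x, axis=2, p=1)
l1_seq = norm_then_mean(x, axis=1, p=1)
```

In this toy case the two L1 values agree, because summing absolute values and averaging gives the same total either way (for inputs where seq and hidden differ in size, the two differ only by a constant factor set by the shape). The two L2 values differ because the square root is applied to different groups of entries.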

v1.3.0

03 May 10:11

v1.3.0 (2024-05-03)

Feature

  • feat: add activation bins for neuronpedia outputs, and allow customizing quantiles (#113) (05d650d)

  • feat: Update for Neuronpedia auto-interp (#112)

  • cleanup Neuronpedia autointerp code

  • Fix logic bug with OpenAI key


Co-authored-by: Joseph Bloom <[email protected]> (033283d)

  • feat: SparseAutoencoder.from_pretrained() similar to transformer lens (#111)

  • add partial work so David can continue

  • feat: adding a SparseAutoencoder.from_pretrained() function


Co-authored-by: jbloomaus <[email protected]> (617d416)

Fix

  • fix: replace list_files_info with list_repo_tree (#117) (676062c)

  • fix: Improved activation initialization, fix using argument to pass in API key (#116) (7047bcc)

v1.2.0

29 Apr 16:18

v1.2.0 (2024-04-29)

Feature

  • feat: breaks up SAE.forward() into encode() and decode() (#107)

  • breaks up SAE.forward() into encode() and decode()

  • cleans up return typing of encode by splitting into a hidden and public function (7b4311b)
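The split described in this entry can be pictured with a toy autoencoder in which forward() is just decode(encode(x)), and the public encode() wraps a private helper that also returns the pre-activation values. This is a minimal sketch on plain Python lists, not the SAELens class: TinySAE, _encode_with_hidden_pre, and the ReLU encoder with input centered by the decoder bias are assumptions chosen for illustration.

```python
class TinySAE:
    """Toy sparse autoencoder on Python lists (hypothetical, illustration only)."""

    def __init__(self, W_enc, b_enc, W_dec, b_dec):
        self.W_enc = W_enc  # shape (d_in, d_sae)
        self.b_enc = b_enc  # shape (d_sae,)
        self.W_dec = W_dec  # shape (d_sae, d_in)
        self.b_dec = b_dec  # shape (d_in,)

    def _encode_with_hidden_pre(self, x):
        """Private helper: returns (feature_acts, hidden_pre)."""
        centered = [xi - bi for xi, bi in zip(x, self.b_dec)]
        hidden_pre = [
            sum(c * self.W_enc[i][j] for i, c in enumerate(centered))
            + self.b_enc[j]
            for j in range(len(self.b_enc))
        ]
        feature_acts = [max(0.0, h) for h in hidden_pre]  # ReLU
        return feature_acts, hidden_pre

    def encode(self, x):
        """Public API: always returns feature activations only."""
        feature_acts, _ = self._encode_with_hidden_pre(x)
        return feature_acts

    def decode(self, feature_acts):
        """Map feature activations back to the input space."""
        return [
            sum(f * self.W_dec[j][i] for j, f in enumerate(feature_acts))
            + self.b_dec[i]
            for i in range(len(self.b_dec))
        ]

    def forward(self, x):
        return self.decode(self.encode(x))


identity = [[1.0, 0.0], [0.0, 1.0]]
sae = TinySAE(identity, [0.0, 0.0], identity, [0.0, 0.0])
features = sae.encode([1.0, -2.0])
reconstruction = sae.forward([1.0, -2.0])
```

Keeping the tuple-returning variant private gives encode() a single return type, while training code that needs the pre-activations can still call the helper.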

v1.1.0

29 Apr 11:57

v1.1.0 (2024-04-29)

Feature

  • feat: API for generating autointerp + scoring for neuronpedia (#108)

  • API for generating autointerp for neuronpedia

  • Undo pytest vscode setting change

  • Fix autointerp import

  • Use pypi import for automated-interpretability (7c43c4c)

v1.0.0

27 Apr 17:22

v1.0.0 (2024-04-27)

Breaking

  • chore: empty commit to bump release

BREAKING CHANGE: v1 release (2615a3e)

Chore

  • chore: fix outdated lr_scheduler_name in docs (#109)

  • chore: fix outdated lr_scheduler_name in docs

  • add tutorial hparams (7cba332)

Unknown

  • BREAKING CHANGE: 1.0.0 release

BREAKING CHANGE: 1.0.0 release (c23098f)

v0.7.0

24 Apr 09:17

Choose a tag to compare

v0.7.0 (2024-04-24)

Feature

  • feat: make a neuronpedia list with features via api call (#101) (23e680d)

Unknown

  • Merge pull request #100 from jbloomAus/np_improvements

Improvements to Neuronpedia Runner (5118f7f)

  • neuronpedia: save run settings to json file to avoid errors when resuming later. automatically skip batch files that already exist (4b5412b)

  • skip batch file if it already exists (7d0e396)

  • neuronpedia: include log sparsity threshold in skipped_indexes.json (5c967e7)
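The resume behavior described in these entries (save the run settings to a JSON file, then skip batch files that already exist) can be sketched as follows. This is a minimal standard-library sketch, not the Neuronpedia runner itself: the function run_batches, the file names run_settings.json and batch-N.json, and the settings key are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_batches(out_dir, n_batches, settings, make_batch):
    """Write one JSON file per batch, skipping batches that already exist."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    settings_path = out_dir / "run_settings.json"  # hypothetical file name
    if settings_path.exists():
        # Resuming: reuse the saved settings so new batches match old ones.
        settings = json.loads(settings_path.read_text())
    else:
        settings_path.write_text(json.dumps(settings))

    written = []
    for i in range(n_batches):
        batch_path = out_dir / f"batch-{i}.json"  # hypothetical naming scheme
        if batch_path.exists():
            continue  # produced by an earlier run
        batch_path.write_text(json.dumps(make_batch(i, settings)))
        written.append(i)
    return written


def make_batch(i, settings):
    """Stand-in for the real per-batch computation."""
    return {"batch": i, "n_features": settings["n_features_per_batch"]}


with tempfile.TemporaryDirectory() as tmp:
    first = run_batches(tmp, 2, {"n_features_per_batch": 4}, make_batch)
    # A later call with different settings only writes the missing batch,
    # and uses the settings saved by the first call.
    second = run_batches(tmp, 3, {"n_features_per_batch": 8}, make_batch)
    last_batch = json.loads((Path(tmp) / "batch-2.json").read_text())
```

Reading the settings back from disk on resume is what prevents a restarted run from mixing batches produced under different parameters.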