08 May 15:47

github-actions

eecf839

v1.7.0

v1.7.0 (2024-05-08)

Feature

feat: Add torch compile (#129)
Surface # of eval batches and # of eval sequences
fix formatting
config changes
add compilation to lm_runner.py
remove accidental print statement
formatting fix (5c41336)
feat: Change eval batch size (#128)
Surface # of eval batches and # of eval sequences
fix formatting
fix print statement accidentally left in (758a50b)

07 May 16:45

github-actions

v1.6.1

ad9418c

v1.6.1 (2024-05-07)

Fix

fix: Revert "feat: Add kl eval (#124)" (#127)

This reverts commit c1d9cbe. (1a0619c)

07 May 14:11

github-actions

v1.6.0

7264f99

v1.6.0 (2024-05-07)

Feature

feat: Add bf16 autocast (#126)
add bf16 autocast and gradient scaling
simplify autocast setup
remove completed TODO
add autocast dtype selection (generally keep bf16)
formatting fix
remove autocast dtype (8e28bfb)

07 May 12:00

github-actions

v1.5.0

67660ec

v1.5.0 (2024-05-07)

Feature

feat: Add kl eval (#124)
add kl divergence to evals.py
fix linter (c1d9cbe)

Unknown

major: How we train saes replication (#123)
l1 scheduler, clip grad norm
add provisional ability to normalize activations
notebook
change heuristic norm init to constant, report b_e and W_dec norms (fix tests later)
fix mse calculation
add benchmark test
update heuristic init to 0.1
make tests pass device issue
continue rebase
use better args in benchmark
remove stack in get activations
broken! improve CA runner
get cache activation runner working and add some tests
add training steps to path
avoid ghost grad tensor casting
enable download of full dataset if desired
add benchmark for cache activation runner
add updated tutorial
format

Co-authored-by: Johnny Lin (5f46329)

05 May 00:47

github-actions

v1.4.0

76d6dbc

v1.4.0 (2024-05-05)

Feature

feat: Store state to allow resuming a run (#106)
first pass of saving
added runner resume code
added auto detect most recent checkpoint code
make linter happy (and one small bug)
black code formatting
isort
help pyright
black reformatting:
activations store flake
pyright typing
black code formatting
added test for saving and loading
bigger training set
black code
move to pickle
use pickle because safetensors doesn't support all the stuff needed for optimizer and scheduler state
added resume test
added wandb_id for resuming
use wandb id for checkpoint
moved loaded to device and minor fixes to resuming

Co-authored-by: David Chanin (4d12e7a)
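
The save-and-resume flow listed above (pickle the state, then auto-detect the most recent checkpoint) can be sketched roughly as follows. This is a minimal illustration, not the SAELens implementation: the `step_<n>.pkl` naming scheme, the helper names, and the contents of the state dict are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


def save_checkpoint(ckpt_dir: Path, step: int, state: dict[str, Any]) -> Path:
    """Pickle the full training state under a step-numbered filename."""
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    path = ckpt_dir / f"step_{step}.pkl"  # hypothetical naming scheme
    with path.open("wb") as f:
        pickle.dump(state, f)
    return path


def latest_checkpoint(ckpt_dir: Path) -> Optional[Path]:
    """Return the checkpoint with the highest step number, or None."""
    candidates = list(ckpt_dir.glob("step_*.pkl"))
    if not candidates:
        return None
    # Compare numerically so that step_1000 ranks after step_200.
    return max(candidates, key=lambda p: int(p.stem.split("_")[1]))


def load_checkpoint(path: Path) -> dict[str, Any]:
    """Load a pickled state dict. Only use this on trusted files."""
    with path.open("rb") as f:
        return pickle.load(f)


def demo() -> dict[str, Any]:
    with tempfile.TemporaryDirectory() as tmp:
        ckpt_dir = Path(tmp)
        for step in (200, 1000):
            # Hypothetical state: plain dicts stand in for optimizer and
            # scheduler state, which are nested Python objects in practice.
            state = {
                "step": step,
                "optimizer": {"lr": 1e-3, "momentum": [0.0, 0.1]},
                "scheduler": {"last_epoch": step},
                "wandb_id": "example-run",
            }
            save_checkpoint(ckpt_dir, step, state)
        newest = latest_checkpoint(ckpt_dir)
        assert newest is not None
        return load_checkpoint(newest)


resumed = demo()
print(resumed["step"])
```

Pickle can serialize arbitrary nested Python objects, which is the property the commit message cites as the reason for choosing it over safetensors. The trade-off is that unpickling can execute code, so checkpoints should only be loaded from trusted sources.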

Unknown

Fix: sparsity norm calculated at incorrect dimension. (#119)
Fix: sparsity norm calculated at incorrect dimension.

For L1 this does not affect anything, since it essentially takes abs() and averages over everything. For L2 it is problematic, because L2 involves a sum and a sqrt, so the result depends on which dimension is reduced. Unexpected behavior occurs when x has shape (batch, sen_length, hidden_dim).

Added tests.
Changed sparsity calculation to handle 3d inputs. (ce95fb2)
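
A small worked example may help show why the reduction dimension matters for L2 but not for L1. The sketch below uses plain Python lists and assumes the norm was reduced over a fixed axis index (axis 1). That is one plausible way such a bug can arise and is not taken from the actual SAELens code.

```python
import math

# Toy activations with shape (batch=1, seq_len=2, hidden_dim=3).
x = [[[3.0, 4.0, 0.0],
      [0.0, 0.0, 5.0]]]


def flatten(t):
    """Collapse a 3D nested list into a flat list of values."""
    return [v for b in t for s in b for v in s]


def l1_mean(t):
    """abs() then average over every element: no axis choice is involved."""
    flat = flatten(t)
    return sum(abs(v) for v in flat) / len(flat)


def l2_over_hidden(t):
    """L2 norm of each token's hidden vector (reduce over the last axis)."""
    return [[math.sqrt(sum(v * v for v in s)) for s in b] for b in t]


def l2_over_axis1(t):
    """L2 norm reduced over axis 1. For 2D inputs axis 1 is hidden_dim,
    but for 3D inputs it is the sequence axis, so tokens get mixed."""
    out = []
    for b in t:
        hidden = len(b[0])
        out.append([math.sqrt(sum(s[h] ** 2 for s in b)) for h in range(hidden)])
    return out


print(l1_mean(x))         # 2.0
print(l2_over_hidden(x))  # [[5.0, 5.0]]
print(l2_over_axis1(x))   # [[3.0, 4.0, 5.0]]
```

The L1 average is the same however the elements are grouped. The two L2 results differ in both shape and values: one norm per token in the first case, one norm per hidden unit across tokens in the second.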

03 May 10:11

github-actions

v1.3.0

36919e0

v1.3.0 (2024-05-03)

Feature

feat: add activation bins for neuronpedia outputs, and allow customizing quantiles (#113) (05d650d)
feat: Update for Neuronpedia auto-interp (#112)
cleanup Neuronpedia autointerp code
Fix logic bug with OpenAI key

Co-authored-by: Joseph Bloom (033283d)

feat: SparseAutoencoder.from_pretrained() similar to transformer lens (#111)
add partial work so David can continue
feat: adding a SparseAutoencoder.from_pretrained() function

Co-authored-by: jbloomaus (617d416)

Fix

fix: replace list_files_info with list_repo_tree (#117) (676062c)
fix: Improved activation initialization, fix using argument to pass in API key (#116) (7047bcc)

29 Apr 16:18

github-actions

v1.2.0

1fda458

v1.2.0 (2024-04-29)

Feature

feat: breaks up SAE.forward() into encode() and decode() (#107)
breaks up SAE.forward() into encode() and decode()
cleans up return typing of encode by splitting into a hidden and public function (7b4311b)
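
To illustrate the shape of this change, here is a toy autoencoder in plain Python whose `forward()` is composed from `encode()` and `decode()`. The `ToySAE` class, its weights, and the hidden `_encode_with_pre_acts()` helper are hypothetical. The private/public split shown is only one reading of "hidden and public function", not the SAELens code.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class ToySAE:
    """Hypothetical stand-in for a sparse autoencoder, not the SAELens class."""

    def __init__(self, w_enc: Matrix, b_enc: Vector, w_dec: Matrix, b_dec: Vector):
        self.w_enc, self.b_enc = w_enc, b_enc
        self.w_dec, self.b_dec = w_dec, b_dec

    def _encode_with_pre_acts(self, x: Vector) -> Tuple[Vector, Vector]:
        """Hidden helper: returns both activations and pre-activations."""
        pre = [h + b for h, b in zip(matvec(self.w_enc, x), self.b_enc)]
        acts = [max(0.0, p) for p in pre]  # ReLU
        return acts, pre

    def encode(self, x: Vector) -> Vector:
        """Public function with a single, simple return type."""
        acts, _ = self._encode_with_pre_acts(x)
        return acts

    def decode(self, acts: Vector) -> Vector:
        """Map feature activations back to the input space."""
        return [r + b for r, b in zip(matvec(self.w_dec, acts), self.b_dec)]

    def forward(self, x: Vector) -> Vector:
        """Reconstruction is simply decode(encode(x))."""
        return self.decode(self.encode(x))


sae = ToySAE(
    w_enc=[[1.0, 0.0], [0.0, -1.0]],
    b_enc=[0.0, 0.0],
    w_dec=[[1.0, 0.0], [0.0, -1.0]],
    b_dec=[0.0, 0.0],
)
x = [2.0, 3.0]
features = sae.encode(x)
reconstruction = sae.forward(x)
print(features)  # [2.0, 0.0]
```

Splitting the two steps lets callers inspect or modify feature activations between `encode()` and `decode()` without re-implementing the forward pass.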

29 Apr 11:57

github-actions

v1.1.0

81b3dbd

v1.1.0 (2024-04-29)

Feature

feat: API for generating autointerp + scoring for neuronpedia (#108)
API for generating autointerp for neuronpedia
Undo pytest vscode setting change
Fix autointerp import
Use pypi import for automated-interpretability (7c43c4c)

27 Apr 17:22

github-actions

v1.0.0

6a3f447

v1.0.0 (2024-04-27)

Breaking

chore: empty commit to bump release

BREAKING CHANGE: v1 release (2615a3e)

Chore

chore: fix outdated lr_scheduler_name in docs (#109)
chore: fix outdated lr_scheduler_name in docs
add tutorial hparams (7cba332)

Unknown

BREAKING CHANGE: 1.0.0 release

BREAKING CHANGE: 1.0.0 release (c23098f)

Neuronpedia: allow resuming upload (#102) (0184671)

24 Apr 09:17

github-actions

v0.7.0

bbb6a05

v0.7.0

v0.7.0 (2024-04-24)

Feature

feat: make a neuronpedia list with features via api call (#101) (23e680d)

Unknown

Merge pull request #100 from jbloomAus/np_improvements

Improvements to Neuronpedia Runner (5118f7f)

neuronpedia: save run settings to json file to avoid errors when resuming later. automatically skip batch files that already exist (4b5412b)
skip batch file if it already exists (7d0e396)
neuronpedia: include log sparsity threshold in skipped_indexes.json (5c967e7)
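
The resume behavior described in these commits (persist the run settings as JSON, then skip batch files that already exist) might look roughly like the sketch below. The file names, the `run_batches` helper, and the choice to raise on mismatched settings are assumptions for illustration, not the Neuronpedia runner's actual behavior.

```python
import json
import tempfile
from pathlib import Path


def run_batches(out_dir: Path, settings: dict, n_batches: int) -> list[int]:
    """Write one JSON file per batch, skipping files that already exist.

    Returns the batch indexes actually written during this call.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    settings_path = out_dir / "run_settings.json"  # hypothetical filename
    if settings_path.exists():
        # Resuming: make sure we continue with the settings of the first run.
        saved = json.loads(settings_path.read_text())
        if saved != settings:
            raise ValueError("settings differ from the saved run; refusing to resume")
    else:
        settings_path.write_text(json.dumps(settings, indent=2))

    written = []
    for i in range(n_batches):
        batch_path = out_dir / f"batch-{i}.json"  # hypothetical filename
        if batch_path.exists():
            continue  # already produced by an earlier run
        batch_path.write_text(json.dumps({"batch": i}))
        written.append(i)
    return written


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    settings = {"n_features_per_batch": 128}
    first = run_batches(out, settings, n_batches=2)   # fresh run
    second = run_batches(out, settings, n_batches=4)  # resumed run

print(first, second)  # [0, 1] [2, 3]
```

Storing the settings next to the outputs means a resumed run can be checked against the original configuration before any new batch is written.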

Releases: decoderesearch/SAELens
