Releases: saprmarks/dictionary_learning
v0.1.0 (2025-02-12)
Feature
- feat: pypi packaging and auto-release with semantic release (0ff8888)
Unknown
- Merge pull request #37 from chanind/pypi-package: feat: pypi packaging and auto-release with semantic release (a711efe)
- simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
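The split-vs-indexing change lends itself to a small illustration. A minimal sketch under assumed names and sizes, not the repository's actual code: torch.split() produces the same per-group views as repeated slicing, in a single call.

```python
import torch

group_sizes = [1024, 1024, 2048]        # hypothetical nested group sizes
f = torch.randn(32, sum(group_sizes))   # batch of feature activations

# Direct indexing: recompute slice bounds and index once per group.
chunks_slow = [
    f[:, sum(group_sizes[:i]) : sum(group_sizes[: i + 1])]
    for i in range(len(group_sizes))
]

# torch.split(): one call returning the same views.
chunks_fast = f.split(group_sizes, dim=1)

assert all(torch.equal(a, b) for a, b in zip(chunks_slow, chunks_fast))
```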
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
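A minimal sketch of the detach fix, with hypothetical tensor names: the auxiliary loss is computed against the residual of a detached reconstruction, so its gradients flow only through the auxiliary path and cannot distort the main reconstruction.

```python
import torch

x = torch.randn(8, 512)                              # input activations
recon = torch.randn(8, 512, requires_grad=True)      # stand-in for the SAE reconstruction
aux_recon = torch.randn(8, 512, requires_grad=True)  # reconstruction from dead features

# Detach the main reconstruction before forming the residual,
# so aux_loss backpropagates only through aux_recon.
residual = x - recon.detach()
aux_loss = (aux_recon - residual).pow(2).mean()
```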
- Merge pull request #36 from saprmarks/aux_loss_fixes: Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
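A sketch of the standardized constraint, assuming the decoder is an nn.Linear mapping dict_size to activation_dim (so each dictionary feature is a column of its weight): after the optimizer step, each column is rescaled to unit L2 norm.

```python
import torch
import torch.nn as nn

decoder = nn.Linear(4096, 512, bias=False)  # hypothetical dict_size -> activation_dim
opt = torch.optim.Adam(decoder.parameters(), lr=3e-4)

loss = decoder(torch.randn(8, 4096)).pow(2).mean()  # stand-in loss
loss.backward()
opt.step()

# Renormalize each feature's decoder direction to unit norm after the step.
with torch.no_grad():
    norms = decoder.weight.norm(dim=0, keepdim=True)  # one norm per dictionary column
    decoder.weight.div_(norms + 1e-8)
```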
- Remove experimental matryoshka temperature (6c2fcfc)
- Make sure x has the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matryoshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
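A sketch of how autocast typically wraps such a loop, assuming CUDA and bfloat16, with a hypothetical model and data: the forward pass runs under autocast while backward and the optimizer step stay outside, and per the companion commit above, statistics such as threshold tracking would be kept out of the autocast region.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(512, 512).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x = torch.randn(32, 512, device=device)

# Forward in mixed precision; backward and step in full precision.
with torch.autocast(device_type=device, dtype=torch.bfloat16, enabled=(device == "cuda")):
    loss = (model(x) - x).pow(2).mean()
loss.backward()
opt.step()
opt.zero_grad()
```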
- Save state dicts to cpu (3c5a5cd)
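A minimal sketch of that pattern, with a hypothetical model: tensors are moved to CPU before saving, so checkpoints load on machines without the training GPU and don't hold device memory.

```python
import torch

model = torch.nn.Linear(512, 4096)  # stand-in for an SAE
state = {k: v.cpu() for k, v in model.state_dict().items()}
torch.save(state, "checkpoint.pt")
```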
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup: Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka: Add Matryoshka, Fix Jump ReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
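A sketch of that initialization scheme, with hypothetical shapes: Kaiming-initialize the decoder, then tie the encoder to its transpose, as several SAE papers prescribe (compare the later "initialize W_dec to W_enc.T" commit).

```python
import torch.nn as nn

activation_dim, dict_size = 512, 4096  # hypothetical sizes
encoder = nn.Linear(activation_dim, dict_size)
decoder = nn.Linear(dict_size, activation_dim, bias=False)

# Kaiming init on the decoder; encoder starts as its transpose.
nn.init.kaiming_uniform_(decoder.weight)
encoder.weight.data = decoder.weight.data.T.clone()
```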
- Format with ruff (3e31571)
- Add temperature scaling to matryoshka (ceabbc5)
- norm the correct decoder dimension (5383603)
- Fix loading matryoshkas from_pretrained() (764d4ac)
- Initial matryoshka implementation (8ade55b)
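A minimal sketch of the matryoshka idea, not the trainer's actual code: reconstruct the input from nested prefixes of the dictionary and sum the per-prefix losses, so each smaller nested dictionary remains a usable SAE on its own. All names and sizes are hypothetical.

```python
import torch

def matryoshka_recon_loss(x, f, W_dec, b_dec, group_sizes):
    """x: (B, d) inputs, f: (B, D) feature activations, W_dec: (D, d)."""
    total, prefix = 0.0, 0
    for g in group_sizes:
        prefix += g
        x_hat = f[:, :prefix] @ W_dec[:prefix] + b_dec  # decode from a nested prefix
        total = total + (x - x_hat).pow(2).mean()
    return total

B, d, D = 16, 512, 4096
loss = matryoshka_recon_loss(
    torch.randn(B, d), torch.relu(torch.randn(B, D)),
    torch.randn(D, d), torch.zeros(d), group_sizes=[1024, 1024, 2048],
)
```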
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling: Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
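One way such a standardized schedule is commonly expressed, sketched with a hypothetical warmup length: a LambdaLR that linearly ramps the learning rate and is stepped once per optimizer step (the "Make sure we step the learning rate scheduler" fix above).

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.Adam([param], lr=3e-4)
warmup_steps = 1000  # hypothetical

# Linear warmup from 0 to the base LR, then constant.
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps)
)

for step in range(5):  # inside the training loop:
    opt.step()
    sched.step()
```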
- Merge pull request #32 from saprmarks/add_sparsity_warmup: Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
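A sketch of the warmup shape, with a hypothetical schedule length: the sparsity coefficient ramps linearly from 0, so early training optimizes reconstruction before the penalty bites.

```python
def sparsity_scale(step: int, warmup_steps: int = 2000) -> float:
    """Linearly ramp the sparsity penalty multiplier from 0 to 1."""
    return min(1.0, (step + 1) / warmup_steps)

# Hypothetical usage inside a trainer's loss:
# loss = recon_loss + sparsity_scale(step) * l1_coeff * f.abs().sum(-1).mean()
```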
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo: Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
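A sketch of one common normalization convention, assumed here with stand-in data: estimate a single scale so activations have mean L2 norm sqrt(d), and keep that scale to map trained biases and thresholds back to raw activation space, as the bias-scaling commits above do.

```python
import math
import torch

acts = torch.randn(10_000, 512)  # stand-in for buffered model activations
d = acts.shape[1]

# One global scale so E[||x||] == sqrt(d); retain it to undo the scaling later.
scale = math.sqrt(d) / acts.norm(dim=1).mean()
acts_normalized = acts * scale
```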
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
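A minimal sketch of threshold tracking, with hypothetical names and decay: keep an exponential moving average of the k-th largest activation seen during TopK training, so inference can apply a fixed threshold instead of a per-batch top-k.

```python
import torch

threshold, decay = None, 0.99  # EMA state (hypothetical decay)

def update_threshold(f: torch.Tensor, k: int):
    """f: (B, D) feature pre-activations; track each row's k-th largest value."""
    global threshold
    with torch.no_grad():  # full precision, outside autocast
        kth = f.topk(k, dim=-1).values[..., -1].mean()
        threshold = kth if threshold is None else decay * threshold + (1 - decay) * kth

update_threshold(torch.randn(32, 4096), k=64)
```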
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests: Add end to end test...