This repository supports different sparse autoencoder architectures, including standard, Gated, Top-K, Batch Top-K, JumpReLU, and Matryoshka SAEs.
Each sparse autoencoder architecture is implemented with a corresponding trainer that implements the training protocol described by the authors.
This allows us to implement different training protocols (e.g. p-annealing) for different architectures without a lot of overhead.
Specifically, this repository supports the following trainers (a minimal usage sketch follows the list):
- [`StandardTrainer`](dictionary_learning/trainers/standard.py): Implements a training scheme similar to that of [Bricken et al., 2023](https://transformer-circuits.pub/2023/monosemantic-features/index.html#appendix-autoencoder).
- [`GatedSAETrainer`](dictionary_learning/trainers/gdm.py): Implements the training scheme for Gated SAEs described in [Rajamanoharan et al., 2024](https://arxiv.org/abs/2404.16014).
- [`TopKSAETrainer`](dictionary_learning/trainers/top_k.py): Implements the training scheme for Top-K SAEs described in [Gao et al., 2024](https://arxiv.org/abs/2406.04093).
- [`BatchTopKSAETrainer`](dictionary_learning/trainers/batch_top_k.py): Implements the training scheme for Batch Top-K SAEs described in [Bussmann et al., 2024](https://arxiv.org/abs/2412.06410).
- [`JumpReluTrainer`](dictionary_learning/trainers/jumprelu.py): Implements the training scheme for JumpReLU SAEs described in [Rajamanoharan et al., 2024](https://arxiv.org/abs/2407.14435).
- [`PAnnealTrainer`](dictionary_learning/trainers/p_anneal.py): Extends the `StandardTrainer` with the option to anneal the sparsity parameter p.
- [`GatedAnnealTrainer`](dictionary_learning/trainers/gated_anneal.py): Extends the `GatedSAETrainer` with the option for p-annealing, similar to `PAnnealTrainer`.
- [`MatryoshkaBatchTopKTrainer`](dictionary_learning/trainers/matryoshka_batch_top_k.py): Extends the `BatchTopKSAETrainer` with Matryoshka-style prefix losses, enabling hierarchical feature learning within a Top-K sparse autoencoder framework.
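
For orientation, here is a minimal sketch of how a trainer is typically wired up. The `trainSAE` entry point in `training.py` and the exact config keys shown here (`trainer`, `dict_class`, `activation_dim`, `dict_size`, `lr`, `device`) are assumptions based on this repository's conventions; consult `training.py` and the trainer classes for the authoritative signatures.

```python
import torch

from dictionary_learning import AutoEncoder
from dictionary_learning.trainers import StandardTrainer
from dictionary_learning.training import trainSAE

activation_dim = 512                    # dimension of the activations fed to the SAE
dictionary_size = 16 * activation_dim   # number of dictionary features (expansion factor 16)

# Stand-in data source for illustration: any iterable yielding
# [batch, activation_dim] tensors works; in practice this would be
# an ActivationBuffer (described below).
def random_activations(n_batches=1000, batch_size=256):
    for _ in range(n_batches):
        yield torch.randn(batch_size, activation_dim)

trainer_cfg = {
    "trainer": StandardTrainer,   # which training protocol to use
    "dict_class": AutoEncoder,    # which SAE architecture to train
    "activation_dim": activation_dim,
    "dict_size": dictionary_size,
    "lr": 1e-3,
    "device": "cuda:0",
}

# trainSAE takes a list of configs, so several trainers/architectures
# can be trained on the same activation stream in one pass.
ae = trainSAE(
    data=random_activations(),
    trainer_configs=[trainer_cfg],
)
```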
Another key object is the `ActivationBuffer`, defined in `buffer.py`. Following [Neel Nanda's approach](https://www.lesswrong.com/posts/fKuugaxt2XLTkASkk/open-source-replication-and-commentary-on-anthropic-s), an `ActivationBuffer` maintains a buffer of neural network (NN) activations, which it outputs in batches.
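
To make the buffering pattern concrete, here is a self-contained toy version of the idea. This is illustrative only, not the actual `ActivationBuffer` API; the real constructor and its arguments live in `buffer.py`.

```python
import torch

class ToyActivationBuffer:
    """Toy illustration of the buffering pattern: keep a large pool of
    activations from a streaming source, refill it when it runs low,
    and hand out shuffled batches."""

    def __init__(self, activation_source, buffer_size=8192, out_batch_size=256):
        self.source = activation_source   # iterator yielding [n, d] activation tensors
        self.buffer_size = buffer_size    # target number of activations to hold
        self.out_batch_size = out_batch_size
        self.pool = torch.empty(0)        # empty until the first refresh

    def _refresh(self):
        # Top the pool back up to buffer_size activations, then shuffle
        # so that consecutive output batches are decorrelated.
        chunks = [self.pool] if self.pool.numel() else []
        n = self.pool.shape[0] if self.pool.numel() else 0
        while n < self.buffer_size:
            acts = next(self.source)      # raises StopIteration when the source is done
            chunks.append(acts)
            n += acts.shape[0]
        pool = torch.cat(chunks)
        self.pool = pool[torch.randperm(pool.shape[0])]

    def __iter__(self):
        return self

    def __next__(self):
        if self.pool.numel() == 0 or self.pool.shape[0] < self.out_batch_size:
            self._refresh()
        batch = self.pool[: self.out_batch_size]
        self.pool = self.pool[self.out_batch_size:]
        return batch
```

The refill-then-shuffle step is the point of the pattern: consecutive activations from the same context are highly correlated, and mixing a large pool before batching decorrelates the examples the SAE trains on.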