This is a fork of Joseph Bloom's SAE training codebase, which we use to train sparse autoencoders (SAEs) on OthelloGPT. To set up the environment, run:
conda create --name mats_sae_training python=3.11 -y
conda activate mats_sae_training
pip install -r requirements.txt
If `conda activate mats_sae_training` doesn't work, try `source activate mats_sae_training`.
- othellogpt_train_sae.ipynb: notebook to train SAEs on OthelloGPT
- othellogpt_probe_analysis.ipynb: notebook to compare SAE encoder/decoder directions with probe directions
- othellogpt_interp.ipynb
- othellogpt_board_analysis.ipynb
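
As a rough illustration of what comparing SAE directions with probe directions can look like, here is a minimal sketch using cosine similarity. It is not the code from the notebooks: the function names, the toy vectors, and the choice of cosine similarity as the metric are all assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def best_matching_feature(sae_directions, probe_direction):
    """Return (index, similarity) of the SAE direction most aligned with the probe."""
    sims = [cosine_similarity(d, probe_direction) for d in sae_directions]
    best = max(range(len(sims)), key=lambda i: sims[i])
    return best, sims[best]


# Toy stand-ins (hypothetical): three SAE decoder directions and one probe
# direction in a 3-dimensional residual stream.
sae_directions = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 0.0],
]
probe_direction = [0.0, 3.0, 0.0]

index, similarity = best_matching_feature(sae_directions, probe_direction)
print(f"best feature: {index}, cosine similarity: {similarity:.3f}")
```

In practice the directions would be rows of the trained SAE's encoder or decoder weights and of a linear probe's weights, and a tensor library would replace the list arithmetic.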