BlueDot AI Safety Fundamentals Alignment Course Project

This project was completed for the summer 2024 cohort of the BlueDot AI Safety Fundamentals Alignment Course.

Introduction to Mechanistic Interpretability

N. Nanda, L. Chan, T. Lieberum, J. Smith, and J. Steinhardt, ‘Progress measures for grokking via mechanistic interpretability’. arXiv, Oct. 19, 2023. doi: 10.48550/arXiv.2301.05217.
