Guidance for AI coding agents contributing to TransformerLens.
This file is the single source of truth. Vendor-specific files (CLAUDE.md, .github/copilot-instructions.md, .cursor/rules/transformerlens.mdc) and sub-folder AGENTS.md files all defer here.
Just want to use the library? Start with the README. Just want to run the tests? Skip to §3 Quickstart. Contributing? Read on.

- Use `uv`, not `pip` or `poetry` (`uv sync`).
- Source `.env` (`set -a; source .env; set +a`) before any HF-Hub command.
- Base PRs on `dev`, not `main`. Never name a branch `main` or `dev`.
- Mirror HookedTransformer → TransformerBridge when behaviour exists in both (§2).
- Run `make format` + `uv run mypy .` before push; there is no pre-commit hook.
- Never add `# type: ignore` (§10).
- Never dismiss a failing test as "pre-existing" (§10).

Sub-folder rules: tests/AGENTS.md · supported_architectures/AGENTS.md · tools/model_registry/AGENTS.md.
TransformerLens — mechanistic-interpretability library. Loads 9,000+ models across 50+ architecture families (see supported_models.json) and exposes internal activations through a hook system for caching, editing, and ablating intermediate state. Built on HuggingFace transformers.

| System | Status | Lives in | Numerics | Registry |
|---|---|---|---|---|
| `TransformerBridge` | v3, default for new work | `transformer_lens/model_bridge/` | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent | `transformer_lens/tools/model_registry/data/supported_models.json` |
| `HookedTransformer` | Legacy, maintenance mode, deprecated in 3.0 | `transformer_lens/HookedTransformer.py` + `transformer_lens/components/` | Folds LayerNorm + centres weights → does NOT match HF | `transformer_lens/supported_models.py` (HT-only) |

⚠ The HookedTransformer acceptance suite is quarantined (test_hooked_transformer.py, test_hooked_encoder.py, test_hooked_encoder_decoder.py; see QUARANTINES.md). HT changes land untested at the acceptance level — extra manual care required.
Bridge architecture-adapter pattern: each HF architecture has one file in supported_architectures/ mapping HF module paths to canonical names. Bridge hooks are architecture-native (e.g. blocks.{i}.hook_out); HT-style aliases live in bridge.py.
Mirroring rule: if you change HookedTransformer behaviour that has a TransformerBridge counterpart, update both in the same PR. supported_models.py is HT-only — Bridge-only models go in the Bridge registry data file.

```bash
# Bootstrap uv if missing
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install: uv only (not pip, not poetry)
uv sync
source .venv/bin/activate

# First-time only: create .env from the template, then fill in HF_TOKEN
cp .env.example .env

# Source HF token before any HF-Hub-hitting command
set -a; source .env; set +a

# Tests
make unit-test          # fast, no model loads
make integration-test   # cross-component
make acceptance-test    # end-to-end
make docstring-test     # doctest + doctest-plus
make notebook-test      # slow; subset run in CI
make test-pr            # unit + docstring + acceptance + integration (PR-review surface)
make test               # everything (long; includes benchmarks + notebooks)

# Format + typecheck: no pre-commit hook; run manually before push
make format             # pycln + isort + black
make check-format       # CI-equivalent check
uv run mypy .

# Docs
uv run docs-hot-reload  # live preview
uv run build-docs       # build to docs/build/
```

Python: >=3.10, <4.0. CI tests 3.10, 3.11, 3.12. Format/type/docstring checks run on 3.12.

Windows: use WSL2 — the bash invocations above (source, set -a; source .env; set +a, heredocs in slash commands) don't translate to PowerShell.
| Path | What's there |
|---|---|
| transformer_lens/ | Core package |
| transformer_lens/HookedTransformer.py | Legacy HookedTransformer API |
| transformer_lens/HookedEncoder.py, HookedEncoderDecoder.py, HookedAudioEncoder.py | Encoder-only / seq2seq / audio variants |
| transformer_lens/model_bridge/ | TransformerBridge system |
| transformer_lens/model_bridge/supported_architectures/ | One adapter file per HF architecture |
| transformer_lens/model_bridge/generalized_components/ | Bridge-side reusable components |
| transformer_lens/components/ | HT-side components (attention, MLP, LN, embed) |
| transformer_lens/factories/ | architecture_adapter_factory.py, mlp_factory.py, activation_function_factory.py |
| transformer_lens/config/ | HookedTransformerConfig and TransformerBridgeConfig |
| transformer_lens/utilities/ | Device management, weight processing, HF utilities |
| transformer_lens/hook_points.py | HookPoint class and LensHandle |
| transformer_lens/supported_models.py | HT-only registry (OFFICIAL_MODEL_NAMES, MODEL_ALIASES) |
| transformer_lens/tools/model_registry/ | Bridge-side registry + verify_models.py benchmark suite |
| transformer_lens/patching.py, evals.py | Activation patching, IOI, ROME, etc. |
| tests/unit/, tests/integration/, tests/acceptance/, tests/benchmarks/, tests/mps/ | Test tiers |
| demos/ | Jupyter notebooks; a subset runs in CI under nbval with sanitization from demos/doc_sanitize.cfg |
| docs/source/content/ | Sphinx markdown sources |
| docs/source/content/adapter_development/ | Adapter-authoring guides — read these before adding a new architecture |
| makefile | Canonical test/format/docs targets |
| pyproject.toml | Deps, pytest / mypy / format / build config |
| .github/workflows/checks.yml | CI gates |
| .github/PULL_REQUEST_TEMPLATE.md | PR template + base-branch rules |

- HT canonical (uniform across architectures): `hook_embed`, `blocks.{i}.hook_resid_pre`, `blocks.{i}.attn.hook_q`, `blocks.{i}.hook_resid_post`.
- Bridge-native (architecture-shaped): `blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`. HT aliases are registered via `build_alias_to_canonical_map()` in bridge.py.

Prefer Bridge-native names in new code. Raw-HF-forward drivers comparing against boot_transformers must match its load configuration (fp32, eager attention) and probe for optional features like resid_mid rather than assume.
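
The `{i}` placeholder in the hook names above is a per-layer index. As a rough illustration using only Python's `str.format`, the sketch below expands a name template across layers and probes for an optional hook instead of assuming it exists. `expand_hook_names`, `present_hooks`, and the hard-coded `available` set are hypothetical stand-ins, not TransformerLens APIs; in real code the set of available names would come from the loaded model.

```python
from typing import Iterable


def expand_hook_names(template: str, n_layers: int) -> list[str]:
    """Fill the {i} layer placeholder for every layer index."""
    return [template.format(i=i) for i in range(n_layers)]


def present_hooks(candidates: Iterable[str], available: set[str]) -> list[str]:
    """Keep only the candidate hook names that are actually exposed."""
    return [name for name in candidates if name in available]


# Hypothetical set of exposed hook names, hard-coded for illustration only.
available = {
    "blocks.0.hook_resid_pre",
    "blocks.1.hook_resid_pre",
    "blocks.0.hook_resid_post",
    "blocks.1.hook_resid_post",
}

resid_pre = expand_hook_names("blocks.{i}.hook_resid_pre", n_layers=2)

# Probe for the optional resid_mid hook rather than assuming it is there.
resid_mid = present_hooks(
    expand_hook_names("blocks.{i}.hook_resid_mid", n_layers=2), available
)
```

An empty `resid_mid` list here means the comparison should skip that hook point rather than fail on a missing key.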
Adapters are written per architecture family, not per individual model — adding gpt2 registers all GPT-2 variants. Full workflow (starter-adapter table, 4-place registration, common gotchas, anti-patterns): supported_architectures/AGENTS.md. Verification flow: tools/model_registry/AGENTS.md. Claude Code users: invoke /add-model-support <hf_repo>.
When picking solutions to any problem, prioritize by research impact, not implementation ease. A correct, broadly-applicable feature is worth more than a one-off shortcut.
Base branch:

- `dev`: default for all PRs (new features, refactors, docs, most bug fixes).
- `main`: only for bug fixes against the currently-released version. PRs to `main` are made by maintainers; request permission before basing off `main`.

Branch naming:

- Do NOT name your branch `main` or `dev`. These conflict with the canonical branches when maintainers periodically refresh PRs against upstream.
- Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job).
- New branches must track their own remote: run `git push -u origin <branch>` from your branch, not from `main`/`dev`.

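
The naming rules above are simple enough to check mechanically before pushing. The following is a minimal sketch, not part of the repository: `check_branch_name` is a hypothetical helper, and it assumes the docs trigger is a plain case-sensitive substring match on `docs`, which the CI configuration may implement differently.

```python
RESERVED_BRANCHES = {"main", "dev"}


def check_branch_name(name: str) -> dict[str, bool]:
    """Report whether a branch name is allowed and whether it mentions docs."""
    return {
        # Reserved names collide with the canonical upstream branches.
        "allowed": name not in RESERVED_BRANCHES,
        # Assumption: the docs job keys off the substring "docs".
        "triggers_docs_build": "docs" in name,
    }


print(check_branch_name("fix-docs-typos"))
```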
Pre-push checks: no pre-commit hook — make format + uv run mypy . manually.
PR template: .github/PULL_REQUEST_TEMPLATE.md. No conventional-commits enforcement.
Changelog: no per-PR file. Note user-facing changes in your PR description and commit messages — release essays in news/ and GitHub Releases are drafted by a maintainer from those at release time.
From .github/workflows/checks.yml:

| Job | Runs | When it fails, run locally |
|---|---|---|
| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` × py 3.10 / 3.11 / 3.12 | `make unit-test` / `make acceptance-test` / `uv build`. Repro Python-specific failures with `uv python install <ver>` then `uv run --python <ver> pytest …` |
| `mps-checks` | macOS MPS unit + integration + smoke (PRs targeting `main` or pushes to `main` only) | `TRANSFORMERLENS_ALLOW_MPS=1 uv run pytest tests/mps` on a Mac with MPS |
| `format-check` | `make check-format` | `make format` (writes the fix) then commit |
| `type-check` | `uv run mypy .` | `uv run mypy .`, then fix the error; never add `# type: ignore` |
| `docstring-test` | `make docstring-test` | `make docstring-test`; failures are doctest mismatches in `transformer_lens/` |
| `coverage-test` | Full suite + coverage artifact | `make test-pr` covers most of this; full reproduction with `make coverage-report-test` |
| `notebook-checks` | nbval over subset of `demos/*.ipynb`; HF_TOKEN-gated notebooks skip when secret absent | `uv run pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/<notebook>.ipynb` |
| `build-docs` | Sphinx build (push to `main`/`dev` or branch containing `docs`) | `uv run build-docs` (sources `.env` if needed for HF-gated doctest examples) |
| `deploy-docs` | GitHub Pages (push to `main` only) | Maintainer-only; rarely the contributor's fault, usually a `build-docs` artifact issue |

In-progress PR runs cancel on new commit; tag/release runs do not.
Test failures:

- Never dismiss a failing test as "pre-existing"; investigate every failure.
- Never fabricate test counts or claim passes without running. Reviewers re-run.
- Never add `@pytest.mark.skip` / `xfail` / platform-skip to make CI green. Debug the bug.
- If new tests surface pre-existing bugs, fix them; don't punt.

Code quality:

- Never add `# type: ignore`; use `isinstance` / `typing.cast`.
- Comments: terse one-liners; inline comments explain WHY, not WHAT.
- Never reference plan-file details (audit IDs, "see plan section X") in source.

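
To make the `isinstance` / `typing.cast` rule concrete, here is a small sketch of both alternatives to `# type: ignore`. The functions and the plain-dict config with `n_heads` and `model_name` keys are hypothetical examples, not code from the repository.

```python
from typing import cast


def head_count(cfg: dict[str, object]) -> int:
    """Narrow an untyped value with isinstance so mypy can verify the return type."""
    value = cfg.get("n_heads")
    if not isinstance(value, int):
        raise TypeError(f"n_heads must be an int, got {type(value).__name__}")
    return value  # narrowed to int by the isinstance check above


def model_name(cfg: dict[str, object]) -> str:
    """Use cast when an invariant mypy cannot see guarantees the type."""
    # cast does nothing at runtime, so it is only safe where the invariant holds.
    return cast(str, cfg["model_name"])
```

`isinstance` is the safer default because it also checks at runtime; `cast` only informs the type checker.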
Numerical work:
- Never claim drift is "fp noise" without empirical evidence — bugs and accumulated rounding look identical at noise scale.
- Never run model verification or benchmarks in parallel — a single CUDA/MPS device OOMs.
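
One way to gather the empirical evidence the first rule asks for is to record how large the drift is and where it peaks, rather than eyeballing two tensors. The sketch below works on plain Python floats so it runs anywhere; `drift_report` is a hypothetical helper, and it deliberately sets no pass/fail threshold because the right tolerance depends on dtype and model.

```python
import math
from typing import Sequence


def drift_report(
    reference: Sequence[float], candidate: Sequence[float]
) -> dict[str, float]:
    """Summarise elementwise drift between two equal-length float sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    abs_diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs = max(abs_diffs)
    worst = abs_diffs.index(max_abs)  # first position of the largest difference
    return {
        "max_abs": max_abs,
        "mean_abs": math.fsum(abs_diffs) / len(abs_diffs),
        "worst_index": worst,
        # Relative error at the worst position, guarded against division by zero.
        "rel_at_worst": max_abs / max(abs(reference[worst]), 1e-12),
    }


report = drift_report([1.0, 2.0, 4.0], [1.0, 2.5, 4.0])
```

Drift concentrated at one index or one layer points toward a bug; drift spread evenly at a tiny scale is more consistent with rounding, but that still needs checking against a higher-precision run.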
Environment:

- `uv` only, never `pip` or `poetry`. Run via `uv run <cmd>` or `make`.
- Source `.env` before any HF-Hub-hitting command. `HF_TOKEN` is required for Llama, Mistral, Gemma, and gated Qwen.
- No pre-commit hook: run `make format` + `uv run mypy .` manually before push.

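
A script that hits the HF Hub can fail fast when `.env` was not sourced, instead of failing partway through a gated download. This is a minimal sketch under that assumption; `require_hf_token` is a hypothetical helper, not an existing TransformerLens function.

```python
import os
from typing import Mapping, Optional


def require_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return HF_TOKEN from the environment, or raise with a hint on how to set it."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; run `set -a; source .env; set +a` first"
        )
    return token
```

Passing a mapping explicitly keeps the helper testable without touching the real environment.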
Before declaring a task complete:
| Task type | Must do |
|---|---|
| Bug fix | Reproduce in a test, fix, confirm test passes, make format, uv run mypy . |
| New adapter | Adapter file + 4-place registration + integration test asserting HF logit parity + verify_models run (see supported_architectures/AGENTS.md) |
| Docs change | uv run build-docs succeeds; branch name contains docs so the docs job triggers in CI |
| Notebook change | pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/<notebook>.ipynb passes locally |
| Anything else | make format + uv run mypy . + make test-pr clean before push |
Claude Code: /task-complete automates the last row. See §15 Workflow shortcuts.

- docs/source/content/migrating_to_v3.md: HT → Bridge migration recipes
- docs/source/content/adapter_development/: adapter authoring deep dive
- docs/source/content/compatibility_mode.md: when to call `bridge.enable_compatibility_mode()`, what each flag does, four-quadrant test matrix
- docs/source/content/debugging_numerical_divergence.md: bisection workflow for HT-vs-Bridge / Bridge-vs-HF logit drift
- tests/QUARANTINES.md: inventory of every `skip`/`xfail` and when each can be un-skipped

Gitignored, first-class (entries checked in):

- `transformer_lens/scratch.py`: one-off bisection scripts, ad-hoc imports.
- `.adapter-workspace/`: adapter WIP (notes, dumps, repros).

Load-bearing pins live in pyproject.toml:

| Pin | Where | Why it matters |
|---|---|---|
| `transformers>=5.4.0` | `[project] dependencies` | The Bridge adapter contract is written against HF module layouts; every minor HF release can break adapter component-mappings. Bumping is a real test pass. |
| `torch>=2.6` | `[project] dependencies` | Hook system relies on PyTorch's forward / backward hook semantics; major torch bumps occasionally change ordering. |
| `accelerate>=0.23.0` | `[project] dependencies` | Required for Llama-family loading. |
| `numpy>=1.24` / `>=1.26` | `[project] dependencies` (python-version-conditional) | Doctest float formatting can drift across NumPy versions. |
| `isort==5.8.0` | `[dependency-groups] dev` (exact) | Format check pins to exactly this version; a bump flips the formatting of every file. |

Bumping upstream pins:

- Bump in `pyproject.toml`, then `uv lock`.
- Run `make test-pr`. Expect real breakages; don't pin around them.
- Run `uv run python -m transformer_lens.tools.model_registry.verify_models --architectures <canonical>` to catch numerical regressions.
- Land in a focused PR. Pin bumps are reviewed separately (wide blast radius).

HF specifically: a transformers minor bump that adds or renames an architecture impacts _HF_PASSTHROUGH_ATTRS in _bridge_builder.py and SUPPORTED_ARCHITECTURES in architecture_adapter_factory.py.
Claude Code users have slash commands in .claude/commands/ that wrap common workflows. Non-Claude agents (Cursor, Codex, Copilot, Aider, etc.) can run the manual equivalents:

| Claude shortcut | Manual equivalent | Reference |
|---|---|---|
| `/test-unit` | `make unit-test` | §3 Quickstart |
| `/test-all` | `make test` (long) | §3 Quickstart |
| `/format` | `make format && uv run mypy .` | §3 Quickstart |
| `/typecheck` | `uv run mypy .` | §3 Quickstart |
| `/build-docs` | `set -a; source .env; set +a && uv run build-docs` | §3 Quickstart |
| `/verify-model <repo>` | `set -a; source .env; set +a` then `uv run python -m transformer_lens.tools.model_registry.verify_models --model <repo> --dry-run` first, confirm, then drop `--dry-run` | tools/model_registry/AGENTS.md |
| `/add-model-support <hf_repo>` | Follow the 4-way branch + adapter-authoring workflow | supported_architectures/AGENTS.md |
| `/task-complete` | `make format && uv run mypy . && make test-pr` (loop until clean) | §11 Done checklist |

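
For agents that prefer to drive the `/task-complete` equivalent from a script, the sketch below runs the three commands from the last row one at a time and reports the first that fails. `first_failure` and the injectable `runner` are hypothetical conveniences rather than repository code, and the sketch assumes each command signals failure through a non-zero exit code.

```python
import subprocess
from typing import Callable, Optional, Sequence

Command = Sequence[str]

# The manual equivalent of /task-complete, in order.
CHECKS: list[Command] = [
    ["make", "format"],
    ["uv", "run", "mypy", "."],
    ["make", "test-pr"],
]


def _run(cmd: Command) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def first_failure(
    checks: Sequence[Command] = CHECKS,
    runner: Callable[[Command], int] = _run,
) -> Optional[Command]:
    """Run checks sequentially; return the first failing command, or None if all pass."""
    for cmd in checks:
        if runner(cmd) != 0:
            return cmd
    return None
```

Calling `first_failure()` with no arguments runs the real commands; fix whatever it returns and call it again until it returns `None`.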