Trying to train on SAE failure #38

@Noam-Diamant

Hi! When I tried to train an SAE with your repo, I ran into a few problems and would appreciate some help. When I first ran the code attached below, I got the following error message:

[Image: screenshot of the first error message]

After I added the parameter steps=100 to the trainSAE call, I got a different error:

[Image: screenshot of the second error message]

The code I used (taken from the "training your own SAE" section of your README):

!pip install dictionary-learning
from nnsight import LanguageModel
from dictionary_learning import ActivationBuffer, AutoEncoder
from dictionary_learning.trainers import StandardTrainer
from dictionary_learning.training import trainSAE

device = "cuda:0"
model_name = "EleutherAI/pythia-70m-deduped"  # can be any Huggingface model

model = LanguageModel(
    model_name,
    device_map=device,
)
submodule = model.gpt_neox.layers[1].mlp  # layer 1 MLP
activation_dim = 512  # output dimension of the MLP
dictionary_size = 16 * activation_dim

data = iter(
    [
        "This is some example data",
        "In real life, for training a dictionary",
        "you would need much more data than this",
    ]
)
buffer = ActivationBuffer(
    data=data,
    model=model,
    submodule=submodule,
    d_submodule=activation_dim,  # output dimension of the model component
    n_ctxs=3e4,  # you can set this higher or lower depending on your available memory
    device='cpu',
)  # buffer will yield batches of tensors of dimension = submodule's output dimension

trainer_cfg = {
    "trainer": StandardTrainer,
    "dict_class": AutoEncoder,
    "activation_dim": activation_dim,
    "dict_size": dictionary_size,
    "lr": 1e-3,
    "device": device,
}

ae = trainSAE(
    data=buffer,  # you could also use another iterable (e.g. a PyTorch DataLoader) here instead of buffer
    trainer_configs=[trainer_cfg],
    steps=100,
)
