This is the official repository for our paper *Interpreting Neurons in Deep Vision Networks with Language Models* (TMLR 2025). For a quick read of our work, please see our project website:
- DnD is a novel method to describe the roles of hidden neurons in vision networks with higher quality than existing neuron-level interpretability tools, establishing a new state-of-the-art.
- DnD is training-free and concept-set-free, provides generative natural language descriptions, and can easily leverage more capable general-purpose models in the future.
- Below we illustrate the pipeline of DnD (left) and the results provided by DnD and other methods with scores (right).
- Install Python (3.11.5)
- Install the remaining requirements using `pip install -r requirements.txt`
- Register an OpenAI API key at https://openai.com/blog/openai-api and replace all instances of `OPENAI_KEY` with your personal key.
- Download the pre-trained BLIP model here and replace all instances of `BLIP_PATH` with the path to the model.
- Download the Broden dataset (images only) using `bash dlbroden.sh`
- (Optional) Download ResNet-18 pretrained on Places-365: `bash dlzoo_example.sh`
- Define the path to your datasets via `DATASET_ROOTS[<dataset_name>]` in `data_utils.py` (see the sketch after this list).
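
For reference, `DATASET_ROOTS` maps a dataset name to its root directory on disk. Below is a minimal sketch of the kind of entry you would add in `data_utils.py`; the paths and the `my_dataset` name are illustrative placeholders, not the repository's actual values:

```python
# data_utils.py -- sketch only; match the structure already present in your copy.
DATASET_ROOTS = {
    "imagenet_val": "/path/to/ImageNet/val",   # your local ImageNet validation split
    "broden": "data/broden1_224/images",       # location used by dlbroden.sh (illustrative)
    "my_dataset": "/path/to/my_dataset",       # hypothetical custom dataset
}
```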
NOTE: Extensive use of this model may require purchasing paid API credits once total usage exceeds $5.00.
We do not include instructions to download ImageNet data; you must specify the correct path to your copy of the ImageNet dataset.
- Define your OpenAI API Key `<OPENAI_KEY>` and the path to the pretrained BLIP model `<BLIP_PATH>` in `describe_neurons.py` (see the sketch after the command below).
- Specify the target neurons to dissect using `--ids_to_check <string of comma-separated ids, no spaces>`
  - Example: `'1,2,3'`
- The code will dissect the specified neurons in Layer 4 of ResNet-50 using ImageNet ∪ Broden as the probing dataset.

```
python describe_neurons.py --ids_to_check <neuron_to_dissect>
```
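
As a rough sketch of filling in the two placeholders (the exact lines in `describe_neurons.py` may look different; the key and path below are dummy values):

```python
# Sketch only: wherever OPENAI_KEY and BLIP_PATH appear, substitute your own values.
OPENAI_KEY = "sk-..."                       # your personal OpenAI API key
BLIP_PATH = "/path/to/blip_checkpoint.pth"  # local path to the downloaded BLIP model
```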
- Implement the code to load your model and its preprocessing in `data_utils.py` under the `get_target_model` function (a sketch is shown after the commands below).
- Dissect your model by running:

```
python describe_neurons.py --ids_to_check <neuron_to_dissect> --target_model <model_name>
```

- Specify the layer to dissect (e.g. `'layer4'`):

```
python describe_neurons.py --ids_to_check <neuron_to_dissect> --target_layer <target_layer>
```
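
Below is a minimal sketch of an added branch in `get_target_model`, assuming the function returns the model together with its preprocessing transform; the `my_resnet` name, checkpoint path, and exact signature are illustrative, not the repository's actual code:

```python
# data_utils.py -- illustrative sketch of adding a custom model to get_target_model.
import torch
import torchvision.transforms as transforms
from torchvision import models

def get_target_model(target_name, device):
    """Return (model, preprocess) for the requested target model (sketch)."""
    if target_name == "my_resnet":  # hypothetical name passed via --target_model
        model = models.resnet50(weights=None)
        state_dict = torch.load("/path/to/my_checkpoint.pth", map_location=device)
        model.load_state_dict(state_dict)
        preprocess = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])
        return model.to(device).eval(), preprocess
    # ... keep the branches for the models the repository already supports ...
    raise ValueError(f"Unknown target model: {target_name}")
```

You could then dissect the new model by passing `--target_model my_resnet` in the command above.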
- Implement the code to load your dataset in `data_utils.py` under the `get_data` function; you will need to include your own path to the dataset (a sketch is shown after the command below).
- Add the name of your dataset to the choices of `--d_probe` in `describe_neurons.py`
- Dissect your model by running:

```
python describe_neurons.py --ids_to_check <neuron_to_dissect> --d_probe <dataset_name>
```
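
A matching sketch for `get_data`, assuming it returns a torchvision-style dataset for a given probing-dataset name and preprocessing transform (the `my_dataset` name, path, and signature are illustrative):

```python
# data_utils.py -- illustrative sketch of adding a custom probing dataset to get_data.
from torchvision import datasets

def get_data(dataset_name, preprocess=None):
    """Return the probing dataset for the given name (sketch)."""
    if dataset_name == "my_dataset":  # hypothetical name passed via --d_probe
        # ImageFolder layout: one subdirectory of images per class under the root.
        return datasets.ImageFolder("/path/to/my_dataset", transform=preprocess)
    # ... keep the branches for the datasets the repository already supports ...
    raise ValueError(f"Unknown probing dataset: {dataset_name}")
```

Remember to also add the new name to the `--d_probe` choices in `describe_neurons.py`, as noted above.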
- Specify the device to use with `--device`; the default device is `cuda`.

```
python describe_neurons.py --ids_to_check <neuron_to_dissect> --device <device>
```
- Results will be saved to the directory passed via `--results_dir <path_to_results_directory>`. If no argument is passed, results are saved to `./experiments/exp_results`.
- Target model activations will be saved to the directory passed via `--saved_acts_dir <path_to_saved_activations_directory>`. If no argument is passed, activations are saved to `./experiments/saved_activations`.
- You may also add `--tag <experiment_tag>` to differentiate between saved result files.
The code for all experiments in our paper can be found under the `experiments` folder. Each notebook corresponds to a table or figure presented in the paper.
- Define your OpenAI API Key in `DnD_models.py` and the path to the pretrained BLIP model when loading BLIP in the notebook.
- Results will be saved into `./experiments/exp_results`
- Download the Tile2Vec ResNet-18 checkpoint here and save the file as `tile2vec-master/models/naip_trained.ckpt`
- Unzip `tiles.zip` in `tile2vec-master/data/` and place the tiles in `tile2vec-master/data/tiles/`
Due to the generative nature of our model, separate executions may yield slightly varying neuron descriptions. Results from reproduced experiments may also differ marginally from those presented in the paper.
- CLIP-Dissect: https://github.com/Trustworthy-ML-Lab/CLIP-dissect
- CLIP: https://github.com/openai/CLIP
- BLIP: https://huggingface.co/Salesforce/blip-image-captioning-base
- GPT-3.5 Turbo: https://platform.openai.com/docs/models/gpt-3-5
- Stable Diffusion: https://huggingface.co/runwayml/stable-diffusion-v1-5
- Tile2Vec: https://github.com/ermongroup/tile2vec
N. Bai*, R. Iyer*, T. Oikarinen, A. Kulkarni, and T.-W. Weng, Interpreting Neurons in Deep Vision Networks with Language Models, TMLR 2025.
@article{
bai2025interpreting,
title={Interpreting Neurons in Deep Vision Networks with Language Models},
author={Bai, Nicholas and Iyer, Rahul Ajay and Oikarinen, Tuomas and Kulkarni, Akshay R. and Weng, Tsui-Wei},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=x1dXvvElVd}
}