Vision Interpretability: Decoding CNNs

A comprehensive, interactive deep dive into how Convolutional Neural Networks (CNNs) "see" the world, told through four tutorial notebooks that run from CNN fundamentals to advanced feature visualization.

📓 Notebooks

Segment 1: CNN Basics & Interpretability

Open In Colab

Topics:

  • Image tensors & convolution mathematics
  • Training a simple CNN on ImageNette
  • Filter & feature map visualization
  • Saliency maps (vanilla gradients; see the sketch below)
  • Grad-CAM class activation mapping

Features: ✅ Auto-setup for Colab | ✅ LaTeX formulas | ✅ Research references
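
The saliency map above needs only one backward pass: take the gradient of the class score with respect to the input pixels and reduce over channels. A minimal PyTorch sketch (the function and variable names are illustrative, not the notebook's exact API):

import torch

def vanilla_saliency(model, image, target_class):
    # image: (C, H, W) input tensor; returns an (H, W) saliency map
    model.eval()
    x = image.detach().clone().unsqueeze(0).requires_grad_(True)
    score = model(x)[0, target_class]  # class logit y^c
    score.backward()                   # fills x.grad with dy^c/dx
    return x.grad.abs().max(dim=1)[0].squeeze(0)  # channel-wise max of |grad|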


Segment 2: Activation Maximization

Open In Colab

Topics:

  • Gradient ascent optimization for feature visualization (see the sketch below)
  • Reproducing Distill.pub Circuits research
  • FFT vs. pixel parameterization
  • Total variation & L2 regularization

Features: ✅ Uses torch-lucent library | ✅ Self-contained (no local deps) | ✅ Publication-quality visuals
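
The gradient-ascent recipe above takes only a few lines with torch-lucent. A minimal sketch using the library's standard API (the layer/channel pair "mixed4a", 476 is an arbitrary illustrative choice):

import torch
from lucent.optvis import render, param, objectives
from lucent.modelzoo import inceptionv1

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = inceptionv1(pretrained=True).to(device).eval()

# FFT-parameterized, decorrelated image; set fft=False to compare
# against a plain pixel parameterization.
param_f = lambda: param.image(224, fft=True, decorrelate=True)

# Gradient ascent on channel 476 of layer mixed4a.
images = render.render_vis(model, objectives.channel("mixed4a", 476), param_f)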


Segment 3: Dataset Examples & Activation Spectrum

Open In Colab

Topics:

  • Finding dataset examples across the activation spectrum (see the sketch below)
  • Minimum, slightly negative, slightly positive, and maximum examples
  • Distill.pub-style 6-column visualization layout

Features: ✅ Streaming ImageNet | ✅ W&B logging | ✅ Publication-quality Distill.pub visuals
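
Under the hood, ranking dataset examples only requires recording each image's activation for the target unit via a forward hook. A hedged sketch of the idea (an illustrative helper, not the module's actual ActivationSpectrumTrackerV2 API):

import torch

def channel_activation_scores(model, layer, loader, channel, device="cpu"):
    # One scalar per image: the spatial mean of the channel's activation,
    # used to sort examples from minimum to maximum activation.
    scores = []
    hook = layer.register_forward_hook(
        lambda mod, inp, out: scores.append(out[:, channel].mean(dim=(1, 2)).cpu())
    )
    with torch.no_grad():
        for images, _ in loader:
            model(images.to(device))
    hook.remove()
    all_scores = torch.cat(scores)
    return all_scores, all_scores.argsort()  # ascending: min ... max examples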


Segment 3b: Faccent Optimization

Open In Colab

Topics:

  • Feature visualization with the Faccent library
  • Advanced optimization techniques
  • Class activation mapping (CAM; see the sketch below)

Features: ✅ Faccent library | ✅ Advanced parametrization | ✅ CAM visualization
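
Classic CAM (Zhou et al., 2016) is compact enough to state directly: weight the last conv layer's feature maps by the target class's final-layer weights and sum over channels. A conceptual plain-PyTorch sketch, not the Faccent API:

import torch
import torch.nn.functional as F

def class_activation_map(features, fc_weight, target_class):
    # features:  (C, H, W) activations from the last conv layer
    # fc_weight: (num_classes, C) weights of the final linear layer
    cam = torch.einsum("c,chw->hw", fc_weight[target_class], features)
    cam = F.relu(cam)                # keep positive evidence only
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-8)  # normalize to [0, 1]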

🚀 Quick Start

Option 1: Google Colab (Recommended)

Click any Colab badge above and run all cells. Setup is automatic!

Option 2: Local Setup

Requires Python 3.13+ and uv

# Clone the repository
git clone https://github.com/cataluna84/VisionInterpretability.git
cd VisionInterpretability

# Install dependencies
uv sync

# Start Jupyter
uv run jupyter lab

Then open any of the notebooks in notebooks/.

πŸ“ Project Structure

VisionInterpretability/
β”œβ”€β”€ notebooks/
β”‚   β”œβ”€β”€ cataluna84__segment_1_intro.ipynb           # Part 1: CNN Basics
β”‚   β”œβ”€β”€ cataluna84__segment_2_activation_max.ipynb  # Part 2: Feature Viz
β”‚   β”œβ”€β”€ cataluna84__segment_3_dataset_images.ipynb  # Part 3: Dataset Examples
β”‚   β”œβ”€β”€ cataluna84__segment_3_faccent.ipynb         # Part 3b: Faccent Optimization
β”‚   β”œβ”€β”€ lucent/                     # Lucent tutorial notebooks
β”‚   β”‚   β”œβ”€β”€ tutorial.ipynb          # Getting started with Lucent
β”‚   β”‚   β”œβ”€β”€ activation_grids.ipynb  # Activation grid visualizations
β”‚   β”‚   β”œβ”€β”€ diversity.ipynb         # Feature diversity analysis
β”‚   β”‚   β”œβ”€β”€ feature_inversion.ipynb # Feature inversion techniques
β”‚   β”‚   β”œβ”€β”€ GAN_parametrization.ipynb   # GAN-based parametrization
β”‚   β”‚   β”œβ”€β”€ neuron_interaction.ipynb    # Neuron interaction analysis
β”‚   β”‚   β”œβ”€β”€ style_transfer.ipynb    # Neural style transfer
β”‚   β”‚   └── modelzoo.ipynb          # Model zoo examples
β”‚   β”œβ”€β”€ results/                    # Notebook output artifacts
β”‚   └── wandb/                      # W&B experiment logs
β”œβ”€β”€ src/segment_1_intro/            # Python modules (for Segment 1)
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ data.py       # ImageNette dataset loading
β”‚   β”œβ”€β”€ models.py     # SimpleCNN, InceptionV1, training
β”‚   └── visualize.py  # Grad-CAM, Saliency Maps, plotting
β”œβ”€β”€ src/segment_3_dataset_images/   # Python modules (for Segment 3)
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ activation_pipeline.py  # Activation extraction, spectrum tracking
β”‚   β”œβ”€β”€ visualization.py        # Distill.pub style plotting
β”‚   └── faccent/                # Feature visualization library
β”‚       β”œβ”€β”€ cam.py              # Class activation mapping
β”‚       β”œβ”€β”€ mask.py             # Masking utilities
β”‚       β”œβ”€β”€ objectives.py       # Optimization objectives
β”‚       β”œβ”€β”€ param.py            # Image parameterization
β”‚       β”œβ”€β”€ render.py           # Rendering engine
β”‚       β”œβ”€β”€ transform.py        # Image transforms
β”‚       β”œβ”€β”€ utils.py            # Utility functions
β”‚       └── modelzoo/           # Pretrained model loaders
β”‚           └── inceptionv1/    # InceptionV1 model
β”œβ”€β”€ scripts/                    # Notebook enhancement scripts
β”‚   β”œβ”€β”€ add_circuit_visualization.py
β”‚   β”œβ”€β”€ add_colab_support_seg3.py
β”‚   β”œβ”€β”€ add_data_dir_param.py
β”‚   β”œβ”€β”€ add_device_definition.py
β”‚   β”œβ”€β”€ add_performance_docs.py
β”‚   β”œβ”€β”€ add_plotly_setup.py
β”‚   β”œβ”€β”€ add_setup_cell_seg2.py
β”‚   β”œβ”€β”€ add_wandb_chart.py
β”‚   β”œβ”€β”€ analyze_flow.py
β”‚   β”œβ”€β”€ analyze_notebook_structure.py
β”‚   β”œβ”€β”€ check_gpu.py
β”‚   β”œβ”€β”€ complete_restructure.py
β”‚   β”œβ”€β”€ enhance_notebook_theory.py
β”‚   β”œβ”€β”€ fix_animate_sequence.py
β”‚   β”œβ”€β”€ update_notebook.py
β”‚   └── update_notebook_distill.py
β”œβ”€β”€ data/                       # Dataset files
β”‚   β”œβ”€β”€ imagenette2-320/        # ImageNette dataset
β”‚   └── segment_3_test_images/  # Test images for Segment 3
β”œβ”€β”€ docs/                       # Documentation
└── pyproject.toml              # Dependencies (UV)

📦 Python Modules (Segments 1 & 3)

segment_1_intro.data

from segment_1_intro import data

train_loader = data.load_imagenette(split="train", batch_size=32)
classes = data.IMAGENETTE_CLASSES  # 10 ImageNet classes

segment_1_intro.models

from segment_1_intro import models

model = models.load_simple_cnn(num_classes=10)  # SimpleCNN with 10 output classes
history = models.train_model(model, train_loader, val_loader, epochs=5)  # training history

segment_1_intro.visualize

from segment_1_intro import visualize

# Saliency map
saliency = visualize.compute_saliency_map(model, image, target_class=3)

# Grad-CAM
gradcam = visualize.GradCAM(model, model.conv3)
heatmap = gradcam(image, target_class=3)

segment_3_dataset_images

from segment_3_dataset_images import (
    ActivationSpectrumTrackerV2,
    FeatureOptimizer,
    plot_neuron_spectrum_distill,
)

tracker = ActivationSpectrumTrackerV2(num_neurons=10, samples_per_category=9)
optimizer = FeatureOptimizer(model)
# Distill-style 6-column figure for neuron 0 of layer mixed4a
fig = plot_neuron_spectrum_distill(
    neuron_idx=0,
    layer_name="mixed4a",
    spectrum=tracker.get_spectrum(0),
    optimized_img=optimizer.optimize_neuron("mixed4a", 0),
    negative_optimized_img=optimizer.optimize_neuron_negative("mixed4a", 0),
)

📊 Dependencies

Core (All Segments)

  • PyTorch >= 2.5.0
  • torchvision >= 0.20.0
  • matplotlib >= 3.9.0
  • numpy >= 2.0.0

Segment 1 Specific

  • opencv-python >= 4.13.0
  • scikit-learn >= 1.5.0
  • tqdm >= 4.66.0

Segment 2 Specific

  • torch-lucent >= 0.1.8 – Feature visualization library (PyTorch port of Lucid)

Segment 3 Specific

  • torch-lucent >= 0.1.8 – Feature visualization
  • wandb >= 0.18.0 – Experiment tracking

🎯 What You'll Learn

| Section | Notebook | Key Concepts |
| --- | --- | --- |
| Image Representation | Segment 1 | Tensors $(C, H, W)$, normalization |
| Convolutions | Segment 1 | Kernels, stride, padding, formulas |
| CNN Training | Segment 1 | SimpleCNN on ImageNette |
| Feature Maps | Segment 1 | Layer activations, what CNNs detect |
| Saliency Maps | Segment 1 | $S = \lvert \nabla_x y^c \rvert$ |
| Grad-CAM | Segment 1 | $L^c = \text{ReLU}\left(\sum_k \alpha_k^c A^k\right)$ |
| Activation Max | Segment 2 | Gradient ascent, FFT parameterization |
| Feature Viz | Segment 2 | Reproducing Distill.pub Circuits |
| Dataset Examples | Segment 3 | Activation spectrum, min/max/near-threshold |
| Distill.pub Layout | Segment 3 | 6-column visualization |
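
For reference, the Grad-CAM weights $\alpha_k^c$ in the table are the global-average-pooled gradients of the class score $y^c$ with respect to feature map $A^k$:

$$\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}}$$

where $Z$ is the number of spatial locations in $A^k$.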

📜 License

MIT License
