Welcome to the XAI-Units package repository! This is a library to help benchmark and compare explainable AI feature attribution methods. It contains a collection of datasets and models with specific units of behaviour that are known to be challenging for feature attribution methods. The library also contains an end-to-end pipeline for applying feature attribution methods across a range of datasets/models, scoring them with metrics, then summarising the results.
Please also check out the associated paper "XAI-Units: Benchmarking Explainability Methods with Unit Tests" and visit our documentation page for additional information.
├── data/dinosaur_images
├── demo
│   ├── tutorials
│   └── Example
├── docs
└── src/xaiunits
    ├── datagenerator
    ├── methods
    ├── metrics
    ├── model
    ├── pipeline
    └── trainer
- Clone the repo.
- Create a virtual environment.
python -m venv ./venv
- Activate the virtual environment, then navigate to the root of the repo.
- Use the requirements.txt file to install the required packages with pip.
pip install -r requirements.txt
- You may wish to upgrade the installed version of PyTorch for GPU support (the official benchmark models were trained with pytorch-cuda=12.1).
- Install this library as a local editable package. Note that the full stop (.) at the end of the command refers to the current directory, i.e. the root of the repo.
pip install -e .
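As a quick sanity check that the editable install worked (an optional step, not part of the official guide), try importing the package from the activated environment:
python -c "import xaiunits; print('xaiunits imported successfully')"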
This folder contains real-world examples of experiments using our library that readers can reference. Follow the steps below:
- Step 1: Follow the installation guide above so you have an active venv with the required packages installed.
- Step 2: Change the working directory to the demo/example folder so that relative file paths remain intact.
cd ./demo/example
- Step 3: Select the appropriate Python script to reproduce the experiment of your choice (e.g. python3 tabular.py); a consolidated session is sketched after the list below.
- Tabular Experiment: tabular.py
- DeepLIFT Supplementary Experiment: deeplift_suppl_exp.py
- Image Experiment (CNN): image.py
- Image Experiment (ViT): image.py
- Image Experiment (Text): image.py
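Putting the steps together, a typical session might look like this (assuming a POSIX shell and the venv created during installation):
source ./venv/bin/activate
cd ./demo/example
python3 tabular.py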
Here we present a practical example of one of the library's main features: the Pipeline, which offers a simple, straightforward way to write end-to-end experiments with feature attribution explainability methods.
Necessary imports to run this code:
from xaiunits.model import ContinuousFeaturesNN
from xaiunits.datagenerator import WeightedFeaturesDataset
from captum.metrics import sensitivity_max, infidelity
from captum.attr import InputXGradient, IntegratedGradients, Lime
from xaiunits.pipeline import Pipeline
from xaiunits.metrics import perturb_standard_normal, wrap_metric
Select one of the multiple datasets in the library:
dataset = WeightedFeaturesDataset()
Select a model compatible with the dataset:
model = ContinuousFeaturesNN(n_features=dataset.n_features, weights=dataset.weights)
# alternatively use model = dataset.generate_model()
Add explainability methods of your choice to the list:
methods = [InputXGradient, IntegratedGradients, Lime]
Add the metrics you want to use to the list, using wrap_metric:
metrics = [
wrap_metric(sensitivity_max),
wrap_metric(infidelity, perturb_func=dataset.perturb_function(), normalize=True),
]
You can add as many models as you want to the list for the Pipeline to run:
models = [model]
Add as many datasets as you want to the list. Make sure models and datasets are compatible with each other:
datasets = [dataset]
Create the pipeline:
pipeline = Pipeline(models, datasets, methods, metrics, method_seeds=[10])
Use the features of the Pipeline:
results = pipeline.run() # apply the explanation methods and evaluate them
results.print_stats() # print results of the explainability methods and the metrics
df = results.data # access the full dataframe of results
For more on the usage of the models, datasets, methods, and metrics available, as well as other features of the library such as the AutoTrainer and the ExperimentPipeline, refer to the demo/tutorials folder (e.g. custom_methods_and_custom_datasets.ipynb) and to the documentation.
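Once the pipeline has run, the dataframe returned by results.data can also be post-processed with standard pandas calls; for example (an illustrative sketch, not part of the library's API, assuming results.data is a pandas DataFrame):
df = results.data
df.to_csv("pipeline_results.csv", index=False) # save the raw results for later analysis
print(df.head()) # inspect the first few rows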
The documentation uses Sphinx. For a local build of the documentation, ensure that requirements.txt has been installed (including Sphinx), then navigate to the docs folder and run the following command:
make html
Then access the documentation by opening the file docs/_build/html/index.html.
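Alternatively, if you prefer to browse the built documentation over a local web server rather than opening the file directly, Python's built-in server works (run from the repo root; the port is arbitrary), then visit http://localhost:8000 in a browser:
python -m http.server 8000 --directory docs/_build/html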
- WeightedFeaturesDataset: data_generation.py
- ConflictingDataset: conflicting.py
- PertinentNegativesDataset: pertinent_negatives.py
- ShatteredGradientsDataset: shattered_grad.py
- InteractingFeatureDataset: interacting_features.py
- UncertaintyAwareDataset: uncertainty_aware.py
- BooleanDataset: boolean.py
- BalancedImageDataset: image_generation.py
- ImbalancedImageDataset: image_generation.py
- TextDataset: text_dataset.py
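Each of the datasets above pairs with one of the models listed next. A minimal sketch of instantiating one (assuming the other tabular datasets follow the same default-constructor pattern as WeightedFeaturesDataset in the example above and also expose generate_model(); check each class for its actual parameters):
from xaiunits.datagenerator import ConflictingDataset

dataset = ConflictingDataset()  # assumed default constructor
model = dataset.generate_model()  # assumed to return the paired ConflictingFeaturesNN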
- DynamicNN: dynamic.py
- GenericNN: generic.py
- ContinuousFeaturesNN: continuous.py
- ConflictingFeaturesNN: conflicting.py
- PertinentNN: pertinent_negatives.py
- ShatteredGradientsNN: shattered_gradients.py
- InteractingFeaturesNN: interaction_features.py
- UncertaintyNN: uncertainty_model.py
- PropFormulaNN: boolean.py
- Pipeline: pipeline.py
- ExperimentPipeline: pipeline.py
- AutoTrainer: trainer.py
If you find our paper or code useful in your research, please consider citing the original work:
@inproceedings{10.1145/3715275.3732186,
author = {Lee, Jun Rui and Emami, Sadegh and Hollins, Michael David and Wong, Timothy C. H. and Villalobos S\'{a}nchez, Carlos Ignacio and Toni, Francesca and Zhang, Dekai and Dejl, Adam},
title = {XAI-Units: Benchmarking Explainability Methods with Unit Tests},
year = {2025},
isbn = {9798400714825},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3715275.3732186},
doi = {10.1145/3715275.3732186},
booktitle = {Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency},
pages = {2892–2905},
numpages = {14},
keywords = {explainable AI, feature attribution, neural networks, synthetic data, synthetic models, unit testing},
location = {Athens, Greece},
series = {FAccT '25}
}
