This PR is tested by the script below and works well:

```python
from random import seed

import evaluation.catalog as evaluation_catalog
from data.catalog import DataCatalog
from evaluation import Benchmark
from methods import CFVAE
from models.catalog import ModelCatalog

# Set the random seed so that the random permutations can be reproduced.
RANDOM_SEED = 54321
seed(RANDOM_SEED)

# Load a catalog dataset.
data_name = "adult"
dataset = DataCatalog(data_name, "mlp", 0.8)

# Load an artificial neural network from the catalog.
model = ModelCatalog(dataset, "mlp", "pytorch")

# Get factuals from the data to generate counterfactual examples for.
factuals = dataset._df_train.sample(n=10, random_state=RANDOM_SEED)

# Load a recourse method and pass it the black-box model.
cfvae = CFVAE(model)

# Generate counterfactual examples.
counterfactuals = cfvae.get_counterfactuals(factuals)

# Benchmark the recourse method on this model and data.
benchmark = Benchmark(model, cfvae, factuals)
evaluation_measures = [
    evaluation_catalog.YNN(benchmark.mlmodel, {"y": 5, "cf_label": 1}),
    evaluation_catalog.Distance(benchmark.mlmodel),
    evaluation_catalog.SuccessRate(),
    evaluation_catalog.Redundancy(benchmark.mlmodel, {"cf_label": 1}),
    evaluation_catalog.ConstraintViolation(benchmark.mlmodel),
    evaluation_catalog.AvgTime({"time": benchmark.timer}),
]
df_benchmark = benchmark.run_benchmark(evaluation_measures)
print(df_benchmark)
```
…t feature-format loss
zkhotanlou left a comment:
Could you please resolve the reproduction assertion, so that the script runs as a unit test and matches the paper's reported results?
This update also includes a few behaviour changes:
Now this implementation is even more faithful to the original code.
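The reproduction assertion being requested could be sketched roughly as below. The metric names, reported values, and tolerance here are placeholders, not the paper's actual CFVAE numbers; a real test would compare the output of `Benchmark.run_benchmark()` against the paper's reported table.

```python
def assert_close_to_reported(measured: dict, reported: dict,
                             atol: float = 0.05) -> None:
    """Check each benchmark metric's mean against a paper-reported value."""
    for metric, expected in reported.items():
        values = measured[metric]
        mean = sum(values) / len(values)
        assert abs(mean - expected) <= atol, (
            f"{metric}: measured {mean:.3f}, paper reports {expected:.3f}"
        )

# Toy usage with made-up numbers (stand-ins, not CFVAE results):
assert_close_to_reported(
    {"Success_Rate": [1.0, 1.0, 0.9]},
    {"Success_Rate": 0.97},
)
```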
zkhotanlou left a comment:
Thanks for your changes, the unit tests pass correctly now! Could you resolve the merge conflicts with the new changes on the main branch, and please also run the pre-commit hooks so that both checks get approved.
This commit drops torch's GPU support, since pip install -r requirements-dev.txt in pre-commit won't read -f https://download.pytorch.org/whl/torch_stable.html and therefore cannot find torch==1.7.0+cu110.
This is a backwards-incompatible change. Please consider modifying .github/workflows/pre-commit.yaml later, either to add -f xxx after pip install xxx, or to simply install the torch+cu version manually.
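The first suggested fix amounts to passing the find-links URL straight to pip in the workflow's install step; the exact layout of pre-commit.yaml is not shown in this thread, so this is only a sketch of the command:

```shell
# Give pip the extra wheel index so it can resolve torch==1.7.0+cu110;
# without -f, only PyPI is searched and the +cu110 build is not found.
pip install -r requirements-dev.txt \
    -f https://download.pytorch.org/whl/torch_stable.html
```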
This is an implementation of the "CFVAE" [1] recourse method. The level of reproduction is level 1, as the unit tests check that the implementation reproduces the results reported in the paper for the adult dataset on a neural network.
[1] Preserving causal constraints in counterfactual explanations for machine learning classifiers
Add support for CFVAE
Reproducibility
Using the binaries the author provides, the results can be replicated in the format of their metrics; see reproduce.py. All binaries are embedded with importlib, which is backported to Python <= 3.8 by installing importlib_resources.
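The version shim described above can be sketched as follows; the package and file names in the helper are illustrative, not the actual CFVAE artifact paths used by reproduce.py.

```python
import sys

# importlib.resources.files() is available from Python 3.9; older
# interpreters fall back to the importlib_resources backport.
if sys.version_info >= (3, 9):
    from importlib import resources
else:
    import importlib_resources as resources  # pip install importlib_resources

def load_binary(package: str, filename: str) -> bytes:
    # Read a binary file bundled inside an installed package.
    return resources.files(package).joinpath(filename).read_bytes()
```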
Implementation
There are plenty of details in the author's code that differ from the paper or make little sense. Important ones are marked with # Quirk.
Unless marked otherwise, quirks are kept as is; this PR tries its best to replicate the original code's behaviour. Besides that, this PR fixes many confusing or wrong variable names, adds vital comments, and cleans up the code as much as possible.
Trivia