
feat: Add support for CFVAE #28

Merged
zkhotanlou merged 19 commits into charmlab:main from Chenghao-Tan:feat-CFVAE-Support
Nov 17, 2025

Conversation

@Chenghao-Tan
Contributor

Add support for CFVAE

Reproducibility

Using the binaries the author provides, the results can be replicated in the format of their metrics; see reproduce.py. All binaries are bundled and loaded via importlib.resources, with the importlib_resources backport installed on Python <= 3.8.
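A minimal sketch of that loading pattern, with the version-dependent import; the package path and file name passed to load_bundled here are illustrative, not the PR's actual ones:

```python
import sys

# Use the stdlib importlib.resources on Python >= 3.9 (where files() exists),
# otherwise fall back to the importlib_resources backport.
if sys.version_info >= (3, 9):
    from importlib import resources
else:
    import importlib_resources as resources  # backport for Python <= 3.8


def load_bundled(package: str, name: str) -> bytes:
    # files() returns a Traversable rooted at the package directory; joining
    # and read_bytes() work both for on-disk and zipped packages.
    return (resources.files(package) / name).read_bytes()
```
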

Implementation

Plenty of details in the author's code differ from the paper or make little sense. The important ones are marked with # Quirk:

  • In _CFVAE, there is a strange "0.5+" in encoder().
  • The continuous-feature reconstruction loss is scaled by the original numeric range, which makes no sense and is not mentioned in the paper (removed in this implementation, since the built-in datasets are normalized and have already lost this information).
  • A form-loss forcing each categorical feature's one-hot values to sum to 1, not mentioned in the paper (removed in this implementation, since regrouping the one-hot categories by column name is fragile and the form-loss is undefined for ordinal one-hot categories).
  • Categorical counterfactuals are not rounded to integers in the original code (rounded in this implementation to meet the framework's requirements and for a fair comparison).
  • The multi-sampling VAE is not mentioned in the paper.
  • The original code uses Adam instead of SGD (this implementation uses SGD, as in the paper).
  • Non-functional operations such as denormalization (removed as much as possible in this implementation).
  • Hyperparameters (kept as close to the paper's values as possible, though some are missing or unclear).
  • This is a 3-in-1 implementation of ModelBasedCF, ModelApproxCF and ExampleBasedCF. Unlike the original code, which implements them separately, using all losses together is allowed. Users can supply their own constraint_loss_func (an example is provided as a static method) and preference_dataset; instructions are in the function comments.

Unless marked otherwise, quirks are kept AS IS; this PR tries its best to replicate the original code's behaviour. In addition, this PR fixed many unintelligible or wrong variable names, added vital comments, and cleaned up the code as much as possible.
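As a hypothetical illustration of what a user-supplied constraint loss could look like, assuming it receives factual and counterfactual batches as torch tensors (the exact signature CFVAE expects is documented in the function comments, and may differ):

```python
import torch

def monotonic_increase_loss(x: torch.Tensor, x_cf: torch.Tensor,
                            feature_idx: int = 0) -> torch.Tensor:
    # Illustrative causal constraint: penalize counterfactuals that decrease
    # a feature which should only increase (e.g. age), via a hinge on the
    # negative change. Differentiable, so it can join the other losses.
    delta = x_cf[:, feature_idx] - x[:, feature_idx]
    return torch.relu(-delta).mean()
```
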

Minor notes

  • Apparently the biggest obstacle to bumping to a newer Python is TensorFlow 1.x -> 2.x compatibility; consider solving that. This implementation alone supports Python 3.12 with no problems.
  • This PR modified setup.py and requirements-dev.txt to add the tqdm dependency, and changed torch to the corresponding CUDA-supported version in requirements-dev.txt.
  • The original benchmark framework is not very friendly to gradient-based methods. This PR added forward() to mlmodel (the PyTorch target model) to support gradient propagation in an easy-to-understand style. Note that to protect the computation graph, mlmodel's device is automatically set to match the input x. This is adaptive but less efficient, yet probably the most elegant approach until the benchmark framework is refactored.
  • autoencoder should not be an abstraction, since not all VAE structures match CCHVAE's. This PR decoupled CFVAE from the prebuilt autoencoder.
  • methods.processing.reconstruct_encoding_constraints is faulty and only supports regrouping binary categorical classes. This PR avoids using it.
  • methods.processing.check_counterfactuals is faulty and cannot handle a batch with differing negative_label values. This PR avoids using it.
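The device-matching forward() described above can be sketched as follows; the class and attribute names (MLModelSketch, self._model) are illustrative stand-ins for the framework's actual model wrapper:

```python
import torch
import torch.nn as nn

class MLModelSketch:
    """Illustrative wrapper around a PyTorch target model."""

    def __init__(self, raw_model: nn.Module):
        self._model = raw_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Move the model to the input's device rather than the input to the
        # model's: detaching/moving x would sever the computation graph that
        # gradient-based recourse methods need. Adaptive but less efficient.
        self._model.to(x.device)
        return self._model(x)
```
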

@Chenghao-Tan
Contributor Author

This PR is tested by the below script and works well:

from data.catalog import DataCatalog
from evaluation import Benchmark
import evaluation.catalog as evaluation_catalog
from models.catalog import ModelCatalog
from random import seed
from methods import CFVAE

RANDOM_SEED = 54321
seed(
    RANDOM_SEED
)  # set the random seed so that the random permutations can be reproduced again

# load a catalog dataset
data_name = "adult"
dataset = DataCatalog(data_name, "mlp", 0.8)

# load artificial neural network from catalog
model = ModelCatalog(dataset, "mlp", "pytorch")

# get factuals from the data to generate counterfactual examples
factuals = (dataset._df_train).sample(n=10, random_state=RANDOM_SEED)

# load a recourse model and pass black box model
cfvae = CFVAE(model)

# generate counterfactual examples
counterfactuals = cfvae.get_counterfactuals(factuals)

# Generate Benchmark for recourse method, model and data
benchmark = Benchmark(model, cfvae, factuals)
evaluation_measures = [
    evaluation_catalog.YNN(benchmark.mlmodel, {"y": 5, "cf_label": 1}),
    evaluation_catalog.Distance(benchmark.mlmodel),
    evaluation_catalog.SuccessRate(),
    evaluation_catalog.Redundancy(benchmark.mlmodel, {"cf_label": 1}),
    evaluation_catalog.ConstraintViolation(benchmark.mlmodel),
    evaluation_catalog.AvgTime({"time": benchmark.timer}),
]
df_benchmark = benchmark.run_benchmark(evaluation_measures)
print(df_benchmark)

@Chenghao-Tan
Contributor Author

Run the reproduce module with python -m methods.catalog.cfvae.reproduce or a similar invocation.

Collaborator

@zkhotanlou zkhotanlou left a comment

Could you please resolve the reproduction assertion to ensure the script runs as a unit test and aligns with the paper’s reported results?

@Chenghao-Tan
Contributor Author

  • pytest support is added (both the legacy run command and pytest work).
  • results.csv is updated.

This update also includes a few behaviour changes:

  • The continuous-feature reconstruction loss is now scaled by the original numeric range (as in the original code).
  • The form-loss forcing categorical features to sum to 1 is restored (as in the original code; ordinal features are not affected).

This implementation is now even more faithful to the original code.
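The two restored behaviours can be sketched as below, under assumed tensor layouts (continuous columns in one tensor, each categorical feature as its own one-hot group); the PR's actual loss code is more involved:

```python
import torch

def recon_loss_continuous(x: torch.Tensor, x_hat: torch.Tensor,
                          feat_range: torch.Tensor) -> torch.Tensor:
    # Squared reconstruction error per continuous feature, scaled by that
    # feature's original numeric range (as in the original code).
    return (((x - x_hat) ** 2) / feat_range).sum(dim=1).mean()

def form_loss_categorical(x_hat_groups: list) -> torch.Tensor:
    # Penalize each one-hot group (batch, n_categories) whose reconstructed
    # values do not sum to 1.
    loss = torch.tensor(0.0)
    for group in x_hat_groups:
        loss = loss + ((group.sum(dim=1) - 1.0) ** 2).mean()
    return loss
```
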

Collaborator

@zkhotanlou zkhotanlou left a comment

Thanks for your changes, the unit tests pass correctly now! Could you resolve the merge conflicts with the new changes on the main branch, and also run the pre-commit hooks so that both checks get approved?

Contributor Author

@Chenghao-Tan Chenghao-Tan Nov 10, 2025

This commit drops torch's GPU support since pip install -r requirements-dev.txt in pre-commit won't read -f https://download.pytorch.org/whl/torch_stable.html and therefore cannot find torch==1.7.0+cu110.

This is a step backwards. Please consider modifying .github/workflows/pre-commit.yaml later to add -f xxx after pip install xxx, or to simply install the torch+cu version manually.

@zkhotanlou
Collaborator

This is an implementation of the "CFVAE" [1] recourse method. The reproduction is at level 1, as the unit tests check that the implementation reproduces the results reported in the paper for the adult dataset on a neural network.

[1] Preserving causal constraints in counterfactual explanations for machine learning classifiers
D Mahajan, C Tan, A Sharma - arXiv preprint arXiv:1912.03277, 2019.

@zkhotanlou zkhotanlou merged commit 36c72aa into charmlab:main Nov 17, 2025
1 check passed