feat: Add support for CFRL (and dataset/dependency fixes) by Chenghao-Tan · Pull Request #36 · charmlab/recourse_benchmarks

Chenghao-Tan · 2025-11-19T22:15:59Z

Add support for CFRL

Reproducibility

Level-1 reproducibility is achieved with this implementation (Random Forest on Adult), and I'm personally confident about raising it to level-2 (by adding more target models).

Implementation

This implementation is largely based on alibi, SeldonIO's official repo, especially the RL agent part, which is located in cfrl_*.py. However, the autoencoder part (based on the Adult AE in the paper) is mostly written from scratch and is located in model.py. The authors didn't provide seeds.

Here are the details:

The code used by CFRL is aggregated into three files, for simplicity.
Since the official dependencies differ considerably from ours, tensorflow support (which is easily affected by version changes) is removed.
Official PyTorch support needs torch>=1.8. For compatibility with torch<1.8, nn.LazyLinear is backported using nn.Linear, and input_dim is inferred from a forward pass, since there might not be a closed-form value due to arbitrary operations in the post-processing. The backport works as a fallback: if you stop passing input_dim and nn.LazyLinear support is detected, this implementation will still use nn.LazyLinear.
To support the feature constraints and stay more faithful to the official code, a data format adaptor was introduced. See model.py -> _ordered_to_cfrl() and _cfrl_to_ordered(). It converts onehot+normalized data to raw data (and back). It reads metadata by running loadDataset() again (since metadata is dropped by DataCatalog). A unit test / visualizer is provided, which supports both python -m pytest methods/catalog/cfrl/dataset_adaptor_test.py (unit test) and python -m methods.catalog.cfrl.dataset_adaptor_test (visualize). Feel free to test it on different datasets and port it to other methods that require raw data / metadata.
The hyperparameters used are the ones from the paper, not the ones from the example on SeldonIO's documentation website.
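
To illustrate the adaptor idea above, here is a minimal sketch of a onehot+normalized <-> raw round trip. It is not the code in model.py: the metadata layout, the feature names, and the functions raw_to_ordered() / ordered_to_raw() are invented for this example, and min-max scaling is an assumption.

```python
from typing import Dict, List, Union

# Hypothetical metadata, invented for this sketch (not the repository's format).
NUMERIC = {"age": (17.0, 90.0), "hours_per_week": (1.0, 99.0)}  # name -> (min, max)
CATEGORICAL = {"workclass": ["private", "government", "self_employed"]}

Raw = Dict[str, Union[float, str]]


def raw_to_ordered(row: Raw) -> List[float]:
    """Encode a raw row as min-max normalized numerics followed by one-hot blocks."""
    out: List[float] = []
    for name, (lo, hi) in NUMERIC.items():
        out.append((float(row[name]) - lo) / (hi - lo))
    for name, categories in CATEGORICAL.items():
        out.extend(1.0 if row[name] == c else 0.0 for c in categories)
    return out


def ordered_to_raw(vec: List[float]) -> Raw:
    """Invert raw_to_ordered: denormalize numerics, take the argmax of each one-hot block."""
    row: Raw = {}
    i = 0
    for name, (lo, hi) in NUMERIC.items():
        row[name] = vec[i] * (hi - lo) + lo
        i += 1
    for name, categories in CATEGORICAL.items():
        block = vec[i : i + len(categories)]
        # argmax also tolerates "soft" one-hot blocks produced by a decoder
        row[name] = categories[block.index(max(block))]
        i += len(categories)
    return row


sample = {"age": 35.25, "hours_per_week": 50.0, "workclass": "government"}
encoded = raw_to_ordered(sample)
decoded = ordered_to_raw(encoded)
```

Decoding each categorical block with an argmax means the inverse direction still yields a valid category when the input is not an exact one-hot vector, which is one plausible reason to keep the metadata around.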

Fixes

This PR also does some general fixes for the framework's base components.

At some point, loadData.py was moved from /data to /data/catalog. Old cached onehot dataset pickles embedded the path of /data/loadData.py, which is no longer valid, so any attempt to load a dataset from such a pickle falls back to loading the raw data, which is no longer in /data either (it is in /data/raw_data now). The load then fails and stops run_experiments.py. With this PR, all affected dataset loading scripts (adult, compass, credit) are fixed and all cached dataset pickles are updated.
run_experiments.py's comments and default CLI args are now more consistent. By default it runs all supported methods, avoiding unexpected behaviour.
Fixed the incompatibility between werkzeug, itsdangerous and Flask==1.1.2 (which can be triggered by a pytest run). pytest should now run out of the box. Dependencies are also updated to make setup.py (install as a package) and requirements-dev.txt (manual install) more consistent.
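
As background for the stale-pickle fix above, here is a small sketch of why a cached pickle breaks when the defining module moves: pickle stores the import path of a class, not the class itself. The module name old_location_loader, the Dataset class, and load_cached_or_rebuild() are all hypothetical and only stand in for the real loader.

```python
import pickle
import sys
import types

# Hypothetical module and class names, invented for this sketch.
old_module = types.ModuleType("old_location_loader")
Dataset = type("Dataset", (object,), {})
# Pretend the class lives in the old module, as it did when the cache was written.
Dataset.__module__ = "old_location_loader"
old_module.Dataset = Dataset
sys.modules["old_location_loader"] = old_module

instance = Dataset()
instance.name = "adult"
cached = pickle.dumps(instance)

# The pickle records the import path of the class, not the class body.
path_is_embedded = b"old_location_loader" in cached

# Simulate moving the file: the old import path no longer resolves.
del sys.modules["old_location_loader"]


def load_cached_or_rebuild(blob, rebuild):
    """Try the cached pickle first; rebuild from raw data if its import path is stale."""
    try:
        return pickle.loads(blob), "cache"
    except (ModuleNotFoundError, AttributeError):
        return rebuild(), "rebuilt"


obj, source = load_cached_or_rebuild(cached, lambda: {"name": "adult"})
```

In this sketch the fallback succeeds. In the situation described above the raw data had also moved, so the fallback failed as well, which is why both the loading scripts and the cached pickles needed updating.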

Trivials

Note that with seed=54321 (run_experiments.py's default), when running only the credit dataset with the linear classifier, the classifier predicts 1 for every instance, so there are no negative cases for recourse and credit is skipped (the current design only does 0->1 flips). This is possibly the reason why run_experiments.py skips datasets. Although a re-run may behave normally, this "normal" behaviour can undermine experiment reproducibility. (Pseudo-random numbers depend on the generation order, even when the random seed is fixed.)
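
The order dependence mentioned above can be shown with the standard library alone. This is only an illustration of the general effect, not the code path in run_experiments.py; the consumer names "split" and "init" and the draw() helper are made up.

```python
import random


def draw(order, seed=54321):
    """Draw one number per consumer, in the given order, from a single seeded stream."""
    rng = random.Random(seed)
    return {name: rng.random() for name in order}


first = draw(["split", "init"])
second = draw(["init", "split"])   # same seed, different consumption order
repeat = draw(["split", "init"])   # same seed, same order

same_order_matches = first == repeat
# The first number of the stream goes to whichever consumer asks first.
first_draw_moves = first["split"] == second["init"]
order_changes_values = first["split"] != second["split"]
```

With the same seed and the same order the values repeat exactly, but swapping the order hands each consumer a different number, so running a subset of datasets or methods can change what every later step sees.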


zkhotanlou

Thank you for this complete implementation and the helpful fixes. Please address the remaining minor comments; otherwise it's ready to merge.

requirements-dev.txt

methods/catalog/cfrl/reproduce.py

methods/catalog/cfrl/dataset_adaptor_test.py


zkhotanlou · 2025-11-22T23:59:55Z

This is an implementation of the "CFRL"[1] recourse method. The reproduction is at level 1, as the unit tests check that the implementation reproduces the results reported in the paper for Random Forest on the Adult dataset.

[1] Samoilescu, R. F., Van Looveren, A., & Klaise, J. (2021). Model-agnostic and scalable counterfactual explanations via reinforcement learning. arXiv preprint arXiv:2106.02597.

Chenghao Tan and others added 17 commits November 1, 2025 23:51

feat: Initial support for CFRL

9204620

fix: Replicate official Keras autoencoder with PyTorch

0e3ddd2

fix: Backport LazyLinear; Fix explain format; Add simple reproduce.py…

6731423

… from alibi documents

Merge branch 'charmlab:main' into feat--CFRL-Support

dad7292

fix: Fix logging (cfvae&cfrl)

b71b949

feat: Add reproduce.py

5d77ad6

Merge branch 'main' into feat--CFRL-Support

e255b19

fix: Use target_steps for autoencoder training as in the paper

79c524c

fix: Fix CFRL compatibility with sklearn/xgboost/tensorflow model und…

0d4319f

…er torch<1.8

fix: Fix CFRL check_counterfactuals

0121230

fix: Fix datasets loading

841a464

feat: Add CFRL results

81fc70f

fix: Fix incompatibility between werkzeug, itsdangerous and Flask==1.…

7a04ecd

…1.2 (triggered by pytest)

feat: Add pytest for CFRL

94c3ad8

chore: CFRL pre-commit format change

d6a41e0

Merge remote-tracking branch 'origin/main' into feat--CFRL-Support

8bb79bc

chore: CFRL pre-commit format change

984004b

zkhotanlou reviewed Nov 21, 2025

requirements-dev.txt

methods/catalog/cfrl/reproduce.py

methods/catalog/cfrl/dataset_adaptor_test.py

Chenghao Tan and others added 2 commits November 22, 2025 13:14

fix: (CFRL) Replace DummyModel with ModelCatalog in dataset_adaptor_t…

6fbc427

…est.py

Merge branch 'main' into feat--CFRL-Support

9d136e1

zkhotanlou merged commit 7140449 into charmlab:main Nov 23, 2025
1 check passed

zkhotanlou merged 19 commits into charmlab:main from Chenghao-Tan:feat--CFRL-Support
