PGM-Lab/2026-PGM-data-uncertainty


Accounting for Data Uncertainty in Counterfactual Reasoning

This bundle contains the manuscript submitted to PGM-2026, entitled "Accounting for Data Uncertainty in Counterfactual Reasoning". It is organised as follows:

  • examples: a toy example that runs the method proposed in the paper.
  • ctfzeros: Python sources implementing the method.
  • models: the structural causal models, in UAI format, used in the experiments.
  • requirements.txt: code dependencies.

Setup

First, check the Python version. These sources were developed with the following Python version:

!python --version
Python 3.11.13
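
The same check can be made from inside Python. The following is a minimal sketch; the helper name `matches_reference` and the choice to compare only the major and minor numbers are assumptions made for illustration, not part of this repository:

```python
import sys

# Version reported in this README, used here as the reference (major, minor).
REFERENCE_VERSION = (3, 11)

def matches_reference(version_info=sys.version_info, reference=REFERENCE_VERSION):
    """Return True when the (major, minor) part of version_info equals the reference."""
    return tuple(version_info[:2]) == tuple(reference)

if not matches_reference():
    # Only a warning: other versions may still work, but they are untested here.
    print(
        f"Warning: running Python {sys.version_info[0]}.{sys.version_info[1]}, "
        f"but the sources were developed with "
        f"{REFERENCE_VERSION[0]}.{REFERENCE_VERSION[1]}."
    )
```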

Then, install the dependencies listed in the requirements.txt file. The main dependency is the Python package bcause (https://github.com/PGM-Lab/bcause).

!pip install --upgrade pip setuptools wheel
!pip install -r ./requirements.txt
!pip install polytope~=0.2.5 --no-deps

Model and data

This repository provides functionality for preprocessing the model and data so that they work with our inference algorithms:

from ctfzeros.prepro import load_and_preprocess
filepath = "./models/simple_nparents1_nzr10_zdr00_0.uai"
datapath = "./models/simple_nparents1_nzr10_zdr00_0.csv"

model, data, _, _ = load_and_preprocess(filepath, datapath)
model
<StructuralCausalModel (Y:2,X1:2|Uy:4,Ux1:2), dag=[Uy][Y|Uy:X1][X1|Ux1][Ux1]>
model.draw()

[Figure: Inference Output]

data
     X1    Y
0     0    1
1     0    1
2     0    0
3     0    1
4     1    1
...  ...  ...
995   0    0
996   1    0
997   0    0
998   0    1
999   1    0

1000 rows × 2 columns

Counterfactual inference

First, import the classes that implement LPID and DC3ID:

from ctfzeros.imprecise_empirical import LPCC_imprecise_empirical, DCCC_imprecise_empirical

Set up the LPID inference engine with a perturbation $\epsilon=0.05$. Then calculate the probability of sufficiency $PS(X_1,Y)$:

infLPID = LPCC_imprecise_empirical(model, data, perturbation=0.05)
infLPID.prob_sufficiency("X1", "Y")
[7.724454227344187e-20, 0.4317658471578703]
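
For reference, the probability of sufficiency is commonly defined, following Pearl, as the probability that the effect would have occurred under the cause, given that neither was observed. In the formula below, $x$ and $y$ denote the "true" states of $X_1$ and $Y$, and $x'$ and $y'$ the complementary states; which concrete states the library assigns to these roles is not stated here, so this is given only as the standard textbook form:

$$
PS(X_1, Y) = P\left(Y_{X_1 = x} = y \mid X_1 = x',\ Y = y'\right)
$$

Since the exogenous distributions are only partially identified from the data, the query is reported as a lower and an upper bound rather than a single value.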

Similarly, with the divide-and-conquer approach (DC3ID):

infDC3ID = DCCC_imprecise_empirical(model, data, perturbation=0.05)
infDC3ID.prob_sufficiency("X1", "Y")
[0.0, 0.4317658471578494]
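
The two engines return bounds that differ only in the trailing digits. The following is a minimal sketch of how such intervals might be compared, using only the values printed above; the helper name `intervals_agree` and the tolerances are assumptions chosen for illustration and are not part of ctfzeros:

```python
import math

# Intervals copied from the outputs shown in this README.
lpid_interval = [7.724454227344187e-20, 0.4317658471578703]
dc3id_interval = [0.0, 0.4317658471578494]

def intervals_agree(a, b, rel_tol=1e-9, abs_tol=1e-9):
    """Return True when both bounds match up to the given tolerances."""
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )

# abs_tol matters for the lower bound: one value is exactly 0.0, the other
# is a tiny positive number, so a purely relative test would fail.
print(intervals_agree(lpid_interval, dc3id_interval))
```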

Instead of the interval, we can obtain the list of individual query results:

infDC3ID.set_interval_result(False)
infDC3ID.prob_sufficiency("X1", "Y")
[0.3702430867734918,
 0.2277520565325892,
 0.4317658471578494,
 0.2655972876838508,
 0.0,
 0.0,
 0.0,
 0.0]
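
In the outputs shown here, the interval reported earlier coincides with the smallest and largest of these individual values. The following is a minimal sketch of that reduction, assuming the interval is simply the minimum and maximum of the list; the helper name `to_interval` is hypothetical and not part of ctfzeros:

```python
# Individual results copied from the output shown in this README.
individual_results = [
    0.3702430867734918,
    0.2277520565325892,
    0.4317658471578494,
    0.2655972876838508,
    0.0,
    0.0,
    0.0,
    0.0,
]

def to_interval(values):
    """Collapse a list of point estimates into [lower, upper] bounds."""
    if not values:
        raise ValueError("at least one value is required")
    return [min(values), max(values)]

print(to_interval(individual_results))
```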

Finally, inference can be run with a reduced number of solutions (num_runs=5), which may yield only an approximation of the interval:

infDC3ID = DCCC_imprecise_empirical(model, data, perturbation=0.05, num_runs=5)
infDC3ID.set_interval_result(False)
infDC3ID.prob_sufficiency("X1", "Y")
[0.22775205653259195,
 0.2655972876838502,
 0.3702430867734911,
 0.4317658471578494,
 0.0]
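
One way to judge such an approximation is to check that its bounds fall inside the interval obtained without limiting the number of solutions. The following is a minimal sketch using only values printed in this README; the helper names `hull` and `is_contained` and the tolerance are assumptions for illustration, not part of ctfzeros:

```python
# Results obtained with num_runs=5, copied from the output shown in this README.
reduced_results = [
    0.22775205653259195,
    0.2655972876838502,
    0.3702430867734911,
    0.4317658471578494,
    0.0,
]

# DC3ID interval reported earlier, without limiting the number of solutions.
full_interval = [0.0, 0.4317658471578494]

def hull(values):
    """Return [min, max] of a non-empty list of point estimates."""
    return [min(values), max(values)]

def is_contained(inner, outer, tol=1e-12):
    """Return True when inner lies within outer, up to a small tolerance."""
    return outer[0] - tol <= inner[0] and inner[1] <= outer[1] + tol

reduced_interval = hull(reduced_results)
print(reduced_interval, is_contained(reduced_interval, full_interval))
```

In this particular run the reduced computation recovers the same bounds; this need not hold in general.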
