This benchmark framework evaluates the performance of various Conditional Independence (CI) tests, primarily those in pgmpy. CI tests may perform differently depending on the data-generating mechanism (DGM), sample size, variable types, effect size, and the complexity of the conditioning set. algo-benchmarks helps users and developers:
- Compare CI tests under standardized, reproducible settings.
- Select the best CI test for their data and use case.
- Contribute new tests or data-generating mechanisms for further comparison.
Benchmark results are saved to CSV files, which can be used to generate plots or for further analysis.
- Python 3.8+
- Clone this repository (`algo-benchmarks`)
- Install pgmpy (either the latest release or in editable mode)
Install required dependencies:
```
pip install -r requirements.txt
```

If you want to use a development version of pgmpy:
```
git clone https://github.com/pgmpy/pgmpy.git
cd pgmpy
pip install -e .[tests]
```

Return to your `algo-benchmarks` directory before running benchmarks.
From the root directory of algo-benchmarks, run:
```
python -m PY_Scripts.CI_Benchmarks
```

This will:
- Run each CI test on each DGM for various sample sizes, conditioning set sizes, and effect sizes.
- Output detailed and summary CSV files (`ci_benchmark_raw_result.csv`, `ci_benchmark_summaries.csv`).
To add your own DGM:
- Define a function in `PY_Scripts/data_generating_mechanisms.py`:

  ```python
  def my_custom_dgm(n_samples, effect_size=1.0, n_cond_vars=1, seed=None, dependent=True):
      # return a pandas.DataFrame with columns like ['X', 'Y', 'Z1', ...]
      ...
  ```
- Register it in the DGM registry in that file:

  ```python
  DGP_REGISTRY["my_custom"] = my_custom_dgm
  ```
- Add `"my_custom"` to the `DGM_TO_CITESTS` mapping in `CI_Benchmarks.py` if you want it benchmarked.
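Putting the steps above together, a complete DGM might look like the sketch below. The signature and column names follow the template above; the linear-Gaussian mechanism itself is only an illustration, not the repo's actual code:

```python
import numpy as np
import pandas as pd

def my_custom_dgm(n_samples, effect_size=1.0, n_cond_vars=1, seed=None, dependent=True):
    """Illustrative linear-Gaussian DGM: X and Y both load on the conditioning
    variables Z1..Zk, and Y additionally depends on X iff `dependent` is True."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n_samples, n_cond_vars))
    X = Z.sum(axis=1) + rng.normal(size=n_samples)
    Y = Z.sum(axis=1) + rng.normal(size=n_samples)
    if dependent:
        Y = Y + effect_size * X  # X -> Y edge under the alternative
    cols = {"X": X, "Y": Y}
    cols.update({f"Z{i + 1}": Z[:, i] for i in range(n_cond_vars)})
    return pd.DataFrame(cols)
```

With `dependent=False` (the null), X and Y are conditionally independent given the Z columns; with `dependent=True`, `effect_size` controls how strong the X→Y dependence is.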
You get two main files after running the benchmark:
- `ci_benchmark_raw_result.csv`: All individual benchmark runs.
- `ci_benchmark_summaries.csv`: Aggregated summary statistics.
Columns in `ci_benchmark_raw_result.csv`:

| Column | Description |
|---|---|
| dgm | Data Generating Mechanism used |
| sample_size | Number of samples |
| n_cond_vars | Number of conditioning variables |
| effect_size | Numeric effect size (0 = null, >0 = alt) |
| repeat | Repetition index |
| ci_test | CI test used (e.g., pearsonr, gcm, pillai) |
| dependent | True if X and Y are dependent, False otherwise |
| p_value | The test's p-value |
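The summary metrics are plain aggregations of the raw p-values: type I error is the rejection rate over null runs, type II error the non-rejection rate over alternative runs. A small illustration with synthetic rows standing in for the raw CSV (the values are made up, not actual benchmark output):

```python
import pandas as pd

# Synthetic stand-in for a slice of ci_benchmark_raw_result.csv
raw = pd.DataFrame({
    "dependent": [False, False, False, True, True, True],
    "p_value":   [0.80,  0.03,  0.55,  0.01, 0.20, 0.004],
})

alpha = 0.05  # significance_level
null_runs = raw[~raw["dependent"]]
alt_runs = raw[raw["dependent"]]

type1_error = (null_runs["p_value"] < alpha).mean()   # false positive rate
type2_error = (alt_runs["p_value"] >= alpha).mean()   # false negative rate
power = 1 - type2_error

print(type1_error, type2_error, power)
```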
Columns in `ci_benchmark_summaries.csv`:

| Column | Description |
|---|---|
| dgm | Data Generating Mechanism used |
| sample_size | Number of samples |
| n_cond_vars | Number of conditioning variables |
| effect_size | Effect size |
| ci_test | CI test used |
| significance_level | Significance threshold used |
| type1_error | False positive rate |
| type2_error | False negative rate |
| power | 1 - type2_error |
| N_null | Number of null runs |
| N_alt | Number of alt runs |
- Edit `PY_Scripts/data_generating_mechanisms.py` and define your function.
- Add it to the `DGP_REGISTRY` dictionary.
- Optionally, add it to the `DGM_TO_CITESTS` mapping in `CI_Benchmarks.py`.
- Implement your test as a function (compatible with pgmpy’s CI test callable signature).
- Register it in the `ci_tests` dictionary in `CI_Benchmarks.py`.
- Add it to the list for relevant DGMs in `DGM_TO_CITESTS`.
You can create plots from the summary CSV using pandas, matplotlib, or seaborn.
See the plotting functions in CI_Benchmarks.py for example usage.
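As a starting point, a power-versus-sample-size plot can be built directly from the summary columns. The sketch below uses a small synthetic DataFrame standing in for `pd.read_csv("ci_benchmark_summaries.csv")` (the numbers are made up):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in for pd.read_csv("ci_benchmark_summaries.csv")
summary = pd.DataFrame({
    "ci_test":     ["pearsonr"] * 3 + ["gcm"] * 3,
    "sample_size": [100, 500, 1000] * 2,
    "power":       [0.42, 0.80, 0.95, 0.35, 0.71, 0.90],
})

fig, ax = plt.subplots()
for name, grp in summary.groupby("ci_test"):
    ax.plot(grp["sample_size"], grp["power"], marker="o", label=name)
ax.set_xlabel("sample_size")
ax.set_ylabel("power")
ax.legend(title="ci_test")
fig.savefig("power_vs_sample_size.png")
```

The same pattern works for `type1_error` or runtime columns; filter on `dgm`, `n_cond_vars`, or `effect_size` first to compare tests under one setting at a time.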
- Please add tests for any new functionality.
- Follow the code style used in pgmpy and this repo.
- Document any new DGMs or CI tests in this file.
- pgmpy documentation
- pgmpy/pgmpy#2150
- Relevant academic papers as needed.