after installing `requirements.txt`; otherwise, the algorithm will run slower. However, this is not supported on macOS and may fail on some Windows devices.
To reproduce the experiments in the paper, run ```experiments/run_folktables.py``` with the dataset name, algorithm name and hyperparameters as command line arguments, like below:
41
-
```run_folktables.py --algorithm sslalm --state OK --task income --constraint loss --loss_bound 0.005 --num_exp 10 --time 30 --batch_size 8 -mu 2. -rho 1. -tau 0.01 -eta 5e-2 -beta 0.5```
42
-
This will start 10 runs of the SSL-ALM algorithm, 30 seconds each, and save the model and the results in the ```experiments/utils/saved_models``` and ```experiments/utils/exp_results``` folders.
43
54
44
55
The benchmark comprises the following algorithms:
45
56
- Stochastic Ghost [[2]](#2),
46
57
- SSL-ALM [[3]](#3),
47
58
- Stochastic Switching Subgradient [[4]](#4).
48
59
60
+
To reproduce the experiments of the paper, run the following:
```
python run_folktables.py data=folktables alg=sgd # baseline, no fairness
python run_folktables.py data=folktables alg=fairret # baseline, fairness with regularizer
```
70
+
Each command starts 10 runs of the selected `alg`, 30 seconds each.
71
+
The results will be saved to `experiments/utils/saved_models` and `experiments/utils/exp_results`.
72
+
<!-- In the repository, we include the configuration needed to reproduce the experiments in the paper. To do so, go to `experiments` and run `python run_folktables.py data=folktables alg=sslalm`. -->
73
+
<!-- Repeat for the other algorithms by changing the `alg` parameter. -->
74
+
75
+
This repository uses [Hydra](https://hydra.cc/) to manage parameters; see `experiments/conf` for configuration files.
76
+
* To change experiment parameters, such as the number of runs per algorithm, the run time, or the dataset (*note: only Folktables is supported for now*), use `experiment.yaml`.
77
+
* To change dataset settings, such as the file location, or to make dataset-specific adjustments, use `data/{dataset_name}.yaml`.
78
+
* To change algorithm hyperparameters, use `alg/{algorithm_name}.yaml`.
79
+
* To change constraint hyperparameters, use `constraint/{constraint_name}.yaml`.
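
As a rough illustration of the layout above (not Hydra's actual implementation), the sketch below maps a command-line override such as `alg=sslalm` to the file it would select under `experiments/conf`. The helper name `config_file_for` is hypothetical, and the group names are simply the ones listed above.

```python
from pathlib import PurePosixPath

# Config root named in this README.
CONF_ROOT = PurePosixPath("experiments/conf")

# Config groups listed above; each option lives in <group>/<option>.yaml.
GROUPS = {"data", "alg", "constraint"}


def config_file_for(override: str) -> PurePosixPath:
    """Return the YAML file a `group=option` override would select.

    Hypothetical helper for illustration only; Hydra performs this lookup itself.
    """
    group, sep, option = override.partition("=")
    if not sep or group not in GROUPS or not option:
        raise ValueError(f"expected '<group>=<option>' with group in {sorted(GROUPS)}")
    return CONF_ROOT / group / f"{option}.yaml"


print(config_file_for("alg=sslalm"))       # experiments/conf/alg/sslalm.yaml
print(config_file_for("data=folktables"))  # experiments/conf/data/folktables.yaml
```

Editing the selected file changes the defaults for every run, whereas passing the override on the command line only affects that invocation.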
80
+
81
+
<!-- ; it is installed as one of the dependencies. -->
82
+
<!-- To learn more about using Hydra, please check out the [official tutorial](https://hydra.cc/docs/tutorials/basic/your_first_app). -->
83
+
49
84
### Producing plots
50
85
Plots and tables like those in the paper can be produced with two notebooks: `experiments/algo_plots.ipynb` contains the convergence plots, and `experiments/model_plots.ipynb` contains all the others.
51
86
52
-
**Warning**: As of 16/05, Folktables seems to be unable to connect to the US Census servers, so downloading the dataset through the code is not possible. Manual download requires two files: the .csv dataset, at https://www2.census.gov/programs-surveys/acs/data/pums/`{year}`/`{horizon}`, and the corresponding .csv description, at https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/; then run with the flag ```--no-download```. By default, the files will be placed in `experiments/utils/raw_data/{task}/{year}/{horizon}` (e.g. `experiments/utils/raw_data/income/2018/1-Year/{filename}.csv`). A custom path can be specified with the `--data_path` argument, but it must have the form `*/{year}/{horizon}/`.
87
+
## Extending the benchmark
53
88
54
-
## Extending the benchmark
89
+
**To add a new algorithm**, you can subclass the ```Algorithm``` class. Before you can run it, you will need to follow these steps:
90
+
1. In the `experiments/conf/alg` folder, add a `.yaml` file with `import_name: {ClassName}` (so the code knows which algorithm to import) and the desired keyword parameter values under `params`:
55
91
56
-
To add a different constraint formulation, you can use the ```FairnessConstraint``` class by passing your callable function to the constructor as ```fn```.
57
-
To add a new algorithm, you can subclass the ```Algorithm``` class.
92
+
```
import_name: ClassName

params:
  param_name_1: value
  param_name_2: value
```
99
+
100
+
2. In `src/algorithms/__init__.py`, add `from .{filename} import {ClassName}` (so the code is able to import it).
101
+
102
+
Now you can run the algorithm by executing `python run_folktables.py data=folktables alg={yaml_file_name}`, or by changing the experiment config files.
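
The two steps above can be sketched as follows. This is a self-contained illustration under stated assumptions: the `Algorithm` base class here is a stand-in with a hypothetical `step` method (the real class in `src/algorithms` may expose a different interface), `MyAlgorithm` and `build_algorithm` are hypothetical names, and the config is written as a plain dict instead of being loaded by Hydra.

```python
from types import SimpleNamespace


class Algorithm:
    """Stand-in for the repository's base class (interface assumed)."""

    def step(self, x: float, grad: float) -> float:
        raise NotImplementedError


class MyAlgorithm(Algorithm):
    """Hypothetical new algorithm: plain gradient descent with a fixed step size."""

    def __init__(self, lr: float = 0.1):
        self.lr = lr

    def step(self, x: float, grad: float) -> float:
        return x - self.lr * grad


def build_algorithm(cfg: dict, namespace) -> Algorithm:
    """Look up `import_name` in `namespace` and pass `params` as keyword arguments."""
    cls = getattr(namespace, cfg["import_name"])
    if not (isinstance(cls, type) and issubclass(cls, Algorithm)):
        raise TypeError(f"{cfg['import_name']} is not an Algorithm subclass")
    return cls(**cfg.get("params", {}))


# Dict equivalent of a YAML file with `import_name` and `params`.
cfg = {"import_name": "MyAlgorithm", "params": {"lr": 0.5}}

# Stand-in for the algorithms package namespace; a lookup by name like this is
# presumably why the class has to be imported in `__init__.py`.
algorithms = SimpleNamespace(MyAlgorithm=MyAlgorithm)

algo = build_algorithm(cfg, algorithms)
print(algo.step(x=1.0, grad=2.0))  # 0.0
```

If the name in `import_name` is missing from the namespace, `getattr` raises `AttributeError`, which is the kind of failure to expect when step 2 is skipped.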
103
+
104
+
**To add a different constraint formulation**, you can use the `FairnessConstraint` class by passing your callable function to the constructor as `fn`. If you use `run_folktables.py`, you can add a new constraint function by following these steps:
105
+
106
+
1. In the `experiments/conf/constraint` folder, add a `.yaml` file with `import_name: {FunctionName}`, along with the desired batch size and bound (*to be reworked for more generality*).
107
+
2. Import it in `src/constraints/__init__.py` as in step 2 above.
108
+
109
+
Now, to run the code with your constraint, use the `constraint` field in the main config.
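
A minimal sketch of what such a callable could look like is given below. Everything except the `fn` argument name is an assumption: the `FairnessConstraint` class here is a stand-in (the repository's constructor may take other arguments), and `loss_gap`, `bound`, and `violation` are hypothetical names chosen for illustration.

```python
from statistics import fmean
from typing import Callable, Sequence


def loss_gap(losses: Sequence[float], groups: Sequence[int]) -> float:
    """Hypothetical constraint function: absolute gap in mean loss between groups 0 and 1."""
    g0 = [loss for loss, g in zip(losses, groups) if g == 0]
    g1 = [loss for loss, g in zip(losses, groups) if g == 1]
    if not g0 or not g1:
        raise ValueError("both groups must be present in the batch")
    return abs(fmean(g0) - fmean(g1))


class FairnessConstraint:
    """Stand-in for the repository class; only the `fn` argument comes from the README."""

    def __init__(self, fn: Callable[..., float], bound: float):
        self.fn = fn
        self.bound = bound

    def violation(self, *args) -> float:
        # Positive when the constraint fn(...) <= bound is violated.
        return self.fn(*args) - self.bound


constraint = FairnessConstraint(fn=loss_gap, bound=0.005)
losses = [0.25, 0.75, 0.5, 0.5]  # per-sample losses
groups = [0, 0, 1, 1]            # subgroup label of each sample
print(constraint.violation(losses, groups))  # -0.005, so the constraint is satisfied
```

Keeping the constraint as a plain function of batch quantities means the same function can be referenced by name from a config file and passed as `fn` in code.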
58
110
59
111
## License and terms of use
60
112
@@ -80,6 +132,11 @@ For more information, see https://www.census.gov/data/developers/about/terms-of-
80
132
<!-- } -->
81
133
<!-- ``` -->
82
134
135
+
## Future work
136
+
137
+
- Add support for fairness constraints with >=2 subgroups (limitation of the code, not of the algorithms)
138
+
- Add support for datasets besides Folktables
139
+
- Move towards a more PyTorch-like API for optimizers