disparate impact remover has no effect on disparate impact metric #547

Open
@JorritMontijn

First of all, thank you for all the work that has gone into this package! I hope you can help me with the following problem. I'm using the R interface, and after some initial trouble getting it set up (the default installation has incompatible versions of Python and TensorFlow), I can now access the AIF360 functions. However, either the documentation is unclear about how to use it, or the function disparate_impact_remover is broken.

In the following code block I'm repairing some data, but the data with full repair is identical to the data without repair:

```r
library(aif360)
load_aif360_lib()

ad <- adult_dataset()
p <- list("race", 1)
u <- list("race", 0)

# subselect a few columns
pd_conv <- ad$convert_to_dataframe()
data0 <- pd_conv[[1]]
data0sub <- data0[, c("race", "age", "sex", "income-per-year")]

# turn into an AIF360 dataset
aif_df <- binary_label_dataset(
  data_path = data0sub,
  favor_label = 1, unfavor_label = 0,
  unprivileged_protected_attribute = 1,
  privileged_protected_attribute = 0,
  target_column = "income-per-year", protected_attribute = "race")

# repair: full repair (repair_level = 1.0)
di1 <- disparate_impact_remover(repair_level = 1.0, sensitive_attribute = "race")
rp1 <- di1$fit_transform(aif_df)

# repair: no repair (repair_level = 0)
di2 <- disparate_impact_remover(repair_level = 0, sensitive_attribute = "race")
rp2 <- di2$fit_transform(aif_df)

# disparate impact after full repair
bm1 <- binary_label_dataset_metric(rp1, list("race", 1), list("race", 0))
fl_disparate_impact1 <- bm1$disparate_impact()

# disparate impact after no repair
bm2 <- binary_label_dataset_metric(rp2, list("race", 1), list("race", 0))
fl_disparate_impact2 <- bm2$disparate_impact()
```

```
> fl_disparate_impact1
[1] 0.6037688
> fl_disparate_impact2
[1] 0.6037688
```

Note that the subselection isn't strictly necessary, but I wanted to rule out any error in converting between R data frames and the AIF360 format, since I first noticed this problem in my own data set.
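For reference, here is a minimal base-R sketch of the quantity being compared, assuming the usual definition of disparate impact as the ratio of favorable-outcome rates between the unprivileged and privileged groups. The `toy` data frame and the helper `disparate_impact_ratio()` are hypothetical and invented for illustration only; they are not part of the AIF360 API, and the numbers are not taken from the adult data set.

```r
# Toy data, invented for illustration (not from the adult data set).
toy <- data.frame(
  race   = c(1, 1, 1, 1, 0, 0, 0, 0),  # protected attribute
  income = c(1, 1, 1, 0, 1, 0, 0, 0)   # binary label, 1 = favorable
)

# Hypothetical helper: ratio of favorable-label rates,
# unprivileged group over privileged group.
disparate_impact_ratio <- function(df, attr, label,
                                   privileged = 1, unprivileged = 0,
                                   favorable = 1) {
  favorable_rate <- function(group) {
    mean(df[[label]][df[[attr]] == group] == favorable)
  }
  favorable_rate(unprivileged) / favorable_rate(privileged)
}

di <- disparate_impact_ratio(toy, "race", "income")
print(di)
```

Under this definition the ratio is computed only from the label column and the protected attribute column, so it can serve as an independent cross-check on the values that `bm1$disparate_impact()` and `bm2$disparate_impact()` return.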

So my question is: am I doing something wrong, or are these functions broken?

Thank you in advance!
