Description
Hey! I finally got around to playing with the examples you have here, and I noticed that you were using shap.kmeans
to get the background data. Since I typically use a random sample rather than kmeans (unless I am really trying to optimize run time), I just swapped
background_distribution = shap.kmeans(xtrain,10)
for
background_distribution = shap.sample(xtrain,10)
When I did this, all the adversarial results for SHAP seemed to fall apart for COMPAS: race is still the top SHAP feature 79% of the time in the test dataset for the adversarial model.
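For reference, here is a minimal sketch of how a "top SHAP feature" rate like the one above could be computed. It assumes the top feature for a row is the one with the largest absolute SHAP value, which may not match exactly how the repo's scripts rank features. The function name, the feature names, and the toy values are all made up for illustration; they are not COMPAS results.

```python
def top_feature_rate(shap_values, feature_names, target):
    """Fraction of rows where `target` has the largest absolute SHAP value.

    Assumption: "top feature" means largest |SHAP value| within a row.
    `shap_values` is any sequence of rows (lists or a 2D array).
    """
    target_idx = feature_names.index(target)
    hits = 0
    total = 0
    for row in shap_values:
        magnitudes = [abs(v) for v in row]
        # Index of the largest magnitude; ties go to the first feature.
        top_idx = max(range(len(magnitudes)), key=magnitudes.__getitem__)
        hits += int(top_idx == target_idx)
        total += 1
    return hits / total if total else 0.0


# Toy SHAP values (hypothetical, one row per test instance).
features = ["race", "age", "priors_count"]
toy_shap_values = [
    [0.40, -0.10, 0.05],
    [-0.30, 0.20, 0.10],
    [0.05, 0.25, -0.10],
    [0.50, 0.10, -0.45],
]

rate = top_feature_rate(toy_shap_values, features, "race")
print(f"race is the top feature in {rate:.0%} of rows")
```

Running the same function on the SHAP values produced with each background choice would give directly comparable numbers.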
This very strong dependence on kmeans surprised me, since it seems to imply that SHAP is much more robust to these adversarial attacks when using a typical random background sample. Have you noticed this before, or do you have any thoughts on it? I think it is worth pointing out, but I wanted to get your feedback before suggesting to users that a random sample provides better adversarial robustness.
Thanks!