Extend one-hot encoding functionalities #718

[sklearn.preprocessing.OneHotEncoder](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing) exposes parameters such as `drop` and `handle_unknown`, which are useful for avoiding overparametrisation (e.g. a redundant column for binary variables, or the dummy variable trap in linear regression) and for handling unknown values.
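To make the effect of these parameters concrete, here is a minimal sketch of how `drop` and `handle_unknown` change the encoded output. The toy data and variable names are made up for illustration; only the `OneHotEncoder` parameters themselves come from scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy data (illustrative only): one 3-category feature and one binary feature.
X_train = np.array([["red", "yes"], ["green", "no"], ["blue", "yes"]])

full = OneHotEncoder().fit(X_train)                       # one column per category
reduced = OneHotEncoder(drop="first").fit(X_train)        # first category of each feature dropped
binary_only = OneHotEncoder(drop="if_binary").fit(X_train)  # only binary features lose a column
tolerant = OneHotEncoder(handle_unknown="ignore").fit(X_train)

n_full = full.transform(X_train).shape[1]         # 3 + 2 columns
n_reduced = reduced.transform(X_train).shape[1]   # 2 + 1 columns
n_binary = binary_only.transform(X_train).shape[1]  # 3 + 1 columns

# An unseen category is encoded as an all-zero block instead of raising an error.
X_new = np.array([["purple", "yes"]])
unknown_row = tolerant.transform(X_new).toarray()[0]
```

Each configuration yields a different column layout for the same data, which is why code that assumes one column per category can misread the result.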

Some explainers already know how to deal with one-hot encoded features, for example [AnchorTabular](https://github.com/SeldonIO/alibi/blob/e790f52a2b19236cb3beb8be90962f2116057f8d/alibi/explainers/anchors/anchor_tabular.py#L589) and [Counterfactuals](https://github.com/SeldonIO/alibi/blob/e790f52a2b19236cb3beb8be90962f2116057f8d/alibi/explainers/cfproto.py#L47). Some utility functions used by those explainers are available [here](https://github.com/SeldonIO/alibi/blob/f0aed3a4970e3c39e8ca3b91fbc2e7b13cb6bb65/alibi/utils/mapping.py#L86). The limitation of the `ohe` flag (i.e. passing a one-hot encoded dataset) is that the explainers don't know how to deal with the cases mentioned above.
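As an illustration of why a reduced encoding is a problem, the sketch below decodes one-hot blocks back to ordinal values by taking the argmax over each feature's columns. This is an assumption about how such decoding might work, not a copy of the linked utilities; `ohe_to_ordinal` here is a hypothetical helper written for this example.

```python
import numpy as np

def ohe_to_ordinal(X_ohe, n_columns):
    """Hypothetical decoder: argmax over each feature's block of columns.

    Assumes every category has its own column (full representation).
    """
    decoded = []
    start = 0
    for n in n_columns:
        decoded.append(X_ohe[:, start:start + n].argmax(axis=1))
        start += n
    return np.stack(decoded, axis=1)

# One feature with three categories; the rows are categories 0, 1 and 2.
X_full = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
# The same data as drop="first" would encode it: the first column is missing.
X_dropped = X_full[:, 1:]

decoded_full = ohe_to_ordinal(X_full, [3])
decoded_dropped = ohe_to_ordinal(X_dropped, [2])
```

With the full representation the three rows decode to three distinct values. With the dropped column, the all-zero row for category 0 and the row for category 1 both decode to index 0, so two different categories become indistinguishable.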

It would be good to see whether we can extend our explainers and utility functions to handle all the arguments available in [sklearn.preprocessing.OneHotEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html). Otherwise, our documentation should state explicitly which one-hot encoding "format" is expected (i.e. full representation only).
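If only the full representation ends up being supported, a lightweight check could complement the documentation. The following is a sketch under that assumption: `is_full_representation` is a hypothetical helper, not part of alibi, and it only inspects the `drop` and `handle_unknown` constructor parameters, so other options that alter the layout are not covered.

```python
from sklearn.preprocessing import OneHotEncoder

def is_full_representation(encoder):
    """Hypothetical check: True if the encoder keeps one column per category
    and raises on unknown values (the scikit-learn defaults)."""
    return encoder.drop is None and encoder.handle_unknown == "error"

default_ok = is_full_representation(OneHotEncoder())
dropped_ok = is_full_representation(OneHotEncoder(drop="first"))
ignore_ok = is_full_representation(OneHotEncoder(handle_unknown="ignore"))
```

An explainer could run a check like this on a user-supplied encoder and warn early, rather than silently producing explanations from a misread encoding.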
