Weights of Evidence #4

@alexjwang

Description

Describe the encoding method below. Attach any relevant links that reference the encoding method.
Weights of Evidence (WoE) measures the predictive power of an independent variable with respect to a binary dependent variable through the formula: $$\text{WoE} = \ln{\frac{\text{Distribution of non-events}}{\text{Distribution of events}}}.$$
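As a quick illustration of the formula on toy data (the variable names below are just for this sketch), the per-category WoE can be computed directly:

```python
import math
from collections import Counter

# Toy data: a categorical column and a binary target (1 = event, 0 = non-event).
cats = ["A", "A", "B", "B", "B", "C", "C", "C"]
y = [1, 0, 0, 0, 1, 1, 1, 0]

events = Counter(c for c, t in zip(cats, y) if t == 1)
non_events = Counter(c for c, t in zip(cats, y) if t == 0)
n_events = sum(y)
n_non_events = len(y) - n_events

woe = {}
for c in set(cats):
    # Distribution of non-events over distribution of events, per category.
    dist_non = non_events[c] / n_non_events
    dist_evt = events[c] / n_events
    woe[c] = math.log(dist_non / dist_evt)

# "B" has relatively more non-events than events, so its WoE is positive;
# "C" is the mirror image, and "A" is balanced (WoE of 0).
```

Be aware that some implementations flip the ratio (events over non-events), which only changes the sign of the result; the sketch above follows the formula as written here.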

WoE is especially useful in certain cases because categories with similar WoE values carry similar evidence about the target, which can help the accuracy of a machine learning algorithm.

Read more about WoE here.

Describe the encoder class method. Any additional functions aside from the essential fit(), transform(), and get_features()?
None for now. Additional functions may be needed to integrate with feature calculation.
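A minimal sketch of what such a class might look like in plain Python (the class name, the regularization parameter, and the 0.0 fallback for unseen categories are assumptions, not a final API):

```python
import math
from collections import Counter

class WOEEncoder:
    """Sketch of a WoE encoder exposing fit(), transform(), and get_features()."""

    def __init__(self, regularization=1.0):
        # Additive smoothing so a category with zero events or zero
        # non-events does not produce log(0) or a division by zero.
        self.regularization = regularization
        self.mapping = {}

    def fit(self, column, target):
        events = Counter(c for c, t in zip(column, target) if t == 1)
        non_events = Counter(c for c, t in zip(column, target) if t == 0)
        n_evt = sum(target)
        n_non = len(target) - n_evt
        r = self.regularization
        for c in set(column):
            dist_non = (non_events[c] + r) / (n_non + 2 * r)
            dist_evt = (events[c] + r) / (n_evt + 2 * r)
            self.mapping[c] = math.log(dist_non / dist_evt)
        return self

    def transform(self, column):
        # Unseen categories fall back to 0.0 (no evidence either way).
        return [self.mapping.get(c, 0.0) for c in column]

    def get_features(self):
        return list(self.mapping)
```

The 0.0 fallback is one reasonable choice for unseen categories; it corresponds to a category whose event and non-event distributions are equal.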

Describe the encoder primitive for use with Featuretools.
The encoder passes its learned mapping to the primitive, which then encodes the column of categoricals.
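As a rough illustration of that flow (not the actual Featuretools primitive API — `make_woe_primitive` and its unseen-category fallback are hypothetical), the primitive can be sketched as a function that captures the learned mapping and encodes a whole column:

```python
def make_woe_primitive(mapping, unseen=0.0):
    """Build a column-transform function from a learned WoE mapping.

    Stand-in for a Featuretools transform primitive: the mapping is
    captured at construction time, and the returned function encodes
    an entire column of categoricals.
    """
    def encode(column):
        return [mapping.get(value, unseen) for value in column]
    return encode

primitive = make_woe_primitive({"good": -0.69, "bad": 0.69})
primitive(["good", "bad", "unknown"])  # -> [-0.69, 0.69, 0.0]
```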

Describe the use cases in which this encoder would be useful (what kinds of data, high-cardinality, etc.).
WoE was originally created for use in credit fraud detection. It is particularly well suited to binary targets (e.g. "good" and "bad" statuses).

Input type?
A categorical column; possible parameters include sigma and regularization.

Output type?
Numeric

List third party libraries required:
category_encoders

Describe encoding method's behavior with train, test, and new data.
Similar to other Bayesian encoders: fit on the training set, then transform test and new data with the learned mappings. Categories unseen during fitting need a fallback value.

Test cases.
Handling of missing values (np.nan) in the input column.
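A sketch of this test case (the `woe_transform` helper and the 0.0 fallback are assumptions; `float("nan")` stands in for np.nan): since NaN never compares equal to itself, a fresh NaN misses any dictionary lookup, so one reasonable behavior is to encode it with the same neutral fallback as an unseen category.

```python
import math

def woe_transform(column, mapping, fallback=0.0):
    # NaN never equals itself, so a dict lookup on a fresh NaN misses;
    # treat it explicitly as missing and encode it with the fallback.
    out = []
    for value in column:
        if isinstance(value, float) and math.isnan(value):
            out.append(fallback)
        else:
            out.append(mapping.get(value, fallback))
    return out

# A NaN in the input should not raise and should map to the fallback.
result = woe_transform(["A", float("nan"), "B"], {"A": 0.5, "B": -0.5})
assert result == [0.5, 0.0, -0.5]
```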

Metadata

Labels: New Method Idea (proposal for new encoding method)
