Weights of Evidence #4

@alexjwang

Description

Describe the encoding method below. Attach any relevant links that reference the encoding method.
Weights of Evidence (WoE) measures the predictive power of an independent variable with respect to a binary dependent variable through the formula: $$\text{WoE} = \ln{\frac{\text{Distribution of non-events}}{\text{Distribution of events}}}.$$
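As a quick illustration of the formula on toy data (the variable names below are just for this sketch), the per-category WoE can be computed directly:

```python
import math
from collections import Counter

# Toy data: a categorical column and a binary target (1 = event, 0 = non-event).
cats = ["A", "A", "B", "B", "B", "C", "C", "C"]
y = [1, 0, 0, 0, 1, 1, 1, 0]

events = Counter(c for c, t in zip(cats, y) if t == 1)
non_events = Counter(c for c, t in zip(cats, y) if t == 0)
n_events = sum(y)
n_non_events = len(y) - n_events

woe = {}
for c in set(cats):
    # Distribution of non-events over distribution of events, per category.
    dist_non = non_events[c] / n_non_events
    dist_evt = events[c] / n_events
    woe[c] = math.log(dist_non / dist_evt)

# "B" has relatively more non-events than events, so its WoE is positive;
# "C" is the mirror image, and "A" is balanced (WoE of 0).
```

Be aware that some implementations flip the ratio (events over non-events), which only changes the sign of the result; the sketch above follows the formula as written here.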

WoE is especially useful in certain cases because categories with similar WoE values carry similar evidence about the target, which can help the accuracy of a machine learning algorithm.

Read more about WoE here.

Describe the encoder class method. Any additional functions aside from the essential fit(), transform(), and get_features()?
None for now. Additional functions may be needed to integrate with feature calculation.
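A minimal sketch of what such a class might look like in plain Python (the class name, the regularization parameter, and the 0.0 fallback for unseen categories are assumptions, not a final API):

```python
import math
from collections import Counter

class WOEEncoder:
    """Sketch of a WoE encoder exposing fit(), transform(), and get_features()."""

    def __init__(self, regularization=1.0):
        # Additive smoothing so a category with zero events or zero
        # non-events does not produce log(0) or a division by zero.
        self.regularization = regularization
        self.mapping = {}

    def fit(self, column, target):
        events = Counter(c for c, t in zip(column, target) if t == 1)
        non_events = Counter(c for c, t in zip(column, target) if t == 0)
        n_evt = sum(target)
        n_non = len(target) - n_evt
        r = self.regularization
        for c in set(column):
            dist_non = (non_events[c] + r) / (n_non + 2 * r)
            dist_evt = (events[c] + r) / (n_evt + 2 * r)
            self.mapping[c] = math.log(dist_non / dist_evt)
        return self

    def transform(self, column):
        # Unseen categories fall back to 0.0 (no evidence either way).
        return [self.mapping.get(c, 0.0) for c in column]

    def get_features(self):
        return list(self.mapping)
```

The 0.0 fallback is one reasonable choice for unseen categories; it corresponds to a category whose event and non-event distributions are equal.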

Describe the encoder primitive for use with Featuretools.
The encoder passes its learned mapping to the primitive, which then encodes the column of categoricals.
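As a rough illustration of that flow (not the actual Featuretools primitive API — `make_woe_primitive` and its unseen-category fallback are hypothetical), the primitive can be sketched as a function that captures the learned mapping and encodes a whole column:

```python
def make_woe_primitive(mapping, unseen=0.0):
    """Build a column-transform function from a learned WoE mapping.

    Stand-in for a Featuretools transform primitive: the mapping is
    captured at construction time, and the returned function encodes
    an entire column of categoricals.
    """
    def encode(column):
        return [mapping.get(value, unseen) for value in column]
    return encode

primitive = make_woe_primitive({"good": -0.69, "bad": 0.69})
primitive(["good", "bad", "unknown"])  # -> [-0.69, 0.69, 0.0]
```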

Describe the use cases in which this encoder would be useful (what kinds of data, high-cardinality, etc.).
WoE was originally created for use in credit fraud detection. It is particularly well suited to binary targets (e.g. "good" and "bad" statuses).

Input type?
A categorical column; possible parameters include sigma and regularization.

Output type?
Numeric

List third party libraries required:
category_encoders

Describe encoding method's behavior with train, test, and new data.
Similar to other Bayesian encoders: fit on the training set, then transform test and new data with the learned mappings. Categories unseen during fitting need a fallback value.

Test cases.
Handling of missing values (np.nan) in the input column.
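A sketch of this test case (the `woe_transform` helper and the 0.0 fallback are assumptions; `float("nan")` stands in for np.nan): since NaN never compares equal to itself, a fresh NaN misses any dictionary lookup, so one reasonable behavior is to encode it with the same neutral fallback as an unseen category.

```python
import math

def woe_transform(column, mapping, fallback=0.0):
    # NaN never equals itself, so a dict lookup on a fresh NaN misses;
    # treat it explicitly as missing and encode it with the fallback.
    out = []
    for value in column:
        if isinstance(value, float) and math.isnan(value):
            out.append(fallback)
        else:
            out.append(mapping.get(value, fallback))
    return out

# A NaN in the input should not raise and should map to the fallback.
result = woe_transform(["A", float("nan"), "B"], {"A": 0.5, "B": -0.5})
assert result == [0.5, 0.0, -0.5]
```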

Metadata

Labels: New Method Idea (proposal for new encoding method)
