
Commit 7a1e283

feature tags (#625)
1 parent 47c29a5 commit 7a1e283

File tree

5 files changed (+9 −5 lines)


docs/user-guide/feature-selection.md

+3-1
@@ -2,6 +2,8 @@
 
 ## Maximum Relevance Minimum Redundancy
 
+!!! info "New in version 0.8.0"
+
 The [`Maximum Relevance Minimum Redundancy`][MaximumRelevanceMinimumRedundancy-api] (MRMR) is an iterative feature selection method commonly used in data science to select a subset of features from a larger feature set. The goal of MRMR is to choose features that have high *relevance* to the target variable while minimizing *redundancy* among the already selected features.
 
 MRMR is heavily dependent on the two functions used to determine relevance and redundancy. However, the paper [Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform](https://arxiv.org/pdf/1908.05376.pdf) shows that using [f_classif](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_classif.html) or [f_regression](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.f_regression.html) as relevance function and Pearson correlation as redundancy function is the best choice for a variety of different problems and in general is a good choice.

@@ -57,7 +59,7 @@ Feature selection method: mrmr_smile
 F1 score: 0.849
 ```
 
-The MRMR feature selection model provides better results compared against the other methods, although the smile technique performs rather good as well.
+The MRMR feature selection model provides better results compared against the other methods, although the smile technique performs rather good as well.
 
 Finally, we can take a look at the selected features.
 
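The greedy scheme this doc describes — repeatedly pick the feature with the highest relevance minus redundancy against the already selected set — can be sketched in a few lines of NumPy and scikit-learn. This is an illustrative re-implementation using `f_classif` as relevance and mean absolute Pearson correlation as redundancy, not scikit-lego's actual `MaximumRelevanceMinimumRedundancy` code; the helper name `mrmr_select` is made up for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif

def mrmr_select(X, y, k):
    """Greedy MRMR sketch: start from the most relevant feature, then add the
    candidate maximizing (relevance - mean |Pearson corr| with selected)."""
    n_features = X.shape[1]
    relevance, _ = f_classif(X, y)                # F-statistic per feature vs target
    corr = np.abs(np.corrcoef(X, rowvar=False))   # pairwise |Pearson| between features
    selected = [int(np.argmax(relevance))]
    candidates = [i for i in range(n_features) if i not in selected]
    while len(selected) < k:
        # redundancy of each candidate = mean |corr| with the already selected set
        redundancy = corr[np.ix_(candidates, selected)].mean(axis=1)
        scores = relevance[candidates] - redundancy
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)
    return selected

X, y = make_classification(n_samples=200, n_features=10, n_informative=4, random_state=0)
print(mrmr_select(X, y, k=4))
```

Relevance and redundancy are on different scales here (F-statistics vs correlations in [0, 1]), which is exactly why the choice of the two functions matters so much in practice.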

docs/user-guide/meta-models.md

+1-1
@@ -136,7 +136,7 @@ Note that these predictions seems to yield the lowest error but take it with a g
 
 ### Specialized Estimators
 
-!!! info "New in version 0.7.5"
+!!! info "New in version 0.8.0"
 
 Instead of using the generic `GroupedPredictor` directly, it is possible to work with _task specific_ estimators, namely: [`GroupedClassifier`][grouped-classifier-api] and [`GroupedRegressor`][grouped-regressor-api].
 
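The idea these grouped estimators wrap — fit one copy of an estimator per group, then route each prediction row to its group's model — can be illustrated by hand. This is a toy concept sketch with plain scikit-learn, not scikit-lego's `GroupedRegressor` implementation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data where each group follows a different linear relationship.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=100)                  # group label per row
X = rng.normal(size=(100, 1))
y = np.where(groups == 0, 3 * X[:, 0], -2 * X[:, 0])   # slope 3 vs slope -2

# One fitted model per group.
models = {g: LinearRegression().fit(X[groups == g], y[groups == g]) for g in (0, 1)}

def predict(X_new, groups_new):
    """Dispatch each row to the model fitted on its group."""
    out = np.empty(len(X_new))
    for g, model in models.items():
        mask = groups_new == g
        out[mask] = model.predict(X_new[mask])
    return out

print(predict(X[:5], groups[:5]))
```

A single global `LinearRegression` cannot represent both slopes at once; per-group models recover each relationship exactly, which is the motivation for the grouped meta-estimators.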

pyproject.toml

+1-1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "scikit-lego"
-version = "0.7.4"
+version = "0.8.0"
 description="A collection of lego bricks for scikit-learn pipelines"
 
 license = {file = "LICENSE"}

sklego/feature_selection/mrmr.py

+2
@@ -83,6 +83,8 @@ class MaximumRelevanceMinimumRedundancy(SelectorMixin, BaseEstimator):
 
 - np.ndarray, shape = (len(left), ), The array containing the redundancy score using the custom function.
 
+!!! info "New in version 0.8.0"
+
 Parameters
 ----------
 k : int
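The docstring above pins down the output contract for a custom redundancy function: an `np.ndarray` of shape `(len(left),)`. A minimal sketch of such a function follows; note the `(X, selected, left)` call signature is an assumption inferred from the docstring's parameter names, not the verified scikit-lego API, and `pearson_redundancy` is a made-up name.

```python
import numpy as np

def pearson_redundancy(X, selected, left):
    """Hypothetical custom redundancy function: for each candidate column index
    in `left`, return the mean absolute Pearson correlation with the columns
    already in `selected`. Output shape (len(left),) matches the documented
    contract above."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    return corr[np.ix_(left, selected)].mean(axis=1)

X = np.random.default_rng(0).normal(size=(50, 5))
print(pearson_redundancy(X, selected=[0, 1], left=[2, 3, 4]).shape)  # prints (3,)
```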

sklego/meta/grouped_predictor.py

+2-2
@@ -397,7 +397,7 @@ class GroupedRegressor(GroupedPredictor, RegressorMixin):
     Its spec is the same as [`GroupedPredictor`][sklego.meta.grouped_predictor.GroupedPredictor] but it is available
     only for regression models.
 
-    !!! info "New in version 0.7.5"
+    !!! info "New in version 0.8.0"
     """
 
     def fit(self, X, y):
@@ -434,7 +434,7 @@ class GroupedClassifier(GroupedPredictor, ClassifierMixin):
     It's equivalent to [`GroupedPredictor`][sklego.meta.grouped_predictor.GroupedPredictor] with `shrinkage=None`
     but it is available only for classification models.
 
-    !!! info "New in version 0.7.5"
+    !!! info "New in version 0.8.0"
     """
 
     def __init__(
