Description
Describe the bug
SplitConformalRegressor and SplitConformalClassifier cannot be used with scikit-learn Pipeline estimators when sample weights are passed to fit. In addition, CrossConformalRegressor and CrossConformalClassifier do not pass sample weights on to a pipeline estimator during fit.
To Reproduce
Steps to reproduce the behavior:
import numpy as np
from numpy.typing import NDArray
from sklearn.datasets import make_regression  # needed for make_regression below
from sklearn.neural_network import MLPRegressor
from mapie.metrics.regression import regression_coverage_score
from mapie.regression import SplitConformalRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from mapie.utils import train_conformalize_test_split
RANDOM_STATE = 42
X, y = make_regression(
    n_samples=1000, n_features=10, noise=20, random_state=RANDOM_STATE
)
(X_train, X_conformalize, X_test,
 y_train, y_conformalize, y_test) = train_conformalize_test_split(
    X, y,
    train_size=0.8, conformalize_size=0.1, test_size=0.1,
    random_state=RANDOM_STATE
)
sample_weight_train = np.random.rand(len(X_train))
pipeline = Pipeline([
    ('poly_features', PolynomialFeatures(degree=2)),
    ('linear_regression', LinearRegression())
])
confidence_level = 0.95
mapie_regressor = SplitConformalRegressor(
    estimator=pipeline, confidence_level=confidence_level, prefit=False
)
mapie_regressor.fit(X_train, y_train, {"sample_weight": sample_weight_train})
Error:
ValueError: Pipeline.fit does not accept the sample_weight parameter. You can pass parameters to specific steps of your pipeline using the stepname__parameter format, e.g. `Pipeline.fit(X, y, logisticregression__sample_weight=sample_weight)`.
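For reference, the step-prefixed form that the error message points to can be sketched with scikit-learn alone, without MAPIE. This is a minimal sketch that reuses the step names from the reproduction above; the smaller dataset and the seeded random generator are assumptions made only to keep the example short and repeatable.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Small synthetic dataset (sizes chosen for illustration only).
X, y = make_regression(n_samples=200, n_features=5, noise=20, random_state=42)
rng = np.random.default_rng(42)
sample_weight = rng.random(len(X))

pipeline = Pipeline([
    ("poly_features", PolynomialFeatures(degree=2)),
    ("linear_regression", LinearRegression()),
])

# Route the weights to the final step with the <step name>__<parameter> format.
pipeline.fit(X, y, linear_regression__sample_weight=sample_weight)
predictions = pipeline.predict(X)
```

If SplitConformalRegressor.fit forwards its fit_params dictionary to the estimator unchanged (an assumption, not verified here), passing the key "linear_regression__sample_weight" instead of "sample_weight" might serve as a temporary workaround.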
Furthermore, with CrossConformalRegressor and CrossConformalClassifier, sample weights are silently ignored, because EnsembleRegressor and EnsembleClassifier fit the estimator with the following function from utils.py:
def _fit_estimator(
    estimator: Union[RegressorMixin, ClassifierMixin],
    X: ArrayLike,
    y: ArrayLike,
    sample_weight: Optional[NDArray] = None,
    **fit_params,
) -> Union[RegressorMixin, ClassifierMixin]:
    fit_parameters = signature(estimator.fit).parameters
    supports_sw = "sample_weight" in fit_parameters
    if supports_sw and sample_weight is not None:
        estimator.fit(X, y, sample_weight=sample_weight, **fit_params)
    else:
        estimator.fit(X, y, **fit_params)
    return estimator
Because Pipeline.fit does not list sample_weight as a named parameter, supports_sw is always False for a pipeline estimator, so the sample weights are never passed on.
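The signature check can be illustrated without scikit-learn or MAPIE. The two classes below are hypothetical stand-ins, not library code: one names sample_weight explicitly in fit, the other accepts only a variadic keyword argument, which is the shape of signature assumed here for Pipeline.fit.

```python
from inspect import signature


class PlainEstimator:
    """Hypothetical stand-in: fit names sample_weight explicitly."""

    def fit(self, X, y, sample_weight=None):
        return self


class PipelineLike:
    """Hypothetical stand-in: fit only takes variadic keyword arguments."""

    def fit(self, X, y=None, **params):
        return self


def supports_sample_weight(estimator) -> bool:
    # Same membership test as the supports_sw line in _fit_estimator.
    return "sample_weight" in signature(estimator.fit).parameters


plain_supported = supports_sample_weight(PlainEstimator())
pipeline_supported = supports_sample_weight(PipelineLike())
print(plain_supported, pipeline_supported)  # prints: True False
```

The variadic parameter appears in the signature under its own name (params), so the membership test fails even though the method could accept a step-prefixed sample weight.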
Expected behavior
I should be able to pass sample weights to SplitConformalRegressor and SplitConformalClassifier when prefit=False. For CrossConformalRegressor and CrossConformalClassifier, sample_weight should be passed to each estimator that can use it. Test cases covering these classes should also be added.
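One possible direction for a fix is sketched below. The helper name route_sample_weight is hypothetical and not part of MAPIE. It assumes that pipeline-like estimators expose a steps list of (name, estimator) pairs, and that routing the weights to every step whose fit names sample_weight is the desired behavior.

```python
from inspect import signature
from typing import Any, Dict, Optional


def _names_sample_weight(estimator: Any) -> bool:
    """Return True if estimator.fit lists sample_weight as a named parameter."""
    return "sample_weight" in signature(estimator.fit).parameters


def route_sample_weight(
    estimator: Any,
    sample_weight: Optional[Any] = None,
    **fit_params: Any,
) -> Dict[str, Any]:
    """Build fit keyword arguments, routing sample_weight where it is accepted.

    Hypothetical helper, not part of MAPIE or scikit-learn.
    """
    kwargs = dict(fit_params)
    if sample_weight is None:
        return kwargs
    steps = getattr(estimator, "steps", None)
    if steps:
        # Pipeline-like estimator: prefix the parameter with each step name.
        for name, step in steps:
            # Skip placeholders such as None or "passthrough".
            if hasattr(step, "fit") and _names_sample_weight(step):
                kwargs[f"{name}__sample_weight"] = sample_weight
    elif _names_sample_weight(estimator):
        # Plain estimator: pass the weights under their usual name.
        kwargs["sample_weight"] = sample_weight
    return kwargs
```

With such a helper, the body of _fit_estimator could reduce to a single call, estimator.fit(X, y, **route_sample_weight(estimator, sample_weight, **fit_params)). Enabling scikit-learn's metadata routing may be a more general alternative, at the cost of requiring a recent scikit-learn version.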
Desktop (please complete the following information):
- OS: Windows 11
- Browser: Chrome
- MAPIE Version: v1.1.0