Open
Labels
bug: Something isn't working
v5: Issue/PR related to Optuna version 5.
Description
Expected behavior
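OptunaSearchCV run on a sklearn Pipeline with multiple ColumnTransformers fails when OptunaSearchCV runs trials in parallel (n_jobs > 1, or n_jobs = -1 as in the reproduction below) and the transformers argument of the ColumnTransformers references columns by name. Every trial fails with a nan score, and the refit then raises ValueError: No trials are completed yet. The same code works fine when n_jobs = 1, or when the transformers argument references columns by index. The expected behavior is that the search completes regardless of n_jobs and of how the columns are referenced.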
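A possible stopgap, sketched here only from the observation above that index-based column references work with any n_jobs: translate the column names into positions before building the ColumnTransformers. The helper name names_to_positions is hypothetical and is not part of sklearn or Optuna. The feature names are the iris column names used in the reproduction.

```python
from typing import Sequence


def names_to_positions(all_columns: Sequence[str], selected: Sequence[str]) -> list[int]:
    """Map column names to their positional indices within all_columns.

    Hypothetical helper for illustration; not an sklearn or Optuna API.
    """
    lookup = {name: pos for pos, name in enumerate(all_columns)}
    missing = [name for name in selected if name not in lookup]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return [lookup[name] for name in selected]


# Column order of the iris DataFrame used in the reproduction.
feature_names = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

sc1_cols = names_to_positions(feature_names, ["sepal length (cm)", "sepal width (cm)"])
sc2_cols = names_to_positions(feature_names, ["petal length (cm)", "petal width (cm)"])
print(sc1_cols, sc2_cols)
```

With a DataFrame, list(X.columns) would be passed as all_columns. Note that the indices given to the second ColumnTransformer refer to the column order produced by the first one. With remainder='passthrough' the transformed columns are placed first, so the order happens to be unchanged in this example, but that may not hold in general.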
Environment
- OS: Windows-10-10.0.19045-SP0
- Python version: 3.11.7
- Optuna version: 3.6.1
- Optuna Integration version: 3.6.0
- Sklearn version: 1.5.0
- Pandas version: 2.1.4
Error messages, stack traces, or logs
C:\Users\SC13015\AppData\Local\Temp\ipykernel_19552\2556591622.py:1: ExperimentalWarning: OptunaSearchCV is experimental (supported from v0.17.0). The interface can change in the future.
model = OptunaSearchCV(
[I 2024-08-01 18:57:31,427] A new study created in memory with name: no-name-55d6303b-7b50-4255-af29-7f37bb81988a
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py:377: RuntimeWarning: Mean of empty slice
trial.set_user_attr("mean_{}".format(name), np.nanmean(array))
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\numpy\lib\nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
[W 2024-08-01 18:57:31,657] Trial 0 failed with parameters: {'est__alpha': 0.2564150272852753, 'est__l1_ratio': 0.946852427396853} because of the following error: The value nan is not acceptable.
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py:377: RuntimeWarning: Mean of empty slice
trial.set_user_attr("mean_{}".format(name), np.nanmean(array))
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\numpy\lib\nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
[W 2024-08-01 18:57:31,664] Trial 1 failed with parameters: {'est__alpha': 0.19084011171917495, 'est__l1_ratio': 0.05273897241757375} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,693] Trial 1 failed with value nan.
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py:377: RuntimeWarning: Mean of empty slice
trial.set_user_attr("mean_{}".format(name), np.nanmean(array))
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\numpy\lib\nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
[W 2024-08-01 18:57:31,665] Trial 0 failed with value nan.
[W 2024-08-01 18:57:31,705] Trial 2 failed with parameters: {'est__alpha': 0.0025768190494916683, 'est__l1_ratio': 0.9202702982029136} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,740] Trial 2 failed with value nan.
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py:377: RuntimeWarning: Mean of empty slice
trial.set_user_attr("mean_{}".format(name), np.nanmean(array))
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\numpy\lib\nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
[W 2024-08-01 18:57:31,749] Trial 8 failed with parameters: {'est__alpha': 71.11084264916614, 'est__l1_ratio': 0.6008590162012909} because of the following error: The value nan is not acceptable.
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py:377: RuntimeWarning: Mean of empty slice
trial.set_user_attr("mean_{}".format(name), np.nanmean(array))
C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\numpy\lib\nanfunctions.py:1879: RuntimeWarning: Degrees of freedom <= 0 for slice.
var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
[W 2024-08-01 18:57:31,758] Trial 3 failed with parameters: {'est__alpha': 0.002797310249235216, 'est__l1_ratio': 0.9280090705873864} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,759] Trial 4 failed with parameters: {'est__alpha': 25.786889650390023, 'est__l1_ratio': 0.5543902689675048} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,759] Trial 8 failed with value nan.
[W 2024-08-01 18:57:31,761] Trial 5 failed with parameters: {'est__alpha': 370.9566773707335, 'est__l1_ratio': 0.63458459809165} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,763] Trial 6 failed with parameters: {'est__alpha': 0.13413828959725388, 'est__l1_ratio': 0.18413969872571734} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,765] Trial 7 failed with parameters: {'est__alpha': 0.109174090958778, 'est__l1_ratio': 0.002713607083478675} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,766] Trial 3 failed with value nan.
[W 2024-08-01 18:57:31,770] Trial 9 failed with parameters: {'est__alpha': 0.030474994200691385, 'est__l1_ratio': 0.1817844394988457} because of the following error: The value nan is not acceptable.
[W 2024-08-01 18:57:31,782] Trial 9 failed with value nan.
[W 2024-08-01 18:57:31,773] Trial 5 failed with value nan.
[W 2024-08-01 18:57:31,776] Trial 6 failed with value nan.
[W 2024-08-01 18:57:31,779] Trial 7 failed with value nan.
[W 2024-08-01 18:57:31,770] Trial 4 failed with value nan.
No trials are completed yet.
Traceback (most recent call last):
File "C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna_integration\sklearn.py", line 820, in _refit
self.best_estimator_.set_params(**self.study_.best_params)
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna\study\study.py", line 114, in best_params
return self.best_trial.params
^^^^^^^^^^^^^^^
File "C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna\study\study.py", line 157, in best_trial
return copy.deepcopy(self._storage.get_best_trial(self._study_id))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\SC13015\AppData\Local\Anaconda3\Lib\site-packages\optuna\storages\_in_memory.py", line 234, in get_best_trial
raise ValueError("No trials are completed yet.")
ValueError: No trials are completed yet.
Steps to reproduce
import pandas as pd
from sklearn import set_config
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.linear_model import ElasticNet
from optuna_integration import OptunaSearchCV
from optuna.distributions import FloatDistribution
iris = load_iris()
X = pd.DataFrame(iris['data'], columns=iris['feature_names'])
y = pd.Series(iris['target']).rename('iris type')
set_config(transform_output='pandas')
transf_params = dict(
    remainder='passthrough',
    verbose_feature_names_out=False,
    force_int_remainder_cols=False,
)
# Works only when n_jobs=1
sc1_cols = ['sepal length (cm)', 'sepal width (cm)']
sc2_cols = ['petal length (cm)', 'petal width (cm)']
# Works with any n_jobs
# sc1_cols = [0, 1]
# sc2_cols = [2, 3]
pipe = Pipeline([
    ('sc1', ColumnTransformer([('sc1', StandardScaler(), sc1_cols)], **transf_params)),
    ('sc2', ColumnTransformer([('sc2', MinMaxScaler(), sc2_cols)], **transf_params)),
    ('est', ElasticNet()),
])
param_distributions = {
    'est__alpha': FloatDistribution(1e-3, 1e3, log=True),
    'est__l1_ratio': FloatDistribution(0, 1),
}
model = OptunaSearchCV(
    estimator=pipe,
    param_distributions=param_distributions,
    n_trials=10,
    n_jobs=-1,
)
model.fit(X, y)
Additional context (optional)
No response