Hi, following our conversation yesterday, I have two questions about the get_sklearn_wrapper:
1. API: Why do you dynamically create the init and pass the class rather than simply using composition and passing the object?
forecaster = get_sklearn_wrapper(DummyRegressor, lags=3) # you do this
forecaster = get_sklearn_wrapper(DummyRegressor(), lags=3) # why not this?
- This would avoid the issues you seem to have with local class definitions and pickling
- This would allow you to pass composite models like pipelines or GridSearchCV into the wrapper
- I really like the dynamic init creator, but my intuition tells me it's misapplied here.
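To make the suggestion concrete, here is a minimal sketch of the composition approach I have in mind. It is not hcrystalball's code: `ComposedWrapper` and `MeanRegressor` are hypothetical names, and `MeanRegressor` only stands in for an sklearn estimator such as `DummyRegressor()` so that the snippet has no dependencies. The lag handling is left out; the point is only that the wrapper stores an estimator object instead of generating a class.

```python
import copy


class MeanRegressor:
    """Hypothetical stand-in for an sklearn estimator such as DummyRegressor()."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ComposedWrapper:
    """Hypothetical wrapper that holds an estimator instance (composition)."""

    def __init__(self, estimator, lags=3):
        # Any object with fit/predict works here, including composite
        # estimators such as a pipeline or a grid search object.
        self.estimator = estimator
        self.lags = lags  # stored only; lag feature creation is omitted here

    def fit(self, X, y):
        # Fit a copy so the estimator passed in by the caller stays unfitted,
        # similar in spirit to sklearn.base.clone.
        self.estimator_ = copy.deepcopy(self.estimator).fit(X, y)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)


wrapper = ComposedWrapper(MeanRegressor(), lags=3)
wrapper.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
predictions = wrapper.predict([[3], [4]])
print(predictions)
```

Because the wrapper class is an ordinary, statically defined class, no class has to be created at call time, which is where I would expect the local-class and pickling problems to come from.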
2. Algorithm: Why not use the standard recursive strategy?
from hcrystalball.wrappers import get_sklearn_wrapper
from sklearn.dummy import DummyRegressor
import pandas as pd
import numpy as np
index = pd.date_range("2000", periods=13, freq="Y")
y_train = pd.Series(np.arange(10), index=index[:-3])
X_train = pd.DataFrame(index=y_train.index)
X_test = pd.DataFrame(index=index[-3:])
model = get_sklearn_wrapper(DummyRegressor, lags=3)
model.fit(X_train, y_train)
model.predict(X_test)
# >>> 2010-12-31 7.0
# >>> 2011-12-31 7.0
# >>> 2012-12-31 7.0
# you use the first 3 values as lagged variables, the DummyRegressor simply computes the mean of the rest
# so shouldn't the result be the following?
y_train.iloc[3:].mean() # >>> 6.0
# the problem seems to be in the way you generate the target series
# you increase the gap between lagged variables and target to match the length of
# forecasting horizon
X, y = model._transform_data_to_tsmodel_input_format(X_train, y_train, len(X_test))
pd.concat([X, pd.Series(y, index=X.index, name="y")], axis=1).head()
#    lag_0  lag_1  lag_2  y
# 5    2.0    1.0    0.0  5
# 6    3.0    2.0    1.0  6
# 7    4.0    3.0    2.0  7
# 8    5.0    4.0    3.0  8
# 9    6.0    5.0    4.0  9
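For comparison, this is roughly what I mean by the standard recursive strategy: train on one-step-ahead targets (gap of 1 between the newest lag and the target), then forecast step by step and feed each prediction back in as a lag. It is a dependency-free sketch under my own assumptions, not the wrapper's implementation; `MeanRegressor`, `make_lagged` and `recursive_forecast` are hypothetical names, and `MeanRegressor` mimics what `DummyRegressor` does by default.

```python
class MeanRegressor:
    """Hypothetical stand-in for DummyRegressor: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def make_lagged(y, lags):
    """Build one-step-ahead training rows from the `lags` previous values."""
    X, targets = [], []
    for t in range(lags, len(y)):
        # Most recent value first: [y[t-1], y[t-2], ..., y[t-lags]]
        X.append([y[t - 1 - k] for k in range(lags)])
        targets.append(y[t])
    return X, targets


def recursive_forecast(model, y, lags, horizon):
    """Predict one step at a time, appending each prediction to the history."""
    history = list(y)
    forecasts = []
    for _ in range(horizon):
        features = [history[-1 - k] for k in range(lags)]
        y_hat = model.predict([features])[0]
        forecasts.append(y_hat)
        history.append(y_hat)  # the prediction becomes a lag for the next step
    return forecasts


y_train = [float(v) for v in range(10)]
X, targets = make_lagged(y_train, lags=3)
model = MeanRegressor().fit(X, targets)
forecasts = recursive_forecast(model, y_train, lags=3, horizon=3)
print(forecasts)  # [6.0, 6.0, 6.0]
```

With a gap of 1, only the first 3 observations are lost to the lags, so the mean is taken over the values 3 to 9 and gives the 6.0 I expected above, independent of the forecasting horizon.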
Hope this helps!