Commit 4ddf633

0.1.0 (#5)
* steamy Rotorua Expanding capabilities of Sklearn Rolling model
* pestilence
* stumped on tsfresh and wading through statsmodels
* regressor bugs squished, for now
* GLM, sorta maybe
* GLM touchup
* Abel Tasman
* Sklearn expanded and parameters to VARMAX
* 0.1.0 on stormy seas
1 parent a492c44 commit 4ddf633

12 files changed, +1072 -79 lines changed

README.md

Lines changed: 4 additions & 3 deletions
@@ -1,7 +1,7 @@
 # AutoTS
 
-### Project CATS (Catlin Automated Time Series)
-(or maybe eventually: Clustered Automated Time Series)
+### Project CATS
+
 #### Model Selection for Multiple Time Series
 
 Simple package for comparing and predicting with open-source time series implementations.
@@ -26,7 +26,7 @@ from autots.datasets import load_toy_monthly # also: _daily _yearly or _hourly
 df_long = load_toy_monthly()
 
 from autots import AutoTS
-model = AutoTS(forecast_length = 14, frequency = 'infer',
+model = AutoTS(forecast_length = 3, frequency = 'infer',
                prediction_interval = 0.9, ensemble = True, weighted = False,
                max_generations = 5, num_validations = 2, validation_method = 'even')
 model = model.fit(df_long, date_col = 'datetime', value_col = 'value', id_col = 'series_id')
@@ -68,6 +68,7 @@ AutoTS works in the following way at present:
 #### Requirements
 fbprophet
 fredapi (example datasets)
+tsfresh
 
 Check out `functional_environments.md` for specific versions tested to work.
 
TODO.md

Lines changed: 18 additions & 7 deletions
@@ -1,18 +1,21 @@
 # To-Do
+Single time series
+Better point to probabilistic (uncertainty of naive last-value forecast) - linear reg of abs error of samples
+Better X_maker for Rolling Sklearn
+Sklearn Holiday not working
+Possible error where first template model is invalid, 'smape_weighted' doesn't exist error
 * Recombine best two of each model parameters, if two or more present
 * Inf appearing in MAE and RMSE (possibly all NaN in test)
 * Na Tolerance for test in simple_train_test_split
 * min_allowed_train_percent into higher-level API
-* annual data with different dates of the record 6/30, 1/1, 12/30
 * Relative/Absolute Imports and reduce package reloading
-* User regressor to sklearn model regression_type
+* User regressor to sklearn model regression_type (added, needs testing)
 * Weekly sample data
 * Format of Regressor - allow multiple input to at least sklearn models
 * 'Age' regressor as an option in addition to User/Holiday
-* Handle categorical forecasts where forecast leaves range of known values
-* Detrend transformer doesn't work on some indexes
+* Handle categorical forecasts where forecast leaves range of known values, then add to upper/lower forecasts
 * Speed improvements, Profiling, Parallelization, and Distributed options for general greater speed
-* Generate list of functional frequencies, and improve usability on rarer frequencies
+* Improve usability on rarer frequencies
 * Warning/handling if lots of NaN in most recent (test) part of data
 * Figures: Add option to output figures of train/test + forecast, other performance figures
 * Pre-clustering on many time series
@@ -21,6 +24,7 @@
 * Hierarchical correction (bottom-up to start with)
 * Improved verbosity controls and options. Replace most 'print' with logging.
 * Export as simpler code (as TPOT)
+* set up the lower-level API to be usable as pipelines
 * AIC metric, other accuracy metrics
 * Analyze and return inaccuracy patterns (most inaccurate periods out, days of week, most inaccurate series)
 * Use saved results to resume a search partway through
@@ -31,6 +35,12 @@
 * More thorough use of setting random seed
 * For monthly data account for number of days in month
 * Option to run generations until generations no longer see improvement of at least X % over n generations
+* add constant to GLM
+
+### Faster Convergence
+* Only search useful parameters, highest probability for most likely effective parameters
+* 'Expert' starting template to try most likely combinations first
+* Recombination of parameters (both transformation and model)
 
 #### New Ensembles:
 best 3 (unique algorithms not just variations)
@@ -42,13 +52,14 @@
 Last Value + Drift Naive
 Simple Decomposition forecasting
 GluonTS Models
+Tensorflow Probability Structural Time Series
+Pytorch Simple LSTM/GRU
 Simulations
 XGBoost (doesn't support multioutput directly)
 Sklearn + TSFresh
-Sklearn + polynomial features
 Sktime
 Ta-lib
 tslearn
-pydlm
+pydlm - Bayesian dynamic linear
 Isotonic regression
 TPOT if it adds multioutput functionality
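
The TODO item "Option to run generations until generations no longer see improvement of at least X % over n generations" could be sketched roughly as below. This is not AutoTS code: `should_stop`, `min_improvement_pct`, and `patience` are hypothetical names chosen for illustration, and the sketch assumes a lower score is better (as with SMAPE or MAE).

```python
def should_stop(best_scores, min_improvement_pct=1.0, patience=3):
    """Return True when the best score improved by less than
    `min_improvement_pct` percent over the last `patience` generations.

    best_scores: best (lowest) score after each generation, oldest first.
    All names here are hypothetical, not part of the AutoTS API.
    """
    if len(best_scores) <= patience:
        return False  # not enough history to compare against yet
    reference = best_scores[-(patience + 1)]  # score `patience` generations ago
    current = best_scores[-1]
    if reference == 0:
        return True  # score is already zero; nothing left to improve
    improvement_pct = (reference - current) / abs(reference) * 100.0
    return improvement_pct < min_improvement_pct
```

A search loop would append the best score after each generation and break out once `should_stop(scores)` returns True, with `max_generations` still acting as an upper bound.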

autots/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 
 from autots.evaluator.auto_ts import AutoTS
 
-__version__ = '0.0.4'
+__version__ = '0.1.0'
 
 
 __all__ = ['load_toy_daily','load_toy_monthly', 'load_toy_yearly', 'load_toy_hourly',

autots/evaluator/auto_model.py

Lines changed: 59 additions & 9 deletions
@@ -163,9 +163,10 @@ def ModelPrediction(df_train, forecast_length: int, transformation_dict: dict,
 
     return df_forecast
 
-ModelNames = ['ZeroesNaive', 'LastValueNaive', 'MedValueNaive',
-              'GLM', 'ETS', 'ARIMA', 'FBProphet', 'RandomForestRolling']
-
+ModelNames = ['ZeroesNaive', 'LastValueNaive', 'MedValueNaive', 'GLS',
+              'GLM', 'ETS', 'ARIMA', 'FBProphet', 'RollingRegression',
+              'UnobservedComponents', 'VARMAX', 'VECM', 'DynamicFactor']
+# ModelNames = ['RollingRegression']
 def ModelMonster(model: str, parameters: dict = {}, frequency: str = 'infer',
                  prediction_interval: float = 0.9, holiday_country: str = 'US',
                  startTimeStamps = None,
@@ -188,9 +189,17 @@ def ModelMonster(model: str, parameters: dict = {}, frequency: str = 'infer',
         from autots.models.basics import MedValueNaive
         return MedValueNaive(frequency = frequency, prediction_interval = prediction_interval)
 
+    if model == 'GLS':
+        from autots.models.statsmodels import GLS
+        return GLS(frequency = frequency, prediction_interval = prediction_interval)
+
     if model == 'GLM':
         from autots.models.statsmodels import GLM
-        return GLM(frequency = frequency, prediction_interval = prediction_interval)
+        if parameters == {}:
+            model = GLM(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+        else:
+            model = GLM(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose, family = parameters['family'])
+        return model
 
     if model == 'ETS':
         from autots.models.statsmodels import ETS
@@ -216,15 +225,56 @@ def ModelMonster(model: str, parameters: dict = {}, frequency: str = 'infer',
             model = FBProphet(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, holiday =parameters['holiday'], regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose)
         return model
 
-    if model == 'RandomForestRolling':
-        from autots.models.sklearn import RandomForestRolling
+    if model == 'RollingRegression':
+        from autots.models.sklearn import RollingRegression
+        if parameters == {}:
+            model = RollingRegression(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+        else:
+            model = RollingRegression(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, holiday =parameters['holiday'], regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose,
+                                      regression_model = parameters['regression_model'], mean_rolling_periods =parameters['mean_rolling_periods'], std_rolling_periods =parameters['std_rolling_periods'])
+        return model
+
+    if model == 'UnobservedComponents':
+        from autots.models.statsmodels import UnobservedComponents
+        if parameters == {}:
+            model = UnobservedComponents(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+        else:
+            model = UnobservedComponents(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country,
+                                         regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose,
+                                         level = parameters['level'], trend=parameters['trend'], cycle = parameters['cycle'],
+                                         damped_cycle = parameters['damped_cycle'], irregular = parameters['irregular'],
+                                         stochastic_trend=parameters['stochastic_trend'], stochastic_level=parameters['stochastic_level'],
+                                         stochastic_cycle=parameters['stochastic_cycle'])
+        return model
+
+    if model == 'DynamicFactor':
+        from autots.models.statsmodels import DynamicFactor
+        if parameters == {}:
+            model = DynamicFactor(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+        else:
+            model = DynamicFactor(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country,
+                                  regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose,
+                                  k_factors = parameters['k_factors'], factor_order = parameters['factor_order'])
+        return model
+
+    if model == 'VECM':
+        from autots.models.statsmodels import VECM
         if parameters == {}:
-            model = RandomForestRolling(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+            model = VECM(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
         else:
-            model = RandomForestRolling(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, holiday =parameters['holiday'], regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose,
-                                        n_estimators =parameters['n_estimators'], min_samples_split =parameters['min_samples_split'], max_depth =parameters['max_depth'], mean_rolling_periods =parameters['mean_rolling_periods'], std_rolling_periods =parameters['std_rolling_periods'])
+            model = VECM(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country,
+                         regression_type=parameters['regression_type'], random_seed = random_seed, verbose = verbose,
+                         deterministic = parameters['deterministic'], k_ar_diff = parameters['k_ar_diff'])
         return model
 
+    if model == 'VARMAX':
+        from autots.models.statsmodels import VARMAX
+        if parameters == {}:
+            model = VARMAX(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose)
+        else:
+            model = VARMAX(frequency = frequency, prediction_interval = prediction_interval, holiday_country = holiday_country, random_seed = random_seed, verbose = verbose,
+                           order = parameters['order'], trend = parameters['trend'])
+        return model
 
     else:
         raise AttributeError("Model String not found in ModelMonster")
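
The `ModelMonster` branches in this file all follow one pattern: import the model class, construct it with defaults when `parameters` is empty, and otherwise pass the model-specific entries of `parameters` as keyword arguments. A minimal sketch of the same dispatch written as a lookup table is shown here. It is not the commit's code: `build_model`, `MODEL_REGISTRY`, and the stub classes are hypothetical stand-ins, and the sketch assumes the keys of `parameters` match the constructor keyword names, which the diff suggests but does not guarantee. It also defaults `parameters` to `None` instead of `{}`, which avoids sharing one mutable default dict across calls.

```python
class _StubModel:
    """Hypothetical stand-in for an AutoTS model class (not the real API)."""
    def __init__(self, frequency='infer', prediction_interval=0.9, **kwargs):
        self.frequency = frequency
        self.prediction_interval = prediction_interval
        self.extra = kwargs  # model-specific parameters, e.g. family or order


class GLM(_StubModel):
    pass


class VARMAX(_StubModel):
    pass


# Maps the model name string to its class; one entry per supported model.
MODEL_REGISTRY = {'GLM': GLM, 'VARMAX': VARMAX}


def build_model(model, parameters=None, frequency='infer',
                prediction_interval=0.9):
    """Construct a model by name, forwarding any model-specific parameters."""
    parameters = {} if parameters is None else parameters
    try:
        model_class = MODEL_REGISTRY[model]
    except KeyError:
        # Same error type and message as the final else branch in the diff.
        raise AttributeError("Model String not found in ModelMonster") from None
    return model_class(frequency=frequency,
                       prediction_interval=prediction_interval,
                       **parameters)
```

With this shape, `build_model('GLM', {'family': 'Gaussian'})` and `build_model('GLM')` go through the same code path, so adding a model means adding one registry entry rather than another `if`/`else` pair. The trade-off is that the lazy per-model imports used in the diff would need to move into the registry, for example by storing import paths instead of classes.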

autots/models/ensemble.py

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@
 import pandas as pd
 import json
 from autots.evaluator.auto_model import PredictionObject
-from autots.evaluator.auto_model import create_model_id
+from autots.evaluator.auto_model import create_model_id
 
 
 def Best3Ensemble(ensemble_params, forecasts_list, forecasts, lower_forecasts, upper_forecasts, forecasts_runtime, prediction_interval):
@@ -123,7 +123,7 @@ def EnsembleEvaluate(ensemble_forecasts_list: list, df_test, weights, model_coun
             'Runs': 1
             }, index = [0])
         a = pd.DataFrame(model_error.avg_metrics_weighted.rename(lambda x: x + '_weighted')).transpose()
-        result = pd.concat([result, pd.DataFrame(model_error.avg_metrics).transpose(), a], axis = 1)
+        result = pd.concat([result, pd.DataFrame(model_error.avg_metrics).transpose(), a], axis = 1, sort = False)
 
         ens_eval.model_results = pd.concat([ens_eval.model_results, result], axis = 0, ignore_index = True, sort = False).reset_index(drop = True)
         temp = pd.DataFrame(model_error.per_timestamp_metrics.loc['smape']).transpose()
