Commit baed432: Merge pull request #71 from winedarksea/dev (0.3.2)
2 parents: ec48749 + a10ade6

59 files changed: +1293 additions, -333 deletions (large commit; only part of the diff is shown below)

.github/workflows/codeql-analysis.yml (31 additions, 18 deletions)

```diff
@@ -1,38 +1,51 @@
-name: "Code scanning - action"
+# For most projects, this workflow file will not need changing; you simply need
+# to commit it to your repository.
+#
+# You may wish to alter this file to override the set of languages analyzed,
+# or to provide custom queries or build logic.
+#
+# ******** NOTE ********
+# We have attempted to detect the languages in your repository. Please check
+# the `language` matrix defined below to confirm you have the correct set of
+# supported CodeQL languages.
+#
+name: "CodeQL"
 
 on:
   push:
-    branches: [master, ]
+    branches: [master]
   pull_request:
     # The branches below must be a subset of the branches above
-    branches: [master]
+    branches: [master, dev]
   schedule:
-    - cron: '0 6 * * 1'
+    - cron: '23 1 * * 4'
 
 jobs:
-  CodeQL-Build:
-
+  analyze:
+    name: Analyze
     runs-on: ubuntu-latest
 
+    strategy:
+      fail-fast: false
+      matrix:
+        language: [ 'python' ]
+        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python' ]
+        # Learn more:
+        # https://docs.github.com/en/free-pro-team@latest/github/finding-security-vulnerabilities-and-errors-in-your-code/configuring-code-scanning#changing-the-languages-that-are-analyzed
+
     steps:
     - name: Checkout repository
       uses: actions/checkout@v2
-      with:
-        # We must fetch at least the immediate parents so that if this is
-        # a pull request then we can checkout the head.
-        fetch-depth: 2
-
-    # If this run was triggered by a pull request event, then checkout
-    # the head of the pull request instead of the merge commit.
-    - run: git checkout HEAD^2
-      if: ${{ github.event_name == 'pull_request' }}
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
       uses: github/codeql-action/init@v1
-      # Override language selection by uncommenting this and choosing your languages
-      # with:
-      #   languages: go, javascript, csharp, python, cpp, java
+      with:
+        languages: ${{ matrix.language }}
+        # If you wish to specify custom queries, you can do so here or in a config file.
+        # By default, queries listed here will override any specified in a config file.
+        # Prefix the list here with "+" to use these queries and those in the config file.
+        # queries: ./path/to/local/query, your-org/your-repo/queries@main
 
     # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
     # If this step fails, then you should remove it and run the build manually (see below)
```
README.md (12 additions, 2 deletions)

````diff
@@ -8,6 +8,14 @@ AutoML for forecasting with open-source time series implementations.
 
 For other time series needs, check out the list [here](https://github.com/MaxBenChrist/awesome_time_series_in_python).
 
+## Table of Contents
+* [Features](https://github.com/winedarksea/AutoTS#features)
+* [Installation](https://github.com/winedarksea/AutoTS#installation)
+* [Basic Use](https://github.com/winedarksea/AutoTS#basic-use)
+* [Tips for Speed and Large Data](https://github.com/winedarksea/AutoTS#tips-for-speed-and-large-data)
+* Extended Tutorial [GitHub](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md) or [Docs](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html)
+* [Production Example](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)
+
 ## Features
 * Finds optimal time series forecasting model and data transformations by genetic programming optimization
 * Handles univariate and multivariate/parallel time series
@@ -31,7 +39,7 @@ For other time series needs, check out the list [here](https://github.com/MaxBenChrist/awesome_time_series_in_python).
 ```
 pip install autots
 ```
-This includes dependencies for basic models, but additonal packages are required for some models and methods.
+This includes dependencies for basic models, but [additonal packages](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md#installation-and-dependency-versioning) are required for some models and methods.
 
 ## Basic Use
 
@@ -91,11 +99,13 @@ The lower-level API, in particular the large section of time series transformers
 
 Check out [extended_tutorial.md](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html) for a more detailed guide to features!
 
+Also take a look at the [production_example.py](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)
+
 
 ## Tips for Speed and Large Data:
 * Use appropriate model lists, especially the predefined lists:
   * `superfast` (simple naive models) and `fast` (more complex but still faster models)
-  * `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given mave many CPU cores are available
+  * `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given many CPU cores are available
   * `n_jobs` usually gets pretty close with `='auto'` but adjust as necessary for the environment
   * see a dict of predefined lists (some defined for internal use) with `from autots.models.model_list import model_lists`
 * Use the `subset` parameter when there are many similar series, `subset=100` will often generalize well for tens of thousands of similar series.
````
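The `subset` tip above can be illustrated with a standalone pandas sketch. This is only the idea behind subsetting (search on a random sample of columns, reuse the result everywhere), with hypothetical data, not AutoTS's internal implementation:

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: thousands of similar series as columns,
# one row per date.
rng = np.random.default_rng(0)
wide = pd.DataFrame(
    rng.normal(size=(60, 5000)),
    index=pd.date_range("2021-01-01", periods=60, freq="D"),
    columns=[f"series_{i}" for i in range(5000)],
)

# Model search can run on a random subset of 100 series; the winning
# template then generalizes to the full set of similar series.
subset_cols = rng.choice(wide.columns, size=100, replace=False)
sample = wide[subset_cols]
print(sample.shape)
```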

TODO.md (10 additions, 12 deletions)

```diff
@@ -15,18 +15,16 @@
 * Forecasts are desired for the future immediately following the most recent data.
 
 # Latest
-* Additional models to GluonTS
-* GeneralTransformer transformation_params - now handle None or empty dict
-* cleaning up of the appropriately named 'ModelMonster'
-* improving MotifSimulation
-* better error message for all models
-* enable histgradientboost regressor, left it out before thinking it wouldn't stay experimental this long
-* import_template now has slightly better `method` input style
-* allow `ensemble` parameter to be a list
-* NumericTransformer
-  * add .fit_transform method
-  * generally more options and speed improvement
-* added NumericTransformer to future_regressors, should now coerce if they have different dtypes
+* Table of Contents to Extended Tutorial/Readme.md
+* Production Example
+* add weights="mean"/median/min/max
+* UnivariateRegression
+* fix check_pickle error for ETS
+* fix error in Prophet with latest version
+* VisibleDeprecation warning for hidden_layers random choice in sklearn fixed
+* prefill_na option added to allow quick filling of NaNs if desired (with zeroes for say, sales forecasting)
+* made horizontal generalization more stable
+* fixed bug in VAR where failing on data with negatives
 
 # Known Errors:
 DynamicFactor holidays Exceptions 'numpy.ndarray' object has no attribute 'values'
```
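The `prefill_na` entry in the changelog above can be pictured with a small pandas sketch. This is only an illustration of the idea, with hypothetical data, not the AutoTS code path:

```python
import numpy as np
import pandas as pd

# Hypothetical sales data where NaN means "no sales recorded" rather
# than "value unknown", so filling with zero before modeling is sensible.
sales = pd.DataFrame(
    {"store_a": [10.0, np.nan, 12.0], "store_b": [np.nan, 5.0, 7.0]},
    index=pd.date_range("2021-01-01", periods=3, freq="D"),
)

prefill_value = 0  # analogous in spirit to the new prefill_na option
filled = sales.fillna(prefill_value)
print(filled.isna().sum().sum())  # no NaNs remain
```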

autots/__init__.py (1 addition, 1 deletion)

```diff
@@ -16,7 +16,7 @@
 from autots.tools.transform import GeneralTransformer, RandomTransform
 from autots.tools.shaping import long_to_wide
 
-__version__ = '0.3.1'
+__version__ = '0.3.2'
 
 TransformTS = GeneralTransformer
 
```
autots/datasets/fred.py (37 additions, 24 deletions)

```diff
@@ -14,23 +14,23 @@
 _has_fred = True
 
 
-def get_fred_data(fredkey: str, SeriesNameDict: dict = {'SeriesID': 'SeriesName'}):
-    """
-    Imports Data from Federal Reserve
+def get_fred_data(fredkey: str, SeriesNameDict: dict = None, long=True, **kwargs):
+    """Imports Data from Federal Reserve.
+    For simplest results, make sure requested series are all of the same frequency.
 
     args:
-        fredkey - an API key from FRED
-
-        SeriesNameDict, pairs of FRED Series IDs and Series Names
+        fredkey (str): an API key from FRED
+        SeriesNameDict (dict): pairs of FRED Series IDs and Series Names like: {'SeriesID': 'SeriesName'} or a list of FRED IDs.
             Series id must match Fred IDs, but name can be anything
-            if default is use, several default samples are returned
+            if None, several default series are returned
+        long (bool): if True, return long style data, else return wide style data with dt index
     """
     if not _has_fred:
         raise ImportError("Package fredapi is required")
 
     fred = Fred(api_key=fredkey)
 
-    if SeriesNameDict == {'SeriesID': 'SeriesName'}:
+    if SeriesNameDict is None:
         SeriesNameDict = {
             'T10Y2Y': '10 Year Treasury Constant Maturity Minus 2 Year Treasury Constant Maturity',
             'DGS10': '10 Year Treasury Constant Maturity Rate',
@@ -44,29 +44,42 @@ def get_fred_data(fredkey: str, SeriesNameDict: dict = {'SeriesID': 'SeriesName'
             'USEPUINDXD': 'Economic Policy Uncertainty Index for United States',  # also very irregular
         }
 
-    series_desired = list(SeriesNameDict.keys())
+    if isinstance(SeriesNameDict, dict):
+        series_desired = list(SeriesNameDict.keys())
+    else:
+        series_desired = list(SeriesNameDict)
 
-    fred_timeseries = pd.DataFrame(
-        columns=['date', 'value', 'series_id', 'series_name']
-    )
+    if long:
+        fred_timeseries = pd.DataFrame(
+            columns=['date', 'value', 'series_id', 'series_name']
+        )
+    else:
+        fred_timeseries = pd.DataFrame()
 
     for series in series_desired:
         data = fred.get_series(series)
         try:
            series_name = SeriesNameDict[series]
        except Exception:
            series_name = series
-        data_df = pd.DataFrame(
-            {
-                'date': data.index,
-                'value': data,
-                'series_id': series,
-                'series_name': series_name,
-            }
-        )
-        data_df.reset_index(drop=True, inplace=True)
-        fred_timeseries = pd.concat(
-            [fred_timeseries, data_df], axis=0, ignore_index=True
-        )
+
+        if long:
+            data_df = pd.DataFrame(
+                {
+                    'date': data.index,
+                    'value': data,
+                    'series_id': series,
+                    'series_name': series_name,
+                }
+            )
+            data_df.reset_index(drop=True, inplace=True)
+            fred_timeseries = pd.concat(
+                [fred_timeseries, data_df], axis=0, ignore_index=True
+            )
+        else:
+            data.name = series_name
+            fred_timeseries = fred_timeseries.merge(
+                data, how="outer", left_index=True, right_index=True
+            )
 
     return fred_timeseries
```
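The new `long` flag toggles between two shapes of the returned frame. The difference can be sketched without fredapi (hypothetical values; the series IDs are from the default list above):

```python
import pandas as pd

# Long style: one row per (date, series) observation.
idx = pd.date_range("2021-01-01", periods=3, freq="D")
long_df = pd.DataFrame(
    {
        "date": list(idx) * 2,
        "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
        "series_id": ["DGS10"] * 3 + ["T10Y2Y"] * 3,
    }
)

# Wide style: datetime index, one column per series.
wide_df = long_df.pivot(index="date", columns="series_id", values="value")
print(wide_df.shape)  # (3, 2)
```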

autots/evaluator/auto_model.py (46 additions, 5 deletions)

```diff
@@ -8,7 +8,12 @@
 from autots.evaluator.metrics import PredictionEval
 from autots.tools.transform import RandomTransform, GeneralTransformer, shared_trans
 from autots.models.ensemble import EnsembleForecast, generalize_horizontal
-from autots.models.model_list import no_params, recombination_approved, no_shared
+from autots.models.model_list import (
+    no_params,
+    recombination_approved,
+    no_shared,
+    superfast,
+)
 from itertools import zip_longest
 from autots.models.basics import (
     MotifSimulation,
@@ -146,6 +151,20 @@ def ModelMonster(
             **parameters,
         )
         return model
+    elif model == 'UnivariateRegression':
+        from autots.models.sklearn import UnivariateRegression
+
+        model = UnivariateRegression(
+            frequency=frequency,
+            prediction_interval=prediction_interval,
+            holiday_country=holiday_country,
+            random_seed=random_seed,
+            verbose=verbose,
+            n_jobs=n_jobs,
+            forecast_length=forecast_length,
+            **parameters,
+        )
+        return model
 
     elif model == 'UnobservedComponents':
         model = UnobservedComponents(
@@ -658,6 +677,7 @@ def PredictWitch(
     if isinstance(template, pd.Series):
         template = pd.DataFrame(template).transpose()
     template = template.head(1)
+    full_model_created = False  # make at least one full model, horziontal only
     for index_upper, row_upper in template.iterrows():
         # if an ensemble
         if row_upper['Model'] == 'Ensemble':
@@ -750,18 +770,25 @@ def PredictWitch(
         model_str = row_upper['Model']
         parameter_dict = json.loads(row_upper['ModelParameters'])
         transformation_dict = json.loads(row_upper['TransformationParameters'])
+        # this is needed for horizontal generalization if any models failed, at least one full model on all series
+        if model_str in superfast and not full_model_created:
+            make_full_flag = True
+        else:
+            make_full_flag = False
         if (
             horizontal_subset is not None
             and model_str in no_shared
            and all(
                trs not in shared_trans
                for trs in list(transformation_dict['transformations'].values())
            )
+            and not make_full_flag
        ):
            df_train_low = df_train.reindex(copy=True, columns=horizontal_subset)
            # print(f"Reducing to subset for {model_str} with {df_train_low.columns}")
        else:
            df_train_low = df_train.copy()
+            full_model_created = True
 
        df_forecast = ModelPrediction(
            df_train_low,
@@ -816,6 +843,7 @@ def TemplateWizard(
         'TransformationParameters',
         'Ensemble',
     ],
+    traceback: bool = False,
 ):
     """
     Take Template, returns Results.
@@ -844,6 +872,7 @@ def TemplateWizard(
         max_generations (int): info to pass to print statements
         model_interrupt (bool): if True, keyboard interrupts are caught and only break current model eval.
         template_cols (list): column names of columns used as model template
+        traceback (bool): include tracebook over just error representation
 
     Returns:
         TemplateEvalObject
@@ -1030,11 +1059,23 @@ def TemplateWizard(
                     raise KeyboardInterrupt
                 except Exception as e:
                     if verbose >= 0:
-                        print(
-                            'Template Eval Error: {} in model {}: {}'.format(
-                                (repr(e)), template_result.model_count, model_str
+                        if traceback:
+                            import traceback as tb
+
+                            print(
+                                'Template Eval Error: {} in model {}: {}'.format(
+                                    ''.join(tb.format_exception(None, e, e.__traceback__)),
+                                    template_result.model_count,
+                                    model_str,
+                                )
                             )
-                        )
+                        else:
+                            print(
+                                'Template Eval Error: {} in model {}: {}'.format(
+                                    (repr(e)), template_result.model_count, model_str
+                                )
+                            )
+
                     result = pd.DataFrame(
                         {
                             'ID': create_model_id(
```
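The new `traceback` option swaps `repr(e)` for a fully formatted traceback when reporting template evaluation errors. The underlying pattern can be sketched in isolation (hypothetical `run_model` function standing in for a model evaluation):

```python
import traceback as tb

def run_model():
    # Stand-in for a model evaluation that fails.
    raise ValueError("model failed")

try:
    run_model()
except Exception as e:
    # Full traceback (function names, line numbers) vs. the short repr.
    full = ''.join(tb.format_exception(None, e, e.__traceback__))
    short = repr(e)

print(short)
```

The full string includes the failing frame (`run_model`), which makes debugging deeply nested model failures far easier than the one-line `repr(e)`.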
