Commit cb3a602

Merge pull request #379 from unit8co/develop
2 parents 39e0081 + 49302a8 commit cb3a602

75 files changed: +5477 -2089 lines

CHANGELOG.md

+29 -1
@@ -4,8 +4,36 @@
 Darts is still in an early development phase and we cannot always guarantee backwards compatibility. Changes that may **break code which uses a previous release of Darts** are marked with a "🔴".

 ## [Unreleased](https://github.com/unit8co/darts/tree/develop)
-[Full Changelog](https://github.com/unit8co/darts/compare/0.8.1...develop)
+[Full Changelog](https://github.com/unit8co/darts/compare/0.9.0...develop)

+## [0.9.0](https://github.com/unit8co/darts/tree/0.9.0) (2021-07-09)
+### For users of the library:
+
+**Added:**
+- Multiple forecasting models can now produce probabilistic forecasts by specifying a `num_samples` parameter when calling `predict()`. Stochastic forecasts are stored by utilizing the new `samples` dimension in the refactored `TimeSeries` class (see 'Changed' section). Models supporting probabilistic predictions so far are `ARIMA`, `ExponentialSmoothing`, `RNNModel` and `TCNModel`.
+- Introduced `LikelihoodModel` class which is used by probabilistic `TorchForecastingModel` classes in order to make predictions in the form of parametrized distributions of different types.
+- Added new abstract class `TorchParametricProbabilisticForecastingModel` to serve as parent class for probabilistic models.
+- Introduced new `FilteringModel` abstract class alongside `MovingAverage`, `KalmanFilter` and `GaussianProcessFilter` as concrete implementations.
+- Future covariates are now utilized by `TorchForecastingModels` when the forecasting horizon exceeds the `output_chunk_length` of the model. Before, `TorchForecastingModel` instances could only predict beyond their `output_chunk_length` if they were not trained on covariates, i.e. if they predicted all the data they need as input. This restriction has now been lifted by letting a model not only consume its own output when producing long predictions, but also utilizing the covariates known in the future, if available.
+- Added a new `RNNModel` class which utilizes an RNN module as both encoder and decoder. This new class natively supports the use of the most recent future covariates when making a forecast. See documentation for more details.
+- Introduced optional `epochs` parameter to the `TorchForecastingModel.fit()` method which, if provided, overrides the `n_epochs` attribute in that particular model instance and training session.
+- Added support for `TimeSeries` with a `pandas.RangeIndex` instead of just allowing `pandas.DatetimeIndex`.
+- `ForecastingModel.gridsearch` now makes use of parallel computation.
+- Introduced a new `force_reset` parameter to `TorchForecastingModel.__init__()` which, if left to False, will prevent the user from overwriting model data with the same name and directory.
+
+
+**Fixed:**
+- Solved bug occurring when training `NBEATSModel` on a GPU.
+- Fixed crash when running `NBEATSModel` with `log_tensorboard=True`.
+- Solved bug occurring when training a `TorchForecastingModel` instance with a `batch_size` bigger than the available number of training samples.
+- Some fixes in the documentation, including adding more details.
+- Other minor bug fixes.
+
+**Changed:**
+- 🔴 The `TimeSeries` class has been refactored to support stochastic time series representation by adding an additional dimension to a time series, namely `samples`. A time series is now based on a 3-dimensional `xarray.DataArray` with shape `(n_timesteps, n_components, n_samples)`. This overhaul also includes a change of the constructor which is incompatible with the old one. However, factory methods have been added to create a `TimeSeries` instance from a variety of data types, including `pd.DataFrame`. Please refer to the documentation of `TimeSeries` for more information.
+- 🔴 The old version of `RNNModel` has been renamed to `BlockRNNModel`.
+- The `historical_forecast()` and `backtest()` methods of `ForecastingModel` have been reorganized a bit by making use of new wrapper methods to fit and predict models.
+- Updated `README.md` to reflect the new additions to the library.

 ## [0.8.1](https://github.com/unit8co/darts/tree/0.8.1) (2021-05-22)
 **Fixed:**

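Note on the future-covariates entry in the changelog: it describes forecasting past `output_chunk_length` by feeding a model its own output together with covariates that are already known for the future. The rollout can be sketched in plain Python as below. This is not Darts code; `rolling_forecast`, `toy_step` and the list-based data layout are hypothetical names and assumptions chosen only for illustration.

```python
from typing import Callable, List, Optional, Sequence

StepFn = Callable[[List[float], Optional[List[float]]], List[float]]


def rolling_forecast(
    history: Sequence[float],
    n: int,
    input_chunk_length: int,
    output_chunk_length: int,
    step_fn: StepFn,
    covariates: Optional[Sequence[float]] = None,
) -> List[float]:
    """Forecast n steps by repeatedly applying a fixed-horizon model.

    covariates (hypothetical layout): aligned with the target, so that
    covariates[i] belongs to the same time step as history[i]; it must
    extend at least n steps past the end of history.
    """
    if output_chunk_length < 1 or input_chunk_length < 1:
        raise ValueError("chunk lengths must be positive")
    if len(history) < input_chunk_length:
        raise ValueError("history is shorter than input_chunk_length")
    if covariates is not None and len(covariates) < len(history) + n:
        raise ValueError("covariates must cover the whole forecast horizon")

    values = list(history)
    target_length = len(history) + n
    while len(values) < target_length:
        # The input window slides forward; past the end of history it is
        # made of the model's own earlier predictions.
        start = len(values) - input_chunk_length
        window = values[start:]
        # The covariate window covers the same time steps as the input window,
        # which is why covariate values known in the future are needed.
        cov_window = None if covariates is None else list(covariates[start:len(values)])
        chunk = step_fn(window, cov_window)
        values.extend(chunk[:output_chunk_length])
    return values[len(history):target_length]


def toy_step(window: List[float], cov_window: Optional[List[float]]) -> List[float]:
    """Stand-in for a trained model: last value plus last covariate, twice."""
    bump = cov_window[-1] if cov_window else 0.0
    return [window[-1] + bump] * 2


history = [1.0, 2.0, 3.0, 4.0]
covariates = [0.0] * 4 + [1.0] * 6  # known 6 steps past the end of history
forecast = rolling_forecast(history, 5, 2, 2, toy_step, covariates)
print(forecast)
```

The loop requests five steps from a model whose horizon is two, so it runs three times and truncates the last chunk.
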
README.md

+47 -23
@@ -16,8 +16,8 @@ It contains a variety of models, from classics such as ARIMA to deep neural netw
 The models can all be used in the same way, using `fit()` and `predict()` functions,
 similar to scikit-learn. The library also makes it easy to backtest models,
 and combine the predictions of several models and external regressors. Darts supports both
-univariate and multivariate time series and models, and the neural networks can be trained
-on multiple time series.
+univariate and multivariate time series and models. The neural networks can be trained
+on multiple time series, and some of the models offer probabilistic forecasts.

 ## Documentation
 * [Examples & Tutorials](https://unit8co.github.io/darts/examples.html)
@@ -44,29 +44,33 @@ Create a `TimeSeries` object from a Pandas DataFrame, and split it in train/vali
 ```python
 import pandas as pd
 from darts import TimeSeries
+
+# Read a pandas DataFrame
 df = pd.read_csv('AirPassengers.csv', delimiter=",")
+
+# Create a TimeSeries, specifying the time and value columns
 series = TimeSeries.from_dataframe(df, 'Month', '#Passengers')
-train, val = series.split_after(pd.Timestamp('19580101'))
-```

-Fit an exponential smoothing model, and make a prediction over the validation series' duration:
+# Set aside the last 36 months as a validation series
+train, val = series[:-36], series[-36:]
+```

+Fit an exponential smoothing model, and make a (probabilistic) prediction over the validation series' duration:
 ```python
 from darts.models import ExponentialSmoothing

 model = ExponentialSmoothing()
 model.fit(train)
-prediction = model.predict(len(val))
+prediction = model.predict(len(val), num_samples=1000)
 ```

-Plot:
+Plot the median, 5th and 95th percentiles:
 ```python
 import matplotlib.pyplot as plt

-series.plot(label='actual')
-prediction.plot(label='forecast', lw=2)
+series.plot()
+prediction.plot(label='forecast', low_quantile=0.05, high_quantile=0.95)
 plt.legend()
-plt.xlabel('Year')
 ```

 <div style="text-align:center;">
@@ -81,17 +85,8 @@ the [examples](https://github.com/unit8co/darts/tree/master/examples) directory.

 Currently, the library contains the following features:

-**Forecasting Models:**
-
-* Exponential smoothing,
-* ARIMA & auto-ARIMA,
-* Facebook Prophet,
-* Theta method,
-* FFT (Fast Fourier Transform),
-* Recurrent neural networks (vanilla RNNs, GRU, and LSTM variants),
-* Temporal convolutional network.
-* Transformer
-* N-BEATS
+**Forecasting Models:** A large collection of forecasting models; from statistical models (such as
+ARIMA) to deep learning models (such as N-BEATS). See table of models below.

 **Data processing:** Tools to easily apply (and revert) common transformations on time series data (scaling, boxcox, …)

@@ -100,11 +95,40 @@ from R2-scores to Mean Absolute Scaled Error.

 **Backtesting:** Utilities for simulating historical forecasts, using moving time windows.

-**Regressive Models:** Possibility to predict a time series from several other time series
-(e.g., external regressors), using arbitrary regressive models
+**Regressive Models:** Possibility to predict a time series from lagged versions of itself
+and of some external covariate series, using arbitrary regression models (e.g. scikit-learn models)

 **Multivariate Support:** Tools to create, manipulate and forecast multivariate time series.

+**Probabilistic Support:** `TimeSeries` objects can (optionally) represent stochastic
+time series; this can for instance be used to get confidence intervals.
+
+**Filtering Models:** Darts offers three filtering models: `KalmanFilter`, `GaussianProcessFilter`,
+and `MovingAverage`, which allow filtering time series, and in some cases obtaining probabilistic
+inferences of the underlying states/values.
+
+## Forecasting Models
+Here's a breakdown of the forecasting models currently implemented in Darts. We are constantly working
+on bringing more models and features.
+
+Model | Univariate | Multivariate | Probabilistic | Multiple-series training | Past-observed covariates support | Future-known covariates support
+--- | --- | --- | --- | --- | --- | --- |
+`ARIMA` | x | | x | | | |
+`VARIMA` | x | x | | | | |
+`AutoARIMA` | x | | | | | |
+`ExponentialSmoothing` | x | | x | | | |
+`Theta` and `FourTheta` | x | | | | | |
+`Prophet` | x | | | | | |
+`FFT` (Fast Fourier Transform) | x | | | | | |
+Regression Models (incl. `RandomForest` and `LinearRegressionModel`) | x | | | | | |
+`RNNModel` (incl. LSTM and GRU); equivalent to DeepAR in its probabilistic version | x | x | x | x | x | x |
+`BlockRNNModel` (incl. LSTM and GRU) | x | x | | x | x | (x) |
+`NBEATSModel` | x | x | | x | x | (x) |
+`TCNModel` | x | x | x | x | x | (x) |
+`TransformerModel` | x | x | | x | x | (x) |
+Naive Baselines | x | | | | | |
+
+
 ## Contribute

 The development is ongoing, and there are many new features that we want to add.

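Note on the README change "Plot the median, 5th and 95th percentiles": a sample-based forecast holds many sampled values per time step, and the plotted band is a per-step summary of those samples. A minimal sketch of that summary in plain Python follows. It does not reproduce how Darts computes or plots quantiles; `quantile`, `summarize` and the synthetic Gaussian samples are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def quantile(samples: Sequence[float], q: float) -> float:
    """Empirical quantile, interpolating linearly between order statistics."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    ordered = sorted(samples)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def summarize(
    forecast_samples: Sequence[Sequence[float]],
    low_quantile: float = 0.05,
    high_quantile: float = 0.95,
) -> List[Tuple[float, float, float]]:
    """Return (low, median, high) per time step.

    forecast_samples[t] holds every sampled value for time step t
    (a univariate series, i.e. a single component).
    """
    return [
        (quantile(s, low_quantile), quantile(s, 0.5), quantile(s, high_quantile))
        for s in forecast_samples
    ]


# Synthetic stand-in for a 36-step forecast drawn with 1000 samples per step.
rng = random.Random(0)
samples = [[rng.gauss(100.0 + t, 5.0) for _ in range(1000)] for t in range(36)]
bands = summarize(samples)
print(bands[0])
```
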
darts/dataprocessing/transformers/boxcox.py

+8 -6
@@ -15,6 +15,8 @@

 logger = get_logger(__name__)

+# TODO: extend to stochastic series
+

 class BoxCox(FittableDataTransformer, InvertibleDataTransformer):

@@ -28,7 +30,7 @@ def __init__(self,
                  verbose: bool = False):
         """
         Box-Cox data transformer.
-        See https://otexts.com/fpp2/transformations.html#mathematical-transformations for more information
+        See https://otexts.com/fpp2/transformations.html#mathematical-transformations for more information.

         Parameters
         ----------
@@ -94,7 +96,7 @@ def ts_fit(series: TimeSeries,
         if lmbda is None:
             # Compute optimal lmbda for each dimension of the time series. In this case, the return type is
             # a pd.core.series.Series, which does not inherit from collections.abc.Sequence
-            lmbda = series._df.apply(boxcox_normmax, method=method)
+            lmbda = series.pd_dataframe(copy=False).apply(boxcox_normmax, method=method)
         elif isinstance(lmbda, Sequence):
             raise_if(len(lmbda) != series.width,
                      "lmbda should have one value per dimension (ie. column or variable) of the time series",
@@ -109,19 +111,19 @@ def ts_fit(series: TimeSeries,
     def ts_transform(series: TimeSeries, lmbda: Union[Sequence[float], pd.core.series.Series]) -> TimeSeries:

         def _boxcox_wrapper(col):
-            idx = series._df.columns.get_loc(col.name)  # get index from col name
+            idx = series.pd_dataframe(copy=False).columns.get_loc(col.name)  # get index from col name
             return boxcox(col, lmbda[idx])

-        return TimeSeries.from_dataframe(series._df.apply(_boxcox_wrapper))
+        return TimeSeries.from_dataframe(series.pd_dataframe(copy=False).apply(_boxcox_wrapper))

     @staticmethod
     def ts_inverse_transform(series: TimeSeries, lmbda: Union[Sequence[float], pd.core.series.Series]) -> TimeSeries:

         def _inv_boxcox_wrapper(col):
-            idx = series._df.columns.get_loc(col.name)  # get index from col name
+            idx = series.pd_dataframe(copy=False).columns.get_loc(col.name)  # get index from col name
             return inv_boxcox(col, lmbda[idx])

-        return TimeSeries.from_dataframe(series._df.apply(_inv_boxcox_wrapper))
+        return TimeSeries.from_dataframe(series.pd_dataframe(copy=False).apply(_inv_boxcox_wrapper))

     def fit(self, series: Union[TimeSeries, Sequence[TimeSeries]]) -> 'FittableDataTransformer':
         # adding lmbda and optim_method params

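Note on `boxcox.py`: the transformer applies one `lmbda` per column and inverts it with the matching inverse. The underlying formulas can be sketched without SciPy or Darts as below. The helper names and the column-of-lists layout are hypothetical; only the Box-Cox formula itself, (x^lmbda - 1) / lmbda with log(x) at lmbda = 0, is standard.

```python
import math
from typing import Callable, List, Sequence


def boxcox_value(x: float, lmbda: float) -> float:
    """Box-Cox transform of one strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lmbda == 0:
        return math.log(x)
    return (x ** lmbda - 1.0) / lmbda


def inv_boxcox_value(y: float, lmbda: float) -> float:
    """Inverse of boxcox_value for the same lmbda."""
    if lmbda == 0:
        return math.exp(y)
    return (lmbda * y + 1.0) ** (1.0 / lmbda)


def apply_per_column(
    columns: Sequence[Sequence[float]],
    lmbdas: Sequence[float],
    fn: Callable[[float, float], float],
) -> List[List[float]]:
    """Apply fn to every column with that column's own lmbda."""
    if len(lmbdas) != len(columns):
        raise ValueError("lmbda should have one value per column of the series")
    return [[fn(x, lam) for x in col] for col, lam in zip(columns, lmbdas)]


columns = [[1.0, 2.0, 4.0], [10.0, 20.0, 40.0]]  # two components
lmbdas = [0.0, 0.5]                               # one lmbda per component
transformed = apply_per_column(columns, lmbdas, boxcox_value)
restored = apply_per_column(transformed, lmbdas, inv_boxcox_value)
print(transformed)
print(restored)
```
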
darts/dataprocessing/transformers/scaler.py

+7 -7
@@ -59,17 +59,17 @@ def __init__(self,

     @staticmethod
     def ts_transform(series: TimeSeries, transformer) -> TimeSeries:
-        return TimeSeries.from_times_and_values(series.time_index(),
-                                                transformer.transform(series.values().
-                                                                      reshape((-1, series.width))),
-                                                series.freq())
+        return TimeSeries.from_times_and_values(times=series.time_index,
+                                                values=transformer.transform(series.values().
+                                                                             reshape((-1, series.width))),
+                                                fill_missing_dates=False)

     @staticmethod
     def ts_inverse_transform(series: TimeSeries, transformer, *args, **kwargs) -> TimeSeries:
-        return TimeSeries.from_times_and_values(series.time_index(),
-                                                transformer.inverse_transform(series.values().
+        return TimeSeries.from_times_and_values(times=series.time_index,
+                                                values=transformer.inverse_transform(series.values().
                                                                               reshape((-1, series.width))),
-                                                series.freq())
+                                                fill_missing_dates=False)

     @staticmethod
     def ts_fit(series: TimeSeries, transformer, *args, **kwargs) -> Any:

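Note on `scaler.py`: both methods transform a 2-D block of values (time steps by components) and then re-attach the original time index unchanged. A minimal sketch of that pattern in plain Python follows. `MinMaxColumns` and `scale_series` are hypothetical stand-ins for a scikit-learn style transformer and for the `TimeSeries` factory call; they are not part of Darts.

```python
from typing import List, Sequence, Tuple


class MinMaxColumns:
    """Tiny column-wise min-max scaler with a scikit-learn-like interface."""

    def fit(self, rows: Sequence[Sequence[float]]) -> "MinMaxColumns":
        cols = list(zip(*rows))
        self.mins = [min(c) for c in cols]
        # A constant column has zero range; fall back to 1.0 to avoid dividing by zero.
        self.spans = [(max(c) - min(c)) or 1.0 for c in cols]
        return self

    def transform(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        return [[(v - m) / s for v, m, s in zip(row, self.mins, self.spans)] for row in rows]

    def inverse_transform(self, rows: Sequence[Sequence[float]]) -> List[List[float]]:
        return [[v * s + m for v, m, s in zip(row, self.mins, self.spans)] for row in rows]


def scale_series(
    times: Sequence[str], rows: Sequence[Sequence[float]], transformer: MinMaxColumns
) -> Tuple[List[str], List[List[float]]]:
    """Transform the values and carry the time index over as-is."""
    return list(times), transformer.transform(rows)


times = ["2021-01", "2021-02", "2021-03"]
rows = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]  # 3 time steps, 2 components
scaler = MinMaxColumns().fit(rows)
scaled_times, scaled = scale_series(times, rows, scaler)
restored = scaler.inverse_transform(scaled)
print(scaled)
```
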
darts/datasets/dataset_loaders.py

+2 -2
@@ -145,5 +145,5 @@ def _load_from_disk(self, path_to_file: Path, metadata: DatasetLoaderMetadata) -
         df = pd.read_csv(path_to_file)
         if metadata.header_time is not None:
             df = self._format_time_column(df)
-            return TimeSeries.from_dataframe(df, metadata.header_time)
-        return TimeSeries(df, dummy_index=True)
+            return TimeSeries.from_dataframe(df=df, time_col=metadata.header_time)
+        return TimeSeries.from_dataframe(df)

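Note on `dataset_loaders.py`: the loader builds a time-indexed series when a time column is declared and otherwise falls back to a plain positional index. The branching can be sketched with the standard library alone, as below. `load_series`, its `time_format` parameter and the inline CSV text are hypothetical; the real loader relies on pandas and `TimeSeries.from_dataframe`.

```python
import csv
import io
from datetime import datetime
from typing import List, Optional, Tuple, Union

Index = List[Union[datetime, int]]


def load_series(
    csv_text: str,
    header_time: Optional[str] = None,
    time_format: str = "%Y-%m",
) -> Tuple[Index, List[List[float]]]:
    """Parse CSV text into (index, values).

    With header_time set, that column is parsed into datetimes and used as
    the index; otherwise the index is 0, 1, 2, ... (an integer range).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    rows = list(reader)
    value_cols = [name for name in fieldnames if name != header_time]

    index: Index
    if header_time is not None:
        index = [datetime.strptime(row[header_time], time_format) for row in rows]
    else:
        index = list(range(len(rows)))

    values = [[float(row[name]) for name in value_cols] for row in rows]
    return index, values


dated_index, dated_values = load_series(
    "Month,#Passengers\n1949-01,112\n1949-02,118\n", header_time="Month"
)
plain_index, plain_values = load_series("a,b\n1,2\n3,4\n")
print(dated_index, dated_values)
print(plain_index, plain_values)
```
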