[ENH] add a difference transformer to series transformations #2729
Changes from 4 commits
8c07ea3
321fdc2
8f78ec9
8b79897
606110c
c097579
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
import numpy as np | ||
|
||
from aeon.transformations.series.base import BaseSeriesTransformer | ||
|
||
__maintainer__ = ["Tina Jin"] | ||
__all__ = ["DifferenceTransformer"] | ||
|
||
|
||
class DifferenceTransformer(BaseSeriesTransformer): | ||
""" | ||
Calculates the n-th order difference of a time series. | ||
|
||
Transforms a time series X into a series Y representing the difference | ||
calculated `order` times. | ||
- Order 1: Y[t] = X[t] - X[t-1] | ||
- Order 2: Y[t] = (X[t] - X[t-1]) - (X[t-1] - X[t-2]) = X[t] - 2*X[t-1] + X[t-2] | ||
- ... and so on. | ||
|
||
The first `order` element(s) of the transformed series along the time axis | ||
will be NaN, so that the output series will have the same shape as the input series. | ||
|
||
Parameters | ||
---------- | ||
order : int, default=1 | ||
The order of differencing. Must be a positive integer. | ||
|
||
axis : int, default=1 | ||
The axis along which the difference is computed. Assumed to be the | ||
time axis. | ||
If `axis == 0`, assumes shape `(n_timepoints, n_channels)`. | ||
If `axis == 1`, assumes shape `(n_channels, n_timepoints)`. | ||
|
||
Notes | ||
----- | ||
This transformer assumes the input series does not contain NaN values where | ||
the difference needs to be computed. | ||
|
||
Examples | ||
-------- | ||
>>> import numpy as np | ||
>>> from aeon.transformations.series._diff import DifferenceTransformer | ||
>>> X1 = np.array([[1, 3, 2, 5, 4, 7, 6, 9, 8, 10]]) | ||
>>> dt = DifferenceTransformer() | ||
>>> Xt1 = dt.fit_transform(X1) | ||
>>> print(Xt1) | ||
[[nan 2. -1. 3. -1. 3. -1. 3. -1. 2.]] | ||
|
||
>>> X2 = np.array([[1, 3, 2, 5, 4, 7, 6, 9, 8, 10]]) | ||
>>> dt2 = DifferenceTransformer(order=2) | ||
>>> Xt2 = dt2.fit_transform(X2) | ||
>>> print(Xt2) | ||
[[nan nan -3. 4. -4. 4. -4. 4. -4. 3.]] | ||
|
||
>>> X3 = np.array([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]]) | ||
>>> dt = DifferenceTransformer() | ||
>>> Xt3 = dt.fit_transform(X3) | ||
>>> print(Xt3) | ||
[[nan 1. 1. 1. 1.] | ||
[nan -1. -1. -1. -1.]] | ||
|
||
>>> X4 = np.array([[1, 5], [2, 4], [3, 3], [4, 2], [5, 1]]) | ||
>>> dt_axis0 = DifferenceTransformer(axis=0) | ||
>>> Xt4 = dt_axis0.fit_transform(X4, axis=0) | ||
>>> print(Xt4) | ||
[[nan nan] | ||
[ 1. -1.] | ||
[ 1. -1.] | ||
[ 1. -1.] | ||
[ 1. -1.]] | ||
""" | ||
|
||
_tags = { | ||
"capability:multivariate": True, | ||
"X_inner_type": "np.ndarray", | ||
"fit_is_empty": True, | ||
} | ||
|
||
def __init__(self, order=1, axis=1): | ||
if not isinstance(order, int) or order < 1: | ||
raise ValueError(f"`order` must be a positive integer, but got {order}") | ||
self.order = order | ||
super().__init__(axis=axis) | ||
Review comment: Remove the
Reply: The "axis" is inherited from BaseSeriesTransformer. Should "axis = 1" be used to indicate that the time series are all in rows, with shape (n_channels, n_timepoints)?
Reply: Yes, that's telling the base class to convert the series to
Reply: Got it. |
||
|
||
def _transform(self, X, y=None): | ||
""" | ||
Perform the n-th order differencing transformation. | ||
|
||
Parameters | ||
---------- | ||
X : np.ndarray | ||
|
||
y : ignored argument for interface compatibility | ||
|
||
Returns | ||
------- | ||
Xt : np.ndarray | ||
Transformed version of X with the same shape, containing the | ||
n-th order difference. | ||
The first `order` elements along the time axis are NaN. | ||
""" | ||
diff_X = np.diff(X, n=self.order, axis=self.axis) | ||
|
||
# Check if diff_X is integer type. | ||
# If so, cast to float to allow inserting np.nan. | ||
if not np.issubdtype(diff_X.dtype, np.floating): | ||
diff_X = diff_X.astype(np.float64) | ||
|
||
# Insert the NaN at the beginning | ||
nan_shape = list(X.shape) | ||
nan_shape[self.axis] = self.order | ||
nans_to_prepend = np.full(nan_shape, np.nan, dtype=np.float64) | ||
|
||
Xt = np.concatenate([nans_to_prepend, diff_X], axis=self.axis) | ||
|
||
return Xt |
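The padding behaviour of `_transform` and the shape-reducing alternative raised in review can be compared in a minimal plain-NumPy sketch. The helper name `nan_padded_diff` is hypothetical and only mirrors the logic in the diff above so that the snippet runs without aeon installed; it is not part of the PR.

```python
import numpy as np


def nan_padded_diff(X, order=1, axis=1):
    """Hypothetical stand-in that mirrors the PR's `_transform` logic."""
    # np.diff shortens the chosen axis by `order` elements.
    diff_X = np.diff(X, n=order, axis=axis).astype(np.float64)
    # Build a block of NaNs with `order` entries along the time axis.
    nan_shape = list(X.shape)
    nan_shape[axis] = order
    pad = np.full(nan_shape, np.nan, dtype=np.float64)
    # Prepend the NaNs so the output keeps the input shape.
    return np.concatenate([pad, diff_X], axis=axis)


X = np.array([[1, 3, 2, 5, 4, 7, 6, 9, 8, 10]])

# Alternative discussed in review: return the shorter series, no NaNs.
short = np.diff(X, n=2, axis=1)

# Behaviour in this PR: same shape as the input, leading NaNs.
padded = nan_padded_diff(X, order=2)

# Second-order identity from the docstring: X[t] - 2*X[t-1] + X[t-2].
manual = X[:, 2:] - 2 * X[:, 1:-1] + X[:, :-2]

print(short.shape, padded.shape)
```

Both variants carry the same difference values; the only design question is whether downstream code prefers a fixed shape with NaNs or a shorter series without them.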
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
"""Tests for Difference transformation.""" | ||
|
||
import numpy as np | ||
|
||
from aeon.transformations.series._diff import DifferenceTransformer | ||
|
||
|
||
def test_diff(): | ||
"""Tests basic first and second order differencing.""" | ||
X = np.array([[1.0, 4.0, 9.0, 16.0, 25.0, 36.0]]) | ||
Review comment: Can you also test multivariate series, perhaps in another test? |
||
|
||
dt1 = DifferenceTransformer(order=1) | ||
Xt1 = dt1.fit_transform(X) | ||
expected1 = np.array([[np.nan, 3.0, 5.0, 7.0, 9.0, 11.0]]) | ||
Review comment: IMO it is better to return a smaller series than to include NaNs. Possibly make this a parameter if you think it is worth it, but by default change the shape. |
||
|
||
assert Xt1.shape == X.shape, "Shape mismatch for order 1" | ||
np.testing.assert_allclose( | ||
Xt1, expected1, equal_nan=True, err_msg="Value mismatch for order 1" | ||
) | ||
|
||
dt2 = DifferenceTransformer(order=2) | ||
Xt2 = dt2.fit_transform(X) | ||
expected2 = np.array([[np.nan, np.nan, 2.0, 2.0, 2.0, 2.0]]) | ||
|
||
assert Xt2.shape == X.shape, "Shape mismatch for order 2" | ||
np.testing.assert_allclose( | ||
Xt2, expected2, equal_nan=True, err_msg="Value mismatch for order 2" | ||
) |
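A multivariate test along the lines requested in review might look like the sketch below. It is an illustration, not the test that was eventually added: `nan_padded_diff` is a hypothetical NumPy stand-in for the transformer so the snippet runs without aeon, and in the PR itself the calls would presumably go through `DifferenceTransformer(...).fit_transform(X)`.

```python
import numpy as np


def nan_padded_diff(X, order=1, axis=1):
    """Hypothetical stand-in for the transformer's NaN-padded differencing."""
    diff_X = np.diff(X, n=order, axis=axis).astype(np.float64)
    nan_shape = list(X.shape)
    nan_shape[axis] = order
    pad = np.full(nan_shape, np.nan, dtype=np.float64)
    return np.concatenate([pad, diff_X], axis=axis)


def test_diff_multivariate():
    """Check first-order differencing on a two-channel series."""
    # Shape (n_channels, n_timepoints) = (2, 5).
    X = np.array(
        [[1.0, 4.0, 9.0, 16.0, 25.0], [5.0, 4.0, 3.0, 2.0, 1.0]]
    )
    expected = np.array(
        [[np.nan, 3.0, 5.0, 7.0, 9.0], [np.nan, -1.0, -1.0, -1.0, -1.0]]
    )

    # Time along axis 1: each channel is differenced independently.
    Xt = nan_padded_diff(X, order=1, axis=1)
    assert Xt.shape == X.shape
    np.testing.assert_allclose(Xt, expected, equal_nan=True)

    # Same data with time along axis 0, shape (n_timepoints, n_channels).
    Xt0 = nan_padded_diff(X.T, order=1, axis=0)
    assert Xt0.shape == X.T.shape
    np.testing.assert_allclose(Xt0, expected.T, equal_nan=True)


test_diff_multivariate()
```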
Review comment: I would do this in `fit`.