[ENH] ShapeletTransform: binary ig calculation problem #1322

@zjeqw

Describe the bug

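The current _calc_binary_ig() evaluates candidate split points between consecutive orderline entries even when those entries have the same feature value but different labels. No threshold can separate entries with identical values, so such split points are not realizable and the returned information gain can be overestimated. This matters most for datasets that contain many such ties.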

Steps/Code to reproduce the bug

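from aeon.transformations.collection.shapelet_based._shapelet_transform import _calc_binary_ig

# Orderline of (value, label) pairs: three entries tie at value 2 but carry different labels
orderline = [(2, -1), (2, -1), (2, 1), (3, 1), (3, 1)]
c1, c2 = 3, 2  # three entries labelled 1, two labelled -1
print(_calc_binary_ig(orderline, c1, c2))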

Expected results

0.42

Actual results

0.97
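Both numbers above can be reproduced by hand. The sketch below is a pure-Python illustration, not aeon's implementation: the helper names `entropy` and `binary_ig` and the `skip_ties` flag are hypothetical and exist only for this example. It assumes the orderline is already sorted by value and that labels are 1 or -1. With `skip_ties=False` every position is treated as a split point, which appears to match the reported behaviour. With `skip_ties=True` only positions where the value actually changes are considered.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (base 2) of a two-class count pair."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result


def binary_ig(orderline, skip_ties):
    """Best information gain over the split points of a sorted orderline.

    Hypothetical helper for illustration only, not part of aeon.
    orderline: list of (value, label) pairs sorted by value, label in {1, -1}.
    skip_ties: if True, ignore split points that fall between equal values.
    """
    n = len(orderline)
    total_pos = sum(1 for _, label in orderline if label > 0)
    total_neg = n - total_pos
    base = entropy(total_pos, total_neg)

    best = 0.0
    left_pos = left_neg = 0
    for i in range(n - 1):
        value, label = orderline[i]
        if label > 0:
            left_pos += 1
        else:
            left_neg += 1
        # A threshold cannot separate two entries that share the same value.
        if skip_ties and value == orderline[i + 1][0]:
            continue
        n_left = i + 1
        n_right = n - n_left
        split_entropy = (n_left / n) * entropy(left_pos, left_neg) + (
            n_right / n
        ) * entropy(total_pos - left_pos, total_neg - left_neg)
        best = max(best, base - split_entropy)
    return best


orderline = [(2, -1), (2, -1), (2, 1), (3, 1), (3, 1)]
ig_all_splits = binary_ig(orderline, skip_ties=False)
ig_tie_aware = binary_ig(orderline, skip_ties=True)
print(round(ig_all_splits, 2), round(ig_tie_aware, 2))  # prints: 0.97 0.42
```

The 0.97 comes from splitting after the second entry, which separates the two -1 labels from the three 1 labels perfectly even though the second and third entries share the value 2. The only realizable threshold lies between the values 2 and 3, which gives 0.42.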

Versions

System:
python: 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)]
executable: c:\xxx\python.exe
machine: Windows-10-10.0.19041-SP0

Python dependencies:
pip: 22.3.1
setuptools: 57.4.0
scikit-learn: 1.4.0
aeon: 0.7.1
statsmodels: None
numpy: 1.24.0
scipy: 1.10.1
pandas: 2.0.3
matplotlib: 3.5.0
joblib: 1.3.2
numba: 0.58.1
pmdarima: None
tsfresh: None

Labels

enhancement, transformations
