18 changes: 1 addition & 17 deletions README.md
@@ -80,7 +80,7 @@ Note that individual operations can only be called directly on individual time-s
Time-series feature extraction is computationally intensive.
To speed up processing, pyhctsa allows you to distribute the workload across multiple CPU cores on your local machine using the `LocalDistributor`:
```Python
-from pyhctsa.distributed import LocalDistributor
+from pyhctsa.distribute import LocalDistributor
from pyhctsa.calculator import FeatureCalculator

# initialize the calculator
@@ -94,21 +94,6 @@ dist = LocalDistributor(n_workers=4)
res = calc.extract(data, distributor=dist)
```

-## ℹ️ Note for Windows users
-Some features require Java (JDK) to be installed. If you encounter a `JVM not found` error:
-
-1. Ensure Java Development Kit (JDK) is installed on your system
-- Download from [Oracle](https://www.oracle.com/java/technologies/downloads/) or use OpenJDK
-- Minimum version required: JDK 11
-
-2. Before importing pyhctsa, set the `JAVA_HOME` environment variable using the location of the JDK installation on your system:
-```Python
-import os
-os.environ['JAVA_HOME'] = "C:\Program Files\Java\jdk-11" # replace with relevant path
-from pyhctsa.calculator import FeatureCalculator
-# rest of your code...
-```
-
# 🔑 Licenses

## Internal licenses
@@ -119,7 +104,6 @@ While the majority of features in _pyhctsa_ rely on standard Python libraries, a

The following external time-series analysis code packages are provided with the software (in the `toolboxes` directory), and are used by our main feature-extraction calculator to compute meaningful structural features from time series:

-- Joseph T. Lizier's [Java Information Dynamics Toolkit (JIDT)](https://github.com/jlizier/jidt) for studying information-theoretic measures of computation in complex systems, version 1.3 (GPL license).
- Time-series analysis code developed by [Michael Small](https://github.com/m-small) (unlicensed).
- Max Little's [time-series analysis code](http://www.maxlittle.net/software/index.php) (GPL License).
- [TISEAN package for nonlinear time-series analysis](http://www.mpipks-dresden.mpg.de/~tisean/Tisean_3.0.1/index.html), version 3.0.1 (GPL license).
16 changes: 2 additions & 14 deletions docs/source/usage/getting_started.rst
@@ -53,7 +53,7 @@ cores on your local machine using the `LocalDistributor`:

.. code-block:: python

-from pyhctsa.distributed import LocalDistributor
+from pyhctsa.distribute import LocalDistributor
from pyhctsa.calculator import FeatureCalculator

# initialize the calculator
@@ -66,16 +66,4 @@ cores on your local machine using the `LocalDistributor`:
# pass the distributor to the .extract() method
res = calc.extract(data, distributor=dist)

-ℹ️ Note for Windows Users
--------------------------
-Some features require Java (JDK) to be installed. If you encounter a JVM not found error:
-1. Ensure Java Development Kit (JDK) is installed on your system
-- Download from Oracle or use OpenJDK (Minimum version required: JDK 11)
-2. Before importing `pyhctsa`, set the `JAVA_HOME` environment variable using the location of the JDK installation on your system:
-
-.. code-block:: python
-
-import os
-os.environ['JAVA_HOME'] = "C:\Program Files\Java\jdk-11" # replace with relevant path
-from pyhctsa.calculator import FeatureCalculator
-# rest of your code...

27 changes: 15 additions & 12 deletions pyhctsa/calculator.py
@@ -6,6 +6,10 @@
from typing import Union, Any, Callable
import logging

+logger = logging.getLogger('pyhctsa')
+logger.setLevel(logging.CRITICAL) # only log critical warnings by default
+logger.addHandler(logging.NullHandler())
+
import numpy as np
import pandas as pd
import yaml
@@ -70,7 +74,7 @@ def wrapper(*args, **kwargs):
if isinstance(result, dict):
missing = [k for k in keys if k not in result] # log all of the missing keys
if missing:
-logging.info(f'Warning: time-series features for func {func} not found {missing}')
+logger.info(f'Warning: time-series features for func {func} not found {missing}')
if keep:
return {k: result[k] for k in keys if k in result}
else:
@@ -87,7 +91,7 @@ def _standardise_inputs(data) -> list[np.ndarray]:
elif data.ndim == 2:
if data.shape[0] > data.shape[1]:
# notify the user to check that the shapes make sense
-logging.warning(f"Check that the shape of the 2D input is such "
+logger.warning(f"Check that the shape of the 2D input is such "
f"that (n_series, n_samples). Got shape: {data.shape}")
return [np.asarray(row, dtype=float) for row in data]
else:
@@ -209,34 +213,36 @@ def _repr_html_(self):
return _build_repr_html(self.feature_funcs, self._skipped_functions, self.config, self.config_path)

def _check_deps(self, module_key, feature_name, config):
-raw_deps = config.get("dependencies")
+raw_deps = config.get("dependencies", None)
if not raw_deps:
return True
deps_to_check = [raw_deps] if isinstance(raw_deps, str) else raw_deps
missing = [dep for dep in deps_to_check if not _check_optional_deps(dep)]
if missing:
full_name = f"{module_key}.{feature_name}"
-logging.info(f"Skipping function '{full_name}' - missing dependencies: {', '.join(missing)}")
+logger.info(f"Skipping function '{full_name}' - missing dependencies: {', '.join(missing)}")
self._skipped_functions.append((full_name, missing))
return False
return True

def _build_feature_funcs(self):
feature_funcs = {}
-skipped_functions = []
+self._skipped_functions = []
for module_key in self.config.keys():

try:
module = importlib.import_module(f"{self._operations_package}.{module_key}")
except ImportError as e:
-logging.warning(f"Failed to import module '{module_key}': {e}")
+logger.warning(f"Failed to import module '{module_key}': {e}")
# Skip all functions in this module since we can't import it
for feature_name in self.config[module_key].keys():
-skipped_functions.append((f"{module_key}.{feature_name}", ["import_error"]))
+self._skipped_functions.append((f"{module_key}.{feature_name}", ["import_error"]))
continue

# Process features from this module
for feature_name, feature_config in self.config[module_key].items():
+if not self._check_deps(module_key, feature_name, feature_config):
+continue
op_func = getattr(module, feature_name)
base_name = feature_config.get("base_name", feature_name)
ordered_args = feature_config.get("ordered_args", [])
@@ -270,11 +276,8 @@ def _build_feature_funcs(self):

feature_funcs[label] = final_func

-# store information about skipped functions for later reference
-self._skipped_functions = skipped_functions
-if skipped_functions:
-logging.info(f"Total functions skipped due to missing dependencies: {len(skipped_functions)}")
-
+if self._skipped_functions:
+logger.info(f"Total functions skipped due to missing dependencies: {len(self._skipped_functions)}")
return feature_funcs

def extract(self, data: Union[ArrayLike, list[ArrayLike]],
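The logger setup added at the top of `calculator.py` sets the `pyhctsa` logger to `CRITICAL` and attaches a `NullHandler`, so the `info` and `warning` messages emitted in this file are suppressed unless the caller opts in. The following is a minimal sketch of how an application could opt in using only the standard `logging` module. The first three statements reproduce the library-side lines from this diff so that the snippet runs without pyhctsa installed; the in-memory stream, handler, and format string are illustrative assumptions, not part of pyhctsa.

```python
import io
import logging

# Library side: mirrors the three lines added to calculator.py in this diff.
lib_logger = logging.getLogger('pyhctsa')
lib_logger.setLevel(logging.CRITICAL)
lib_logger.addHandler(logging.NullHandler())

# Application side (assumed usage): attach a handler to capture messages.
# An in-memory stream is used here purely so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))
lib_logger.addHandler(handler)

# With the default CRITICAL level, INFO records are dropped by the logger.
lib_logger.info("hidden by default")

# Lowering the level on the named logger lets INFO and WARNING through.
lib_logger.setLevel(logging.INFO)
lib_logger.info("now visible")

output = stream.getvalue()
```

Because the level is set on the named logger rather than on a handler, calling `logging.basicConfig(level=logging.INFO)` on the root logger alone would not surface these messages; the level of the `pyhctsa` logger itself has to be lowered.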
20 changes: 8 additions & 12 deletions pyhctsa/configurations/hctsa.yaml
@@ -77,7 +77,6 @@ correlation:
add_noise:
base_name: add_noise
depedencies:
-- jpype1
configs:
- {tau: 1, ami_method: 'quantiles', extra_param: 10, zscore: True}
- {tau: 1, ami_method: 'even', extra_param: 10, zscore: True}
@@ -440,38 +439,35 @@ information:
automutual_info_stats:
base_name: automutual_info_stats
dependencies:
-- jpype1
configs:
- {max_tau: 40, est_method: 'gaussian', zscore: True}
- {max_tau: 20, est_method: 'gaussian', zscore: True}
-- {max_tau: 40, est_method: 'kraskov1', extra_param: '4', zscore: True}
-- {max_tau: 20, est_method: 'kraskov1', extra_param: '4', zscore: True}
+- {max_tau: 40, est_method: 'kraskov1', extra_param: 4, zscore: True}
+- {max_tau: 20, est_method: 'kraskov1', extra_param: 4, zscore: True}
legacy_name: IN_AutoMutualInfoStats
ordered_args: ['max_tau', 'est_method', 'extra_param']

first_min:
base_name: first_min
dependencies:
-- jpype1
configs:
- {min_what: 'ac', zscore: True}
- {min_what: 'mi-gaussian', zscore: True}
-- {min_what: 'mi-kraskov2', extra_param: '4', zscore: True}
-- {min_what: 'mi-hist', extra_param: '5', zscore: True}
-- {min_what: 'mi-hist', extra_param: '10', zscore: True}
+- {min_what: 'mi-kraskov2', extra_param: 4, zscore: True}
+- {min_what: 'mi-hist', extra_param: 5, zscore: True}
+- {min_what: 'mi-hist', extra_param: 10, zscore: True}
legacy_name: CO_FirstMin
ordered_args: ['min_what', 'extra_param']

first_max:
base_name: first_max
depedencies:
-- jpype1
configs:
- {max_what: 'ac', zscore: True}
- {max_what: 'mi-gaussian', zscore: True}
-- {max_what: 'mi-kraskov2', extra_param: '4', zscore: True}
-- {max_what: 'mi-hist', extra_param: '5', zscore: True}
-- {max_what: 'mi-hist', extra_param: '10', zscore: True}
+- {max_what: 'mi-kraskov2', extra_param: 4, zscore: True}
+- {max_what: 'mi-hist', extra_param: 5, zscore: True}
+- {max_what: 'mi-hist', extra_param: 10, zscore: True}
legacy_name: CO_FirstMin
ordered_args: ['max_what', 'extra_param']

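With the `jpype1` entries removed, each `dependencies:` key in this file is left without a value, which standard YAML loads as `None`. `_check_deps` reads the key with `config.get("dependencies")` and returns early on any falsy value, so these features are no longer gated on an optional package. A key spelled differently, such as the `depedencies:` that remains under `add_noise` and `first_max` in this file, is never read by that lookup at all. The sketch below illustrates the behaviour on plain dictionaries; `has_module` is a hypothetical stand-in for pyhctsa's `_check_optional_deps`, whose implementation is not shown in this diff, and `check_deps` is a simplified function rather than the actual method.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Hypothetical stand-in for _check_optional_deps: is the module importable?"""
    return importlib.util.find_spec(name) is not None


def check_deps(config: dict) -> tuple[bool, list[str]]:
    """Simplified version of the _check_deps logic shown in calculator.py.

    Returns (ok, missing), where ok is True when nothing blocks the feature.
    """
    raw_deps = config.get("dependencies")
    if not raw_deps:  # None, empty list, or key absent
        return True, []
    # Accept either a single string or a list of names, as in the diff.
    deps = [raw_deps] if isinstance(raw_deps, str) else raw_deps
    missing = [dep for dep in deps if not has_module(dep)]
    return (not missing), missing


# An empty `dependencies:` key loads as None and is treated as "no dependencies".
empty_key = check_deps({"dependencies": None})

# A misspelled key is ignored, so a missing package would not be detected.
misspelled = check_deps({"depedencies": ["surely_not_installed_pkg_xyz"]})

# A correctly spelled key with an unavailable module blocks the feature.
blocked = check_deps({"dependencies": ["json", "surely_not_installed_pkg_xyz"]})
```

The module names used here (`json`, `surely_not_installed_pkg_xyz`) are placeholders chosen so the sketch runs on a bare Python installation.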
1 change: 0 additions & 1 deletion pyhctsa/configurations/module_configs/correlation.yaml
@@ -11,7 +11,6 @@ correlation:
add_noise:
base_name: add_noise
depedencies:
-- jpype1
configs:
- {tau: 1, ami_method: 'quantiles', extra_param: 10, zscore: True}
- {tau: 1, ami_method: 'even', extra_param: 10, zscore: True}
21 changes: 9 additions & 12 deletions pyhctsa/configurations/module_configs/information.yaml
@@ -2,38 +2,35 @@ information:
automutual_info_stats:
base_name: automutual_info_stats
dependencies:
-- jpype1
configs:
- {max_tau: 40, est_method: 'gaussian', zscore: True}
- {max_tau: 20, est_method: 'gaussian', zscore: True}
-- {max_tau: 40, est_method: 'kraskov1', extra_param: '4', zscore: True}
-- {max_tau: 20, est_method: 'kraskov1', extra_param: '4', zscore: True}
+- {max_tau: 40, est_method: 'kraskov1', extra_param: 4, zscore: True}
+- {max_tau: 20, est_method: 'kraskov1', extra_param: 4, zscore: True}
legacy_name: IN_AutoMutualInfoStats
ordered_args: ['max_tau', 'est_method', 'extra_param']

first_min:
base_name: first_min
dependencies:
-- jpype1
configs:
- {min_what: 'ac', zscore: True}
- {min_what: 'mi-gaussian', zscore: True}
-- {min_what: 'mi-kraskov2', extra_param: '4', zscore: True}
-- {min_what: 'mi-hist', extra_param: '5', zscore: True}
-- {min_what: 'mi-hist', extra_param: '10', zscore: True}
+- {min_what: 'mi-kraskov2', extra_param: 4, zscore: True}
+- {min_what: 'mi-hist', extra_param: 5, zscore: True}
+- {min_what: 'mi-hist', extra_param: 10, zscore: True}
legacy_name: CO_FirstMin
ordered_args: ['min_what', 'extra_param']

first_max:
base_name: first_max
-depedencies:
-- jpype1
+dependencies:
configs:
- {max_what: 'ac', zscore: True}
- {max_what: 'mi-gaussian', zscore: True}
-- {max_what: 'mi-kraskov2', extra_param: '4', zscore: True}
-- {max_what: 'mi-hist', extra_param: '5', zscore: True}
-- {max_what: 'mi-hist', extra_param: '10', zscore: True}
+- {max_what: 'mi-kraskov2', extra_param: 4, zscore: True}
+- {max_what: 'mi-hist', extra_param: 5, zscore: True}
+- {max_what: 'mi-hist', extra_param: 10, zscore: True}
legacy_name: CO_FirstMin
ordered_args: ['max_what', 'extra_param']
