Subdivision of permutation patterns for multivariate permutation probabilities/entropy #206

Open
@kahaaga

Describe the feature you'd like to have

Probabilities for multivariate permutation entropy are currently computed by calling probabilities(SymbolicPermutation(), x::AbstractDataset{D}). This works by finding the permutation pattern of each point xᵢ ∈ x and taking the probability of each unique symbol to be its relative frequency of occurrence.
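To make the current behaviour concrete, here is a minimal sketch in plain Base Julia. It is not the package's implementation: `pattern_probabilities` is a hypothetical helper, and it assumes the ordinal pattern of a point is the permutation that sorts its elements.

```julia
# Hypothetical helper, not part of the package API.
# Each point is a tuple (or vector) of D values.
function pattern_probabilities(points)
    counts = Dict{Vector{Int}, Int}()
    for p in points
        pat = sortperm(collect(p))            # permutation that sorts this point
        counts[pat] = get(counts, pat, 0) + 1 # count occurrences of each pattern
    end
    n = length(points)
    # Relative frequency of each unique pattern.
    return Dict(k => v / n for (k, v) in counts)
end

points = [(0.1, 0.5, 0.3), (0.2, 0.9, 0.4), (0.7, 0.2, 0.6), (0.3, 0.8, 0.5)]
probs = pattern_probabilities(points)
```

With these four example points, three share the pattern `[1, 3, 2]` and one has `[2, 3, 1]`, so the probabilities are 0.75 and 0.25.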

He et al. (2016) don't simply find permutation patterns of the raw data. They also subdivide the range of the data (after min-max normalising each variable) into three domains, so that each original permutation pattern is split into multiple possible permutation patterns. See attached screenshot:

[Image: Screenshot 2022-12-21 at 14 16 16]

Cite scientific papers related to the feature/algorithm

If possible, sketch out an implementation strategy

He et al. (2016)'s procedure is very similar to what Dispersion does. It

  • transforms the input data to some specified range,
  • subdivides that range into discrete bins,
  • encodes the input data according to which bin it falls into.
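The three steps above can be sketched in Base Julia as follows. This is only an illustration under stated assumptions: min-max normalisation to [0, 1] and three equal-width bins are my choices for the example, and `minmax_normalise` and `bin_index` are hypothetical names, not functions from the package or from He et al. (2016).

```julia
# Step 1 (assumed transform): min-max normalise a variable to [0, 1].
function minmax_normalise(x)
    lo, hi = extrema(x)
    hi == lo && return zeros(Float64, length(x))  # constant input: map everything to 0
    return [(xi - lo) / (hi - lo) for xi in x]
end

# Steps 2 and 3 (assumed equal-width bins): split [0, 1] into `nbins` bins
# and return the index (1..nbins) of the bin that `u` falls into.
function bin_index(u, nbins)
    return min(floor(Int, u * nbins) + 1, nbins)  # u == 1 is placed in the last bin
end

x = [2.0, 4.0, 10.0, 6.0, 3.0]
u = minmax_normalise(x)                   # [0.0, 0.25, 1.0, 0.5, 0.125]
symbols = [bin_index(ui, 3) for ui in u]  # [1, 1, 3, 2, 1]
```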

I'm thinking that there should be a unified way of doing this. For example, if we require that input data are always normalized to [0, 1] (as they already are for Dispersion), then the user could specify "equidistant binning" or "quantile-based binning" or "whatever-based binning" of the normalized range to obtain any type of subdivision for the permutation patterns. The obvious way to do so would be to use Encoding here too.
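As a rough sketch of what a pluggable binning step could look like, the snippet below treats a binning strategy as a function that returns the interior bin edges for data already normalized to [0, 1]. All names here (`equidistant_edges`, `quantile_edges`, `encode`) are hypothetical and exist only for this example. They are not the package's Encoding interface, and the quantile edges use a simple order-statistic rule chosen for brevity.

```julia
# Hypothetical helpers; none of these names come from the package.

# Interior edges for `nbins` equal-width bins on [0, 1].
# `u` is unused; it is accepted so both strategies share one signature.
equidistant_edges(u, nbins) = [k / nbins for k in 1:(nbins - 1)]

# Interior edges taken from order statistics of the data, so that each
# bin receives roughly the same number of points.
function quantile_edges(u, nbins)
    s = sort(u)
    n = length(s)
    return [s[ceil(Int, k * n / nbins)] for k in 1:(nbins - 1)]
end

# Bin index (1..nbins) of each value; a value equal to an edge goes in the lower bin.
encode(u, edges) = [searchsortedfirst(edges, ui) for ui in u]

u = [0.0, 0.1, 0.2, 0.3, 0.9, 1.0]  # already normalized to [0, 1]
eq = encode(u, equidistant_edges(u, 3))
qu = encode(u, quantile_edges(u, 3))
```

For this input, equidistant binning gives `[1, 1, 1, 1, 3, 3]` while quantile-based binning gives `[1, 1, 2, 2, 3, 3]`. The bin indices of a point could then be paired with its ordinal pattern to refine the symbol, though whether that matches the scheme in He et al. (2016) would need to be checked against the paper.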

Labels: discussion-design, enhancement, low priority, symbolization