Currently, `codify` applies a discretisation to an input vector of observations, encoding data points in a sliding-window manner (e.g. over embedding points constructed from the input data). Each data point is encoded as an integer. It would be nice to have a function that also returns the outcome corresponding to each integer.
This would allow much nicer `Counts`/`Probabilities` upstream:
using Associations
x, y = StateSpaceSet(rand(1000, 2)), StateSpaceSet(rand(1000, 3))
# min/max of the `rand` call is 0 and 1
precise = true # precise bin edges
r = range(0, 1; length = 3)
binning = FixedRectangularBinning(r, dimension(x), precise)
encoding_x = RectangularBinEncoding(binning, x)
encoding_y = CombinationEncoding(RelativeMeanEncoding(0.0, 1, n = 2), OrdinalPatternEncoding(3))
discretization = CodifyPoints(encoding_x, encoding_y)
# now estimate probabilities
probabilities(discretization, x, y)
julia> probabilities(discretization, x, y)
4×12 Probabilities{Float64,2}
6 8 12 11 9 1 10 2 4 7 3 5
1 0.024 0.022 0.03 0.019 0.03 0.029 0.018 0.02 0.023 0.025 0.023 0.022
4 0.023 0.027 0.017 0.019 0.022 0.022 0.016 0.019 0.029 0.014 0.017 0.014
3 0.021 0.019 0.015 0.018 0.021 0.028 0.021 0.018 0.016 0.021 0.027 0.019
2 0.013 0.033 0.016 0.024 0.019 0.02 0.017 0.022 0.018 0.02 0.01 0.02
It would be nice to know what these integers in the marginals correspond to.
This isn't by any means necessary, but it would be a nice addition.
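To make the request concrete, here is a rough sketch of what such a function could look like for a single encoding. `codify_with_outcomes` is a hypothetical name and does not exist in the package. The sketch assumes `encode` and `decode` from ComplexityMeasures.jl are available through `using Associations`, and that `decode` on a `RectangularBinEncoding` returns a representative of the bin (its left edge, as far as I know).

```julia
using Associations  # assumption: re-exports `encode`/`decode` from ComplexityMeasures.jl

# Hypothetical helper (not part of the package): encode every point and also
# return a lookup table from each integer code to its decoded outcome.
function codify_with_outcomes(encoding, x)
    codes = [encode(encoding, pt) for pt in x]
    lookup = Dict(c => decode(encoding, c) for c in unique(codes))
    return codes, lookup
end

x = StateSpaceSet(rand(1000, 2))
binning = FixedRectangularBinning(range(0, 1; length = 3), dimension(x), true)
encoding_x = RectangularBinEncoding(binning, x)

codes, lookup = codify_with_outcomes(encoding_x, x)
# lookup[c] is the decoded outcome for integer code c
```

A lookup like this could then be used to label the marginals of the `Probabilities` table with outcomes instead of bare integers. Whether this generalises cleanly to `CombinationEncoding` and to multi-variable `CodifyPoints` is something I haven't checked.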