Skip to content

Conversation

@rusandris
Copy link
Contributor

@rusandris rusandris commented Jan 20, 2025

First draft PR. I tried to follow the design ideas mentioned here #424 .
TransferOperator becomes a ProbabilitiesEstimator.

TransferOperator <: ProbabilitiesEstimator #true

First results:

x = [1.0,2.0,1.0,2.0,1.0] #simple dummy time series

#1D counting-based outcome spaces
op = OrdinalPatterns{2}()
vb = ValueBinning(RectangularBinning(5))
d = Dispersion(; c = 3, m = 2, τ = 1)
ue = UniqueElements()

#try with different outcome spaces
transferoperator(op,x)
transferoperator(vb,x)
transferoperator(d,x)
transferoperator(ue,x)

which returns a TransferOperatorApproximation <: ProbabilitiesEstimator .

OP example:

codify(op,x)
    4-element Vector{Int64}:
     1
     2
     1
     2

to = transferoperator(op,x)

to.outcome_space #gives back op
    OrdinalPatterns{2}, with 2 fields:
    encoding = OrdinalPatternEncoding(perm = [2, 1], lt = isless_rand)
    τ = 1
to.outcomes #unique outcomes
  2-element Vector{Int64}:
   1
   2

Most important field is the transfermatrix

to.transfermatrix
  2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
       1.0
   1.0    

probabilities function does work formally in that it gives an output:

probabilities(TransferOperator(),op,x)
   Probabilities{Float64,1} over 2 outcomes
   Outcome(1)  0.6056021883538506
   Outcome(2)  0.3943978116461494

but for some reason, for this particular example, the iterative method used in invariant_measure does not converge.

probabilities(TransferOperator(),op,x;N=4000,tolerance=1e-12) #keyword args are propagated to invariant_measure
   Probabilities{Float64,1} over 2 outcomes
   Outcome(1)  0.9508731468636822
   Outcome(2)  0.04912685313631784

the correct probabs of course would be [0.5,0.5] as the eigenvector corresponding to eigenvalue 1 is [1,1]. This might be a separate issue though.
Let me know what you think.

@Datseris
Copy link
Member

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

@kahaaga
Copy link
Member

kahaaga commented Jan 21, 2025

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

At least I am not aware of any such paper. Would be a cool contribution for sure!

@kahaaga
Copy link
Member

kahaaga commented Jan 21, 2025

@rusandris Thanks for the PR! I am super busy at the moment, so I'm not able to review this in detail yet. I'll try to get to it within the next couple of weeks.

the correct probabs of course would be [0.5,0.5] as the eigenvector corresponding to eigenvalue 1 is [1,1]. This might be a separate issue though.
Let me know what you think.

I'll have to do a thorough review dig deeper to understand this issue. I'll ping you when I've tested this out a bit, @rusandris.

@Datseris
Copy link
Member

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

At least I am not aware of any such paper. Would be a cool contribution for sure!

Just to be clear: i have no capacity to be involved in such a paper, but it is worth discussing with your supervisor Andras.

@rusandris
Copy link
Contributor Author

I'll have to do a thorough review dig deeper to understand this issue. I'll ping you when I've tested this out a bit, @rusandris.

I looked inside invariant_measure to see what happens exactly.
Adding some @show statements

julia> probabilities(TransferOperator(),op,x;N=5,tolerance=1e-12)

  distribution = [0.28892942757860407 0.7110705724213959]
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047
  distribution = [0.28892942757860407 0.7110705724213959]
  distance = 0.7778172853930047
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047
  distribution = [0.28892942757860407 0.7110705724213959]
  distance = 0.7778172853930047
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047

   Probabilities{Float64,1} over 2 outcomes
     Outcome(1)  0.7110705724213959
     Outcome(2)  0.28892942757860407

reveals of course that it works as it should. Since the transition matrix looks like this

to.transfermatrix
  2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
       1.0
   1.0    

the iterative method keeps flipping the initial random values inside the distribution. This wouldn't converge, unless the initial distribution is already given as

probabilities(RelativeAmount(),op,x)
 Probabilities{Float64,1} over 2 outcomes
 Outcome(1)  0.5
 Outcome(2)  0.5

@Datseris
Copy link
Member

Datseris commented Jan 9, 2026

I've seeen some new commits here, but it's been more than a year since i've looked at this and have forgotten many things, let me know if there is something in particular you'd like help with, or to discuss, and I'll bring myself back up to speed!

@rusandris
Copy link
Contributor Author

rusandris commented Jan 10, 2026

Thanks, great to hear! I'm still working on it, but in short TransferOperator works like this:

logistic_rule(x, p, n) = SVector{1}(p[1] * x[1] * (1.0 - x[1]))
p = [r]
ds = DeterministicIteratedMap(logistic_rule, [0.4], p)
x, t = trajectory(ds, 10^3; Ttr=10^4)
b = ValueBinning(FixedRectangularBinning(range(0,1;length=11)))
probabilities(TransferOperator(),b,x)
 Probabilities{Float64,1} over 10 outcomes
  1  0.21500000000000005
  2  0.08700000000000001
  3  0.06900000000000002
  4  0.06799999999999999
  5  0.06599999999999995
  6  0.06999999999999998
  7  0.053999999999999986
  8  0.081
  9  0.08999999999999998
 10  0.19999999999999993

through invariant_measure. I used binning here but the outcome space can be ordinal patterns etc.
Also, there are two methods of getting an invariant measure \rho using the transfermatrix: iterative (as it was before, this is the default) and the eigenvector (newly added). (Right now the eigen method uses eigsolve from KrylovKit, since it's said to work well for extremal eigenvalues, and we're after the vector corresponding to the largest eigenvalue of the transfermatrix which is 1. But this can change if we want to avoid adding KrylovKit as a dependency).

The eigenvector method can be better for example when the time series is periodic:

 ds = DeterministicIteratedMap(logistic_rule, [0.4], [3.84])
 x, t = trajectory(ds, 10^3; Ttr=10^4)

 probabilities(TransferOperator(),b,x;method=:iterate)
   Probabilities{Float64,1} over 3 outcomes
    2  0.09340928604503983
    5  0.23372777459244914
   10  0.6728629393625111

probabilities(TransferOperator(),b,x;method=:eigen)
   Probabilities{Float64,1} over 3 outcomes
    2  0.33333333333333337
    5  0.3333333333333334
   10  0.33333333333333326

Let me know how this looks for you.

@kahaaga
Copy link
Member

kahaaga commented Jan 10, 2026

@rusandris, great stuff! 🤘Super nice that the eigenvector approach works well for periodic time series. This is a major improvement which may have advantages in e.g. reducing false negatives in settings where e.g. extensive pseudoperiodic surrogate testing is done for some statistic. For that reason I vote for keeping the dependency.

One comment though: following the convention from the test of the packaged: any keywords that controls the probabilities estimation should be part of the discretization type ('TransferOperator'), not as a keyword to the 'probabilities' function.

@rusandris
Copy link
Contributor Author

One comment though: following the convention from the test of the packaged: any keywords that controls the probabilities estimation should be part of the discretization type ('TransferOperator'), not as a keyword to the 'probabilities' function.

My thinking was the following (and I'm sure there are some workarounds/better ways to do this): the method keyword argument in probabilities is to simplify things on the user side. The value of this keyword argument is then used to decide if TransferOperatorApproximation should be wrapped inside TransferOperatorApproximationIterative or TransferOperatorApproximationEigen in order to dispatch invariantmeasure on one of them eventually.

probabilities(TransferOperatorApproximationIterative(),outcome_space,x)
probabilities(TransferOperatorApproximationEigen(),outcome_space,x)

seemed too verbose and difficult. These types are supposed to be internal and are only used once probabilities is called (which might or might not happen, maybe the user only wanted to extract the transition probabilities from the transfermatrix).
Now, I guess method can be moved inside TO and used to dispatch like

 probabilities(TransferOperator(params...;method=:eigen),b,x)

which is what you mean, right?

@kahaaga
Copy link
Member

kahaaga commented Jan 15, 2026

@rusandris Thanks again for your great work! I'll do a thorough review of this in the coming days. In order to not give conflicting feedback here, I want to familiarize myself properly with the changes before I respond in detail. But in short:

Breaking changes

I see that there is a breaking change (converting TransferOperator from an OutcomeSpace to a ProbabilitiesEstimator.
We do not want to introduce any breaking changes in the package for what is happening here: introducing a new ProbabilitiesEstimator. We can easily introduce TransferOperatorApproximation <: ProbabilitiesEstimator, just as long as we don't re-define the original TransferOperator (which then should be deprecated instead).

Internal dispatch

probabilities(TransferOperator(params...;method=:eigen),b,x)
which is what you mean, right?

Something like that! For consistency with the rest of the code base, I think the following would work:

probabilities(probest::TransferOperatorApproximation(::ApproximationMethod), o::OutcomeSpace, x)

where ApproximationMethod is either TransferOperatorApproximationIterative or TransferOperatorApproximationEigen. Internally, depending on the method, this dispatches to

probabilities(::TransferOperatorApproximationIterative,outcome_space,x)
probabilities(::TransferOperatorApproximationEigen,outcome_space,x)

I just need to see how it relates to the manual extraction of the transfer matrix in your suggested code.

Why we don't want extra keywords in the function

We want the base functions like probabilities in the package to be "pure", in the sense that they predictably operate on combinations of arguments of different types, without any configuration at the function level.

If we were to break with this convention for the transfer operator, then there would be hundreds of possible ways to discretize and estimate probabilities from data using one unified syntax, EXCEPT for the transfer operator approximation, which has an extra keyword for probabilities. We want to avoid special cases and keep the unified approach: any configuration arguments always go in the relevant types (e.g. struct TransferOperatorApproximation <: ProbabilitiesEstimator; param1; param2 ... end).

We set reasonable, well-documented default behaviours, but leave the user the option to be as granular as they want.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants