[research] Tier-3 candidate: multi-timescale transfer-operator framework for behavioral-state discovery in root circumnutation (Kaur, Jain & Berman, 2026) #221

@eberrigan

Parent epic: #197
Type: research / future tier exploration (no current PR blocker)
Priority: low — exploratory; revisit after Tier 1 + Tier 2 land
Originating discussion: ad-hoc paper review on 2026-06-03

Paper

Kaur R., Jain K., Berman G. J. (2026). Using timescale as a state coordinate reveals the metastable geometry of behavior. bioRxiv preprint, doi:10.64898/2026.05.25.727718 (posted 2026-05-28, Departments of Physics and Biology, Emory University). bioRxiv link

What the paper proposes

A pipeline for recovering slow latent behavioral structure from multivariate postural time series when fast and slow processes are tightly intertwined (i.e., the regime where fixed-timescale delay embeddings fail):

  1. Morlet wavelet decomposition of each measurement channel into N_f dyadically spaced frequency bands — same primitive as sleap_roots.circumnutation.temporal_cwt.compute_scaleogram.
  2. Delay-embed short windows of the time-frequency representation (so each window encodes both instantaneous posture and how that posture is being expressed across fast/slow temporal scales).
  3. k-means cluster the embedded windows into N micro-states.
  4. Estimate a transfer operator (Markov transition matrix) on the cluster sequence at a chosen lag τ.
  5. Spectral decomposition of the operator — leading non-trivial eigenvectors are slow collective modes.
  6. Generalized PCCA (G-PCCA) to extract metastable basins in the non-reversible case (since behavioral transitions typically violate detailed balance).
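Step 4 can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `estimate_transition_matrix` is a hypothetical helper name, and it assumes the cluster sequence is a single contiguous track whose labels are integers in `0..n_states-1`.

```python
from collections import Counter


def estimate_transition_matrix(labels, n_states, lag=1):
    """Row-stochastic transition matrix from a micro-state label sequence.

    Entry [i][j] estimates P(state j at t + lag | state i at t).
    Rows with no observed departures are left as all zeros.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    # Count every (state at t, state at t + lag) pair in the sequence.
    counts = Counter(zip(labels[:-lag], labels[lag:]))
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for (i, j), c in counts.items():
        matrix[i][j] = float(c)
    # Normalize each row so it sums to 1 (where any transitions were seen).
    for row in matrix:
        total = sum(row)
        if total > 0:
            for j in range(n_states):
                row[j] /= total
    return matrix


# Toy label sequence with three micro-states.
labels = [0, 0, 1, 1, 0, 2, 2, 2, 0, 1]
P = estimate_transition_matrix(labels, n_states=3, lag=1)
```

If several tracks are pooled, the counts would need to be accumulated per track before normalizing, so that no transition is counted across a track boundary.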

Key prediction. For a reversible Markov chain with M metastable basins separated from the bulk spectrum by a gap, the spectral theory of metastability predicts an arms-and-hub geometry: cluster centroids in the leading (M-1)-dimensional eigenvector space arrange themselves into M linear arms radiating from a stationary-weighted central hub, one arm per basin. The paper makes this falsifiable via four explicit diagnostic criteria:

  • (i) clear ratio gap in the leading eigenvalue spectrum after some M ≥ 2;
  • (ii) leading non-trivial eigenvectors with high participation ratio PR_k = (Σ_i φ_k(i)²)² / Σ_i φ_k(i)⁴ ≫ 1 across clusters;
  • (iii) simplex-like arms-and-hub geometry in the leading (M-1)-dimensional eigenspace;
  • (iv) held-out cross-validation: the fitted Markov model beats a memoryless null at the same basin count, plateauing at M.
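Criteria (i) and (ii) reduce to short computations once the spectrum is available. The sketch below takes the participation ratio directly from the formula in (ii). The ratio-gap convention, |λ_M| / |λ_{M+1}| maximized over M ≥ 2, is an assumption on my part and should be checked against the paper's definition. Both function names are hypothetical.

```python
def participation_ratio(vec):
    """PR = (sum v_i^2)^2 / sum v_i^4.

    Equals 1 when a single entry carries all the weight and len(vec)
    when the weight is spread uniformly over all entries.
    """
    s2 = sum(v * v for v in vec)
    s4 = sum(v ** 4 for v in vec)
    if s4 == 0:
        raise ValueError("vector is identically zero")
    return s2 * s2 / s4


def largest_ratio_gap(eigenvalue_magnitudes):
    """Return (M, ratio) maximizing |lambda_M| / |lambda_{M+1}| over M >= 2.

    Magnitudes are sorted in descending order first, so lambda_1 is the
    trivial eigenvalue. Returns (None, 0.0) if fewer than 3 values are given.
    """
    lam = sorted(eigenvalue_magnitudes, reverse=True)
    best_m, best_ratio = None, 0.0
    for m in range(2, len(lam)):  # m is the 1-based count of retained modes
        if lam[m] <= 0:
            break
        ratio = lam[m - 1] / lam[m]
        if ratio > best_ratio:
            best_m, best_ratio = m, ratio
    return best_m, best_ratio


# Toy spectrum: three slow modes, then a drop into the bulk.
spectrum = [1.0, 0.95, 0.9, 0.3, 0.25]
m_star, gap = largest_ratio_gap(spectrum)

pr_localized = participation_ratio([1.0, 0.0, 0.0, 0.0])
pr_uniform = participation_ratio([0.5, 0.5, 0.5, 0.5])
```

On the toy spectrum the largest gap falls after the third eigenvalue, which under this convention would suggest M = 3. What counts as a "clear" gap is a judgment call that the notebook would have to state explicitly.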

Validated on (a) a stochastically driven Lorenz system with a hidden bistable driver (positive control), (b) C. elegans locomotion (recovers run/pirouette), and (c) D. melanogaster freely-moving flies (recovers four behavioral basins + heavy-tailed residence times).

Why this might be relevant to the circumnutation program

The wavelet substrate already exists in this codebase via PR #5 (temporal_cwt). PR #6 currently extracts a single ridge from the scaleogram and emits scalar interpretable traits (T_nutation_median, A_nutation_envelope_max, band_power_ratio, is_nutating, etc.) anchored to BM2016 + Derr Sept-2025. That's the right level for Tier 1.

The Berman framework asks a categorically different question: instead of "what is the dominant nutation period and amplitude?", it asks "does the root have discrete behavioral states, and if so how many?" — without prespecifying categories. Plausible candidate basins for a root system:

  • active circumnutation (sustained ~3333 s oscillation);
  • gravitropic / tropic deflection (slow drift in growth-axis-aligned direction);
  • transient quiescence (low-amplitude period with weak band-power);
  • recovery (post-perturbation re-entry into oscillation).

The arms-and-hub geometry would surface these empirically, and criterion (i) — the eigenvalue ratio gap — explicitly tells you when the framework does NOT apply, which is itself a falsifiable scientific result. If circumnutation is a single sustained oscillation with no metastable structure, criterion (i) fails and the project has a clean null result.

Where the overlap with PR #6 is and is not

Overlap. Same Morlet wavelet substrate. temporal_cwt.ScaleogramResult is exactly the time-frequency representation the Berman pipeline consumes at step 1.

Not an overlap — explicit non-goal for PR #6. PR #6 collapses the scaleogram to a single ridge and emits scalar traits. The Berman pipeline treats the full scaleogram as state-space coordinates and infers a transition matrix. These are different epistemic acts. PR #6 should ship as planned; this issue is for a future research tier, not a Tier 1 revision.

Honest concerns to validate before committing to this direction

  1. Multivariate input requirement. Kaur et al. apply this to multivariate postural state — C. elegans eigenworm coefficients (5 channels), D. melanogaster joint angles (multiple per individual). A single root tip's (x, y) trajectory is genuinely thin. Likely required mitigations to make this work for circumnutation:

    • derive multiple kinematic channels (lateral position, longitudinal position, instantaneous velocity components, possibly tangent-axis angle ψ_g from BM2016 Eq. 20) and stack them as a vector input;
    • or use multi-plant cross-correlation features as the multivariate signal;
    • or move to multi-tracked-point-along-root (would require a different SLEAP model + skeleton).
  2. Sample size for Markov matrix estimation. Kaur et al. use N=1300 clusters from many long high-rate recordings. The current Nipponbare proofread fixture has 6 tracks at 300 s cadence over a few hours per plate, i.e. ~70–150 samples per track and at most ~900 samples when all 6 tracks are pooled. This is almost certainly insufficient to estimate a stable transfer operator. Likely required: a longer-duration acquisition protocol (e.g., 24h+ at a cadence that stays well inside the Nyquist limit for the nutation period), or pooling across plates within a condition.

  3. Timescale separation may be too narrow. The Berman framework's strength is recovering hidden slow modes invisible to fixed-timescale delay embeddings. In circumnutation, the period of interest (~3333 s) is only ~10× the cadence (300 s) and the growth-axis drift timescale is at most another ~10× slower. Their fly system has orders of magnitude of separation between fast kinematics and behavioral states. The narrower gap here may mean that fixed-timescale delay embedding already works: criterion (i) could still succeed while the multi-timescale operator only marginally improves on a vanilla delay-embedding analysis. This is testable empirically.
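Concerns 2 and 3 can be put into numbers with back-of-envelope arithmetic. The sketch below uses only figures quoted in this issue (6 tracks, 300 s cadence, ~3333 s period, 70–150 samples per track, N = 20–100 micro-states). The `budget` helper is hypothetical, and the "transitions per matrix entry" figure is a crude density measure, not a formal sample-size criterion.

```python
CADENCE_S = 300.0   # sampling interval quoted above
PERIOD_S = 3333.0   # approximate nutation period quoted above
N_TRACKS = 6        # tracks in the current proofread fixture


def budget(samples_per_track, n_states, lag=1):
    """Rough data budget for an n_states x n_states transition matrix."""
    # Each track contributes (samples - lag) transitions at the chosen lag.
    transitions = N_TRACKS * max(samples_per_track - lag, 0)
    return {
        "transitions": transitions,
        "per_state": transitions / n_states,
        "per_matrix_entry": transitions / n_states ** 2,
        "cycles_per_track": samples_per_track * CADENCE_S / PERIOD_S,
    }


samples_per_period = PERIOD_S / CADENCE_S

best_case = budget(samples_per_track=150, n_states=20)
worst_case = budget(samples_per_track=70, n_states=100)

for name, b in (("best", best_case), ("worst", worst_case)):
    print(name, {k: round(v, 3) for k, v in b.items()})
```

Even the best case leaves only a handful of observed transitions per matrix entry, and the worst case leaves far fewer than one, which is the quantitative version of "almost certainly insufficient" above.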

Concrete cheap experiment to scope feasibility (suggested first step)

Before committing to any new tier or production code, run a one-off research notebook on the existing plate-001 Nipponbare proofread fixture:

  1. Stack derived channels (e.g., lateral coordinate from _geometry.project_to_growth_axis_perpendicular, raw tip_x, tip_y, and short-window velocity components) as a multivariate input per track.
  2. Run temporal_cwt.compute_scaleogram per channel (this already exists).
  3. Delay-embed across channels into a higher-dimensional state space, choosing the embedding dimension d by Cao's E1 saturation.
  4. k-means cluster (start small: N = 20–100 micro-states, given the small sample size).
  5. Build the transfer operator at a chosen lag τ.
  6. Check diagnostic criterion (i) first — is there a clear ratio gap in the eigenvalue spectrum after some M ≥ 2? If no, the framework correctly tells you root behavior is not metastable in this sense, and that is itself a publishable null finding (cite Kaur/Jain/Berman 2026 §II.B diagnostic criteria).
  7. If criterion (i) passes, check (ii) participation ratio and (iii) arms-and-hub geometry. If those also pass, scope a real Tier 3 PR.
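For step 3 of this experiment, a delay embedding of a multichannel series can be sketched as follows. This is an illustration under stated assumptions: `delay_embed` is a hypothetical helper, each row stands in for the per-timepoint feature vector (for example wavelet amplitudes stacked across channels and bands), and the embedding dimension is passed in rather than selected by Cao's E1.

```python
def delay_embed(rows, dim, delay=1):
    """Delay-embed a multichannel time series.

    rows  : list of per-timepoint feature vectors (lists of floats)
    dim   : number of delayed copies to concatenate (embedding dimension d)
    delay : spacing between copies, in samples

    Returns a list of vectors, each the concatenation of
    rows[t], rows[t + delay], ..., rows[t + (dim - 1) * delay].
    """
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    span = (dim - 1) * delay
    return [
        [x for k in range(dim) for x in rows[t + k * delay]]
        for t in range(len(rows) - span)
    ]


# Toy series: 5 timepoints, 2 features per timepoint.
rows = [[float(t), 10.0 * t] for t in range(5)]
embedded = delay_embed(rows, dim=3, delay=1)
```

Each embedded vector has `dim` times the original feature count, and the series loses `(dim - 1) * delay` samples off its end. With only ~70–150 samples per track, that loss is worth tracking when choosing d.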

Estimated effort: a research notebook in docs/circumnutation/research/ (under the research:investigate skill's dated-folder convention). 1-2 days of exploratory work. Does NOT need a new OpenSpec change until criterion (i) passes.

Acceptance for THIS issue (the research note, not the production tier)

  • A dated research investigation under docs/circumnutation/research/YYYY-MM-DD_berman_multi_timescale_feasibility/ containing:
    • the notebook with the 4 diagnostic criteria evaluated on plate-001 (and any additional proofread fixtures available at evaluation time);
    • a 1-page summary of whether criterion (i) passes, fails, or is ambiguous;
    • if it passes: a sketch of the arms-and-hub geometry plot, and a scoped follow-up OpenSpec proposal for add-circumnutation-tier3-behavioral-state-discovery (or similar);
    • if it fails: a short results section documenting the null result with the eigenvalue spectrum plot. This closes the issue and informs the program that root nutation does not exhibit metastability in the Berman sense at current data scales.

Out of scope for this issue

  • Any change to PR #6 / Tier 1 trait emission. Tier 1 is biologically interpretable scalar emission anchored to the Derr oracle; this exploration is orthogonal.
  • Any new acquisition protocol. The feasibility check uses existing fixtures only.
  • Any production code in sleap_roots/. Research notebook only. Production code is downstream of a passing criterion (i) and a scoped OpenSpec proposal.
  • G-PCCA implementation. The first feasibility check only needs the eigenvalue spectrum + participation ratio; G-PCCA is only needed once arms-and-hub geometry is being characterized in earnest.

Cross-references

Labels

enhancement, circumnutation, follow-up, research
