[research] Tier-3 candidate: multi-timescale transfer-operator framework for behavioral-state discovery in root circumnutation (Kaur, Jain & Berman, 2026)
Parent epic: #197
Type: research / future tier exploration (no current PR blocker)
Priority: low; exploratory, revisit after Tier 1 + Tier 2 land
Originating discussion: ad-hoc paper review on 2026-06-03
Paper
Kaur R., Jain K., Berman G. J. (2026). Using timescale as a state coordinate reveals the metastable geometry of behavior. bioRxiv preprint, doi:10.64898/2026.05.25.727718 (posted 2026-05-28, Departments of Physics and Biology, Emory University).
What the paper proposes
A pipeline for recovering slow latent behavioral structure from multivariate postural time series when fast and slow processes are tightly intertwined (i.e., the regime where fixed-timescale delay embeddings fail):
1. Morlet wavelet decomposition of each measurement channel into N_f dyadically spaced frequency bands (the same primitive as sleap_roots.circumnutation.temporal_cwt.compute_scaleogram).
2. Delay-embed short windows of the time-frequency representation, so that each window encodes both instantaneous posture and how that posture is being expressed across fast and slow temporal scales.
3. k-means cluster the embedded windows into N micro-states.
4. Estimate a transfer operator (Markov transition matrix) on the cluster sequence at a chosen lag τ.
5. Spectral decomposition of the operator; the leading non-trivial eigenvectors are slow collective modes.
6. Generalized PCCA (G-PCCA) to extract metastable basins in the non-reversible case (since behavioral transitions typically violate detailed balance).
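Steps 4 and 5 can be illustrated with a minimal NumPy sketch. It assumes the clustering step has already produced one integer micro-state label per time point. The function names transition_matrix and sorted_eigenvalues, the toy two-basin chain, and the self-loop treatment of unvisited states are illustrative assumptions; none of them come from the paper or from sleap_roots.

```python
import numpy as np


def transition_matrix(labels, n_states, lag=1):
    """Estimate a row-stochastic transition matrix by counting lagged pairs."""
    labels = np.asarray(labels, dtype=int)
    counts = np.zeros((n_states, n_states))
    # Accumulate one count per observed pair (state at t, state at t + lag).
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1.0)
    # Assumption: states with no outgoing observations become self-loops,
    # so that every row still sums to 1.
    unseen = np.flatnonzero(counts.sum(axis=1) == 0)
    counts[unseen, unseen] = 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def sorted_eigenvalues(P):
    """Eigenvalues of P ordered by decreasing modulus."""
    eigvals = np.linalg.eigvals(P)
    return eigvals[np.argsort(-np.abs(eigvals))]


# Toy chain (assumption): two basins {0, 1} and {2, 3} with rare switching.
P_true = np.array([
    [0.49, 0.49, 0.01, 0.01],
    [0.49, 0.49, 0.01, 0.01],
    [0.01, 0.01, 0.49, 0.49],
    [0.01, 0.01, 0.49, 0.49],
])
rng = np.random.default_rng(0)
labels = [0]
for _ in range(5000):
    labels.append(int(rng.choice(4, p=P_true[labels[-1]])))

P_hat = transition_matrix(labels, n_states=4, lag=1)
eigs = sorted_eigenvalues(P_hat)
print(np.round(np.abs(eigs), 3))
```

In this toy chain the second eigenvalue stays close to 1 while the remaining ones fall toward 0, which is the kind of spectral separation the diagnostic criteria are designed to detect. A real analysis would also need to check sensitivity to the lag τ and to the number of micro-states.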
Key prediction. For a reversible Markov chain with M metastable basins separated from the bulk spectrum by a gap, the spectral theory of metastability predicts an arms-and-hub geometry: cluster centroids in the leading (M-1)-dimensional eigenvector space arrange themselves into M linear arms radiating from a stationary-weighted central hub, one arm per basin. The paper makes this falsifiable via four explicit diagnostic criteria:
(i) clear ratio gap in the leading eigenvalue spectrum after some M ≥ 2;
(ii) leading non-trivial eigenvectors with high participation ratio PR_k = (Σ_i φ_k(i)²)² / Σ_i φ_k(i)⁴ ≫ 1 across clusters;
(iii) simplex-like arms-and-hub geometry in the leading (M-1)-dimensional eigenspace;
(iv) held-out cross-validation: the fitted Markov model beats a memoryless null at the same basin count, plateauing at M.
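Criteria (i) and (ii) reduce to short computations on the operator's spectrum. The sketch below is one possible reading, not the paper's code: the helper names, the argmax heuristic for picking a candidate M, and the use of the eigenvector modulus (so that complex eigenvectors from a non-reversible operator are handled) are all assumptions, and the example spectrum is hypothetical.

```python
import numpy as np


def eigenvalue_ratio_gaps(eigvals):
    """Ratios |lambda_k| / |lambda_(k+1)| for modulus-sorted eigenvalues."""
    mags = np.sort(np.abs(np.asarray(eigvals)))[::-1]
    mags = np.maximum(mags, 1e-12)  # guard against division by zero
    return mags[:-1] / mags[1:]


def participation_ratio(phi):
    """PR = (sum_i phi_i^2)^2 / sum_i phi_i^4 for one eigenvector."""
    phi = np.abs(np.asarray(phi))  # modulus, so complex entries are allowed
    return float(np.sum(phi**2) ** 2 / np.sum(phi**4))


# Hypothetical spectrum: two slow modes followed by a fast bulk.
spectrum = np.array([1.0, 0.96, 0.10, 0.05])
gaps = eigenvalue_ratio_gaps(spectrum)

# gaps[j] is the ratio after the (j + 1)-th eigenvalue. Restricting to M >= 2
# means skipping gaps[0]; taking the largest remaining gap is a heuristic.
candidate_M = int(np.argmax(gaps[1:])) + 2

print(gaps)
print(candidate_M)
print(participation_ratio(np.ones(50)))    # weight spread evenly over 50 clusters
print(participation_ratio(np.eye(50)[0]))  # weight concentrated on one cluster
```

The two participation-ratio calls bracket the possible range: a vector spread evenly over n clusters gives PR = n, and a vector concentrated on one cluster gives PR = 1. Criterion (ii) asks for values well above 1.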
Validated on (a) a stochastically driven Lorenz system with a hidden bistable driver (positive control), (b) C. elegans locomotion (recovers run/pirouette), and (c) D. melanogaster freely-moving flies (recovers four behavioral basins + heavy-tailed residence times).
Why this might be relevant to the circumnutation program
The wavelet substrate already exists in this codebase via PR #5 (temporal_cwt). PR #6 currently extracts a single ridge from the scaleogram and emits scalar interpretable traits (T_nutation_median, A_nutation_envelope_max, band_power_ratio, is_nutating, etc.) anchored to BM2016 + Derr Sept-2025. That's the right level for Tier 1.
The Berman framework asks a categorically different question: instead of "what is the dominant nutation period and amplitude?", it asks "does the root have discrete behavioral states, and if so how many?" — without prespecifying categories. Plausible candidate basins for a root system:
active circumnutation (sustained ~3333 s oscillation);
gravitropic / tropic deflection (slow drift in growth-axis-aligned direction);
transient quiescence (low-amplitude period with weak band-power);
recovery (post-perturbation re-entry into oscillation).
The arms-and-hub geometry would surface these empirically, and criterion (i) — the eigenvalue ratio gap — explicitly tells you when the framework does NOT apply, which is itself a falsifiable scientific result. If circumnutation is a single sustained oscillation with no metastable structure, criterion (i) fails and the project has a clean null result.
Where the overlap with PR #6 is and is not
Overlap. Same Morlet wavelet substrate. temporal_cwt.ScaleogramResult is exactly the time-frequency representation the Berman pipeline consumes at step 1.
Not an overlap (explicit non-goal for PR #6). PR #6 collapses the scaleogram to a single ridge and emits scalar traits. The Berman pipeline treats the full scaleogram as state-space coordinates and infers a transition matrix. These are different epistemic acts. PR #6 should ship as planned; this issue is for a future research tier, not a Tier 1 revision.
Honest concerns to validate before committing to this direction
Multivariate input requirement. Kaur et al. apply this to multivariate postural state — C. elegans eigenworm coefficients (5 channels), D. melanogaster joint angles (multiple per individual). A single root tip's (x, y) trajectory is genuinely thin. Likely required mitigations to make this work for circumnutation:
derive multiple kinematic channels (lateral position, longitudinal position, instantaneous velocity components, possibly tangent-axis angle ψ_g from BM2016 Eq. 20) and stack them as a vector input;
or use multi-plant cross-correlation features as the multivariate signal;
or move to multi-tracked-point-along-root (would require a different SLEAP model + skeleton).
Sample size for Markov matrix estimation. Kaur et al. use N=1300 clusters from many long high-rate recordings. The current Nipponbare proofread fixture has 6 tracks at 300 s cadence over a few hours per plate, which is ~70–150 samples per track. This is almost certainly insufficient to estimate a stable transfer operator. Likely required: a longer-duration acquisition protocol (e.g., 24h+ at a cadence that keeps the nutation band well below the Nyquist frequency), or pooling across plates within a condition.
Timescale separation may be too narrow. The Berman framework's strength is recovering hidden slow modes invisible to fixed-timescale delay embeddings. In circumnutation, the period of interest (~3333 s) is only ~10× the cadence (300 s) and the growth-axis drift timescale is at most another ~10× slower. Their fly system has orders of magnitude of separation between fast kinematics and behavioral states. The narrower gap here may mean fixed-timescale delay embedding already works (i.e., criterion (i) does succeed but the multi-timescale operator only marginally improves on a vanilla delay-embedding analysis — testable empirically).
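The sample-size and timescale concerns can be made concrete with back-of-envelope arithmetic. The sketch below uses only figures quoted above (6 tracks, 70–150 samples per track, 300 s cadence, ~3333 s period, N = 1300 in the paper). Pooling all six tracks into one count matrix and using lag 1 are assumptions made for illustration, as are the helper names.

```python
def pooled_transitions(n_tracks, samples_per_track, lag=1):
    """Number of lagged (t, t + lag) pairs when tracks are pooled."""
    return n_tracks * max(samples_per_track - lag, 0)


def mean_counts_per_entry(n_transitions, n_states):
    """Average observed transitions per cell of an N x N count matrix."""
    return n_transitions / n_states**2


N_TRACKS = 6        # tracks in the proofread fixture
CADENCE_S = 300.0   # sampling interval in seconds
PERIOD_S = 3333.0   # approximate nutation period in seconds

for samples in (70, 150):
    n_trans = pooled_transitions(N_TRACKS, samples)
    for n_states in (20, 100, 1300):
        avg = mean_counts_per_entry(n_trans, n_states)
        print(f"{samples} samples/track, N={n_states}: "
              f"{n_trans} transitions, {avg:.4f} per matrix entry")

print(f"samples per nutation period: {PERIOD_S / CADENCE_S:.1f}")
```

Even at N = 20 and the upper estimate of 150 samples per track, pooling yields fewer than three observed transitions per matrix entry on average, which is why the feasibility check starts with a small number of micro-states.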
Concrete cheap experiment to scope feasibility (suggested first step)
Before committing to any new tier or production code, run a one-off research notebook on the existing plate-001 Nipponbare proofread fixture:
1. Stack derived channels (e.g., lateral coordinate from _geometry.project_to_growth_axis_perpendicular, raw tip_x, tip_y, and short-window velocity components) as a multivariate input per track.
2. Run temporal_cwt.compute_scaleogram per channel (this already exists).
3. Delay-embed across channels into a higher-dimensional state space, choosing the embedding dimension d by Cao's E1 saturation.
4. k-means cluster (start small: N = 20–100 micro-states, given the small sample size).
5. Build the transfer operator at a chosen lag τ.
6. Check diagnostic criterion (i) first: is there a clear ratio gap in the eigenvalue spectrum after some M ≥ 2? If not, the framework correctly tells you root behavior is not metastable in this sense, and that is itself a publishable null finding (cite Kaur/Jain/Berman 2026 §II.B diagnostic criteria).
7. If criterion (i) passes, check (ii) participation ratio and (iii) arms-and-hub geometry. If those also pass, scope a real Tier 3 PR.
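The channel-stacking and delay-embedding steps above might look like the following sketch. It is a stand-in under stated assumptions: the helper names are hypothetical, central-difference velocities via np.gradient substitute for the "short-window velocity components", the track is synthetic, and the per-channel scaleogram step is skipped because the signature of temporal_cwt.compute_scaleogram is not given here. In the actual experiment, the scaleogram output rather than the raw channels would be passed to the embedding.

```python
import numpy as np


def stack_kinematic_channels(tip_x, tip_y, dt):
    """Return a (T, 4) array of position and velocity channels."""
    x = np.asarray(tip_x, dtype=float)
    y = np.asarray(tip_y, dtype=float)
    vx = np.gradient(x, dt)  # central differences with sample spacing dt
    vy = np.gradient(y, dt)
    return np.column_stack([x, y, vx, vy])


def delay_embed(features, dim, stride=1):
    """Map (T, C) features to (T - (dim - 1) * stride, dim * C) delay vectors."""
    features = np.asarray(features, dtype=float)
    n_rows = features.shape[0] - (dim - 1) * stride
    if n_rows <= 0:
        raise ValueError("time series too short for this embedding")
    # Block k holds the features shifted forward by k * stride samples.
    lagged = [features[k * stride : k * stride + n_rows] for k in range(dim)]
    return np.hstack(lagged)


# Synthetic stand-in for one track: 100 frames at 300 s cadence, ~3333 s period.
dt = 300.0
t = np.arange(100) * dt
tip_x = np.sin(2 * np.pi * t / 3333.0)
tip_y = np.cos(2 * np.pi * t / 3333.0)

channels = stack_kinematic_channels(tip_x, tip_y, dt)
embedded = delay_embed(channels, dim=5)  # dim=5 is arbitrary, not a Cao E1 result
print(channels.shape, embedded.shape)
```

Each row of the embedded array is one point in the state space that k-means would then cluster. The embedding dimension here is a placeholder and would be replaced by whatever Cao's E1 criterion suggests on the real data.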
Estimated effort: a research notebook in docs/circumnutation/research/ (under the research:investigate skill's dated-folder convention). 1-2 days of exploratory work. Does NOT need a new OpenSpec change until criterion (i) passes.
Acceptance for THIS issue (the research note, not the production tier)
A dated research investigation under docs/circumnutation/research/YYYY-MM-DD_berman_multi_timescale_feasibility/ containing:
the notebook with the 4 diagnostic criteria evaluated on plate-001 (and any additional proofread fixtures available at evaluation time);
a 1-page summary of whether criterion (i) passes, fails, or is ambiguous;
if passes: a sketch of the arms-and-hub geometry plot, and a scoped follow-up OpenSpec proposal for add-circumnutation-tier3-behavioral-state-discovery (or similar);
if fails: a short results section documenting the null result with the eigenvalue spectrum plot — closes this issue and informs the program that root nutation does not exhibit metastability in the Berman sense at current data scales.
Out of scope for this issue
Any change to PR #6 / Tier 1 trait emission. Tier 1 is biologically interpretable scalar emission anchored to the Derr oracle; this exploration is orthogonal.
Any new acquisition protocol. The feasibility check uses existing fixtures only.
Any production code in sleap_roots/. Research notebook only. Production code is downstream of a passing criterion (i) and a scoped OpenSpec proposal.
G-PCCA implementation. The first feasibility check only needs the eigenvalue spectrum + participation ratio; G-PCCA is only needed once arms-and-hub geometry is being characterized in earnest.
Theoretical foundation citation: Kaur R., Jain K., Berman G. J. (2026). doi:10.64898/2026.05.25.727718
Related prior work cited in the Berman paper: G-PCCA (Reuter et al.), Cao's E1 embedding dimension criterion (Cao 1997), Berman et al. behavioral mapping (refs [12], [17], [18] in the paper).
Cross-references
add-circumnutation-tier1-derr-faithful (current branch).
sleap_roots/circumnutation/temporal_cwt.py (from PR #5 / #213, "feat(circumnutation): temporal CWT machinery (#212)").
Labels
enhancement, circumnutation, follow-up, research