[follow-up] Upgrade period_residual_vs_derr_reference to spectral-shape distance when Derr provides raw scaleogram numerics #217

@eberrigan

Parent epic: #197
Originating PR: #216 / add-circumnutation-tier1-derr-faithful
Type: enhancement / algorithm-upgrade (no code-correctness blocker)
Priority: medium — defer until Derr provides raw input + scaleogram arrays

Summary

PR #6's period_residual_vs_derr_reference trait is computed as (T_nutation_median - DERR_EXPECTED_PERIOD_S) / DERR_EXPECTED_PERIOD_S (fractional period residual against a single rice-anchored constant). This is the cleanest available algorithm given that Derr Sept-2025 outputs in the vault are PDF/PNG only — no raw scaleogram arrays or input signal arrays are available for a richer numerical comparison.
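A minimal sketch of the current scalar computation, for reference. The function name and the 3333 s value are illustrative assumptions (the value is borrowed from the average-period PDF filename cited under "Required external input"); the real constant lives in `_constants.py`.

```python
import numpy as np

# Assumed value for illustration only; the real constant is defined in _constants.py.
DERR_EXPECTED_PERIOD_S = 3333.0


def fractional_period_residual(periods_s, expected_period_s=DERR_EXPECTED_PERIOD_S):
    """Signed fractional deviation of the median nutation period from the reference."""
    t_nutation_median = float(np.median(np.asarray(periods_s, dtype=float)))
    return (t_nutation_median - expected_period_s) / expected_period_s


# Hypothetical per-cycle period estimates for one track, in seconds.
residual = fractional_period_residual([3300.0, 3400.0, 3500.0])
print(round(residual, 4))  # 0.0201
```

The result is signed: a positive value means the recovered median period is longer than the reference.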

Elizabeth confirmed during PR #6 brainstorming that Derr can be asked for more data when needed. Once Derr provides the underlying numerics, the trait algorithm can be upgraded to a richer spectral-shape distance without changing the trait CSV schema (period_residual_vs_derr_reference remains a single float column; only the algorithm behind it gets richer).

Why

The current single-constant residual algorithm captures only the DOMINANT spectral peak deviation. A richer forensic-match would:

  • Compare the FULL spectral shape (L2 distance between normalized amplitude spectra, restricted to the in-band region).
  • Account for second/third harmonic content (BM2016 §5 predicts harmonic structure in the nutation signal that the current scalar residual ignores).
  • Provide bidirectional confirmation: not just "did we recover Derr's peak period?" but "does our spectrum agree with Derr's across the full frequency range?"

This is purely an upgrade — the current rice-anchored constant is defensible for plate-001 testing today; the spec scenario for "Layer-2 Derr forensic-match acceptance" doesn't lock the internal algorithm.
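To illustrate the gap, here is a small synthetic sketch (the frequency grid, spectra, and helper name are all hypothetical, not project data). Two spectra share the same dominant peak, so a peak-period comparison sees no difference, while the L2 distance between their unit-normalized shapes is clearly nonzero because one carries second-harmonic content.

```python
import numpy as np

# Hypothetical in-band frequency grid: 1e-4 Hz to 1e-3 Hz in ten steps.
freqs_hz = np.linspace(1e-4, 1e-3, 10)


def unit(v):
    """Scale a vector to unit L2 norm."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)


# Spectrum A: a single peak near 3e-4 Hz (index 2).
spec_a = np.zeros(10)
spec_a[2] = 1.0

# Spectrum B: the same dominant peak plus a second harmonic near 6e-4 Hz (index 5).
spec_b = spec_a.copy()
spec_b[5] = 0.5

# Both spectra yield the same dominant period, so the scalar view cannot tell them apart.
period_a = 1.0 / freqs_hz[np.argmax(spec_a)]
period_b = 1.0 / freqs_hz[np.argmax(spec_b)]
peak_period_difference = period_a - period_b

# The shape distance does register the harmonic.
shape_distance = float(np.linalg.norm(unit(spec_a) - unit(spec_b)))
print(peak_period_difference, shape_distance)
```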

Required external input

Three asks for Julien Derr (derrjulien@gmail.com or via Talmo Lab introductions):

  1. The raw input signal array that produced the Sept-2025 PDF (5minutes_sample_data.pdf is the PDF rendering; the underlying NumPy/MATLAB array is what we need).
  2. His published CWT scaleogram numerics (the array that rendered _5minutes_wavelets.png).
  3. Optionally: the FFT amplitude spectrum he derived (currently visualized as 5minutes_average_period=3333s.pdf peak at f ≈ 0.0003 Hz).

If items 1+2 are available, we can compute the L2 spectral-shape distance directly. Item 3 is a sanity-check companion.
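As a rough sketch of what item 1 would enable, the snippet below derives an FFT amplitude spectrum from a raw signal and locates its dominant peak. It runs on a synthetic sinusoid, not Derr's data. The 300 s sample interval is an assumption inferred from the "5minutes" filenames, and the function name is hypothetical.

```python
import numpy as np

SAMPLE_INTERVAL_S = 300.0  # assumption: 5-minute sampling, inferred from filenames
TRUE_PERIOD_S = 3333.0     # synthetic test period matching the quoted average period


def amplitude_spectrum(x, dt_s):
    """One-sided FFT amplitude spectrum of a mean-removed, evenly sampled signal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    amp = np.abs(np.fft.rfft(x)) / x.size
    freqs_hz = np.fft.rfftfreq(x.size, d=dt_s)
    return freqs_hz, amp


# Synthetic stand-in for the raw input array.
t_s = np.arange(2000) * SAMPLE_INTERVAL_S
signal = np.sin(2.0 * np.pi * t_s / TRUE_PERIOD_S)

freqs_hz, amp = amplitude_spectrum(signal, SAMPLE_INTERVAL_S)
peak_hz = float(freqs_hz[np.argmax(amp)])
print(peak_hz, 1.0 / peak_hz)
```

A period of 3333 s corresponds to 1/3333 ≈ 0.0003 Hz, which matches the peak location quoted in item 3.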

In scope (when this issue is acted on)

  1. Receive Derr's raw input + scaleogram + optionally his FFT amplitude spectrum (Email Talmo Lab → Julien Derr).
  2. Add a new helper _io.load_derr_reference() or similar that materializes the reference arrays into the test fixtures directory (or vault, if licensing prohibits public commitment).
  3. Update nutation._compute_one_track's step 8 (derived traits) to compute period_residual_vs_derr_reference as an L2 spectral-shape distance instead of the scalar period residual. Likely structure:
    # New: unit-normalized in-band amplitude vector vs Derr's unit-normalized in-band vector.
    # Assumes DERR_REFERENCE_SPECTRUM is sampled on the same frequency grid as `spectrum`.
    our_band = spectrum[in_band_mask]
    derr_band = DERR_REFERENCE_SPECTRUM[in_band_mask]
    our_in_band = our_band / np.linalg.norm(our_band)
    derr_in_band = derr_band / np.linalg.norm(derr_band)
    period_residual_vs_derr_reference = float(np.linalg.norm(our_in_band - derr_in_band))
  4. Update the §2.H.3 Layer-2 acceptance threshold to reflect the new algorithm's scale (currently 0.02-0.05 for the fractional-period residual; L2 spectral distance has a different scale).
  5. The trait CSV column does not change: period_residual_vs_derr_reference stays a single float, and the spec scenario doesn't change. Only the algorithm and the test tolerance shift.
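A slightly fuller sketch of step 3, under the assumption that Derr's spectrum may arrive on a different frequency grid than ours. The function name, its arguments, and the choice of linear interpolation are hypothetical, not the project's API. It also hints at the rescaling needed in step 4: for non-negative, unit-normalized amplitude vectors the distance lies in [0, √2], since ‖a − b‖² = 2 − 2·(a·b) and 0 ≤ a·b ≤ 1.

```python
import numpy as np


def spectral_shape_distance(our_freqs_hz, our_amp, ref_freqs_hz, ref_amp, band_hz):
    """L2 distance between unit-normalized in-band amplitude spectra.

    The reference is linearly interpolated onto our in-band frequencies;
    ref_freqs_hz must be strictly increasing.
    """
    our_freqs_hz = np.asarray(our_freqs_hz, dtype=float)
    our_amp = np.asarray(our_amp, dtype=float)
    lo_hz, hi_hz = band_hz
    in_band_mask = (our_freqs_hz >= lo_hz) & (our_freqs_hz <= hi_hz)
    if not in_band_mask.any():
        raise ValueError("no frequency samples fall inside the band")

    ours = our_amp[in_band_mask]
    ref = np.interp(our_freqs_hz[in_band_mask], ref_freqs_hz, ref_amp)

    our_norm = np.linalg.norm(ours)
    ref_norm = np.linalg.norm(ref)
    if our_norm == 0.0 or ref_norm == 0.0:
        raise ValueError("in-band spectrum has zero energy")
    return float(np.linalg.norm(ours / our_norm - ref / ref_norm))


# Synthetic check on a shared grid: identical shapes vs. non-overlapping peaks.
grid_hz = np.linspace(1e-4, 1e-3, 10)
peak_low = np.zeros(10)
peak_low[2] = 1.0
peak_high = np.zeros(10)
peak_high[7] = 1.0

d_same = spectral_shape_distance(grid_hz, peak_low, grid_hz, 3.0 * peak_low, (1e-4, 1e-3))
d_disjoint = spectral_shape_distance(grid_hz, peak_low, grid_hz, peak_high, (1e-4, 1e-3))
print(d_same, d_disjoint)
```

Because each vector is normalized before differencing, overall amplitude scale drops out and only spectral shape contributes, so a new tolerance would be chosen on the [0, √2] scale rather than the fractional-period scale.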

Out of scope

  • Replacing period_residual_vs_derr_reference with a different trait name (the column is API-stable; consumers should not need to re-write joins).
  • Multi-plate Derr reference (the upgrade is for the SINGLE plate-001 forensic match; multi-plate generalization is a separate concern).

Acceptance

  • A short markdown report at docs/circumnutation/derr_layer_2_upgrade_2026-MM-DD.md documenting the new algorithm, the source files Derr provided, and the new test tolerance.
  • Updated nutation._compute_one_track step 8 + updated _constants.py (likely a new DERR_REFERENCE_SPECTRUM_PATH or fixture-path constant + possibly a new DERR_MATCH_TOLERANCE constant).
  • §2.H.3 test updated with the new tolerance + provenance comment.
  • Follow-up OpenSpec change scaffolded under add-circumnutation-derr-layer-2-spectral-shape-upgrade or similar.

Cross-references

  • Originating PR: feat(circumnutation): Tier 1 Derr-faithful trait emission (#215) #216
  • design.md GREEN-phase Reconciliation Appendix in openspec/changes/add-circumnutation-tier1-derr-faithful/design.md (where the §2.H.3 tolerance was softened from CC-7 ±2% to median ±25% pending this upgrade)
  • Q6 brainstorming decision in the PR #6 sub-issue
  • Theory: docs/circumnutation/theory.md §7.2 ("derr_match_residual" — note: trait renamed to period_residual_vs_derr_reference per S3 round-1 reconciliation; theory.md will be updated in a future doc-only PR)

Labels

enhancement, circumnutation, follow-up, medium-priority
