Commit 3ca46e3

JohnCCarter and claude committed
docs(research): main-quest reset + north-star guardrail (stop the mechanics drift)
Re-anchor the selection-learning line to its original goal: learn how the human selects meaningful fib legs/ranges and draws Fib like a human analyst (facit = ground truth) -- NOT explaining detector/snapping/measurement geometry.

- What directly helps the main quest: the human's leg choice is partly learnable on 4H (cleaner legs, lift +0.052), live-available, not a detection problem (recall ~0.90), lives in the leg/range gestalt -- but agreement is LOW (AP ~0.057 vs ~0.83) and thin (one feature), so the model does not yet draw like the human.
- Control/mechanics (prominence-family, k-sweep, W-gap, artifact-probe, mechanics + flip) added rigor, not capability -- the mechanics/flip work was the drift.
- PARK: artifact/snapping/net-path mechanics, matched-null / detector-independent universe (gate not met), further detector-geometry; exclusivity only if it improves selection.
- Next step IF it serves the goal: enrich the selection model toward the human's meaningful-leg criteria and measure facit-agreement, behind a blind lock. Else PARK modeling and return to the human BTC top-down labeling main quest.

Binding north-star guardrail added to handoff Current Focus + a dedicated reset doc: every future selection-learning step must first answer "does this improve human-like leg/range selection vs the facit?" -- if no, do not start it.

Docs-only, no claim, no code/run, no matched-null, no new universe, no Genesis/1H/ETH/refresh.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 8a92844 commit 3ca46e3

2 files changed

Lines changed: 85 additions & 0 deletions

docs/research_wiki/handoff.md

Lines changed: 14 additions & 0 deletions
@@ -5,6 +5,13 @@ append-only trail lives in [log.md](log.md).

## Current Focus

+> **NORTH STAR (binding — no drift):** the selection-learning line exists to *learn how the human
+> selects meaningful fib legs/ranges and draws Fib like a human analyst (facit = ground truth)*, **not**
+> to explain detector/snapping/measurement geometry. Every selection-learning step must first answer
+> *"does this improve the model's ability to select human-like legs/ranges vs the facit?"* — if no,
+> don't start it; park it. Controls + mechanics are **DONE/PARKED**
+> ([main-quest reset](reviews/btc-fib-selection-learning-main-quest-reset-20260624.md)).
+
**BTC monthly-first top-down protocol** — re-labeling on BTC/USD only after the
**2026-06-09 log-scale + profile reset** (prior linear / 0.236 labels archived).

@@ -22,6 +29,13 @@ append-only trail lives in [log.md](log.md).

## Recent Changes

+- **2026-06-24 Fib SELECTION-LEARNING — MAIN-QUEST RESET (docs-only).** Stop the mechanics drift,
+re-anchor to the north star (above). Controls/mechanics (artifact-probe, snapping/net-path mechanics,
+flip) are **DONE; matched-null / detector-geometry side-quests PARKED.** Next step only if it improves
+the model's human-like leg/range selection vs facit (behind a blind lock); else park the modeling and
+return to the human BTC top-down labeling main quest.
+[Main-quest reset](reviews/btc-fib-selection-learning-main-quest-reset-20260624.md).
+
- **2026-06-22 Fib SELECTION-LEARNING W-gap study — BUILT + module split, RUN PENDING (home).** Commit 2 of side-quest #1, built to the [W-gap LOCK](reviews/btc-fib-selection-learning-w-gap-lock-20260622.md) (`4f47d8e`): `gap(k)=AP(retro-W)−AP(live-k)` on identical rows, embargo=W, L5 verdict. New `research/selection_learning_gap.py` (+5 tests); W-gap code split out to keep `selection_learning.py` under the §6 size cap (was 995 lines); flushed-stderr `_progress` logging in `build_candidates`+`build_retro_features` so a long run is never blind (result-neutral). **Run NOT executed** — inherent ~2-3h per-endpoint-detect cost on the ~20k-bar 4h frame (leakage-bearing truncation, no legal shortcut); to run at home (see Next tracks). No gap results, no verdict. Commit `884d4c0`, gates green (pytest 549, cov 75%).
- **2026-06-18 Fib SELECTION-LEARNING k-sweep {0,3,6,12} (4h) → `k_stable_live_selection_signal`.** Mandatory confirmation-buffer sweep (live-only), locked prominence-FAMILY survival rule (powered AND CI excludes 0 vs **every** §6 baseline — magnitude + prominence A/B). **k=0 degenerate** (0 candidates, reachable 0.0, unpowered — *not interpretable*, excluded); **k=3/6/12 all powered and survive** the locked family (`p_one_sided lift≤0 = 0/2000` throughout; lowest CI floor k=12 vs prom-sum 0.025). ≥2 survivors → cross-k verdict **`k_stable_live_selection_signal`**: the lead is **not** a narrow-buffer artifact. **Modest framing holds:** `cleanliness` still dominates (~0.20) at every powered k; at k=12 `scale_confluence` enters at ~0.13 only as a **secondary hint** (causally available there), not a second pillar; AP rises only 0.057→0.066, far under the 0.83 ceiling — **still single-feature, NOT a reproduction, no edge/behaviour/backtest/Genesis claim**; 1M/1w/1d **underpowered, not refuted**. Code+tests `ea6c2ea` (gates green). [Results](reviews/btc-fib-selection-learning-results-20260618.md).
- **2026-06-18 Fib SELECTION-LEARNING prominence-baseline sensitivity (4h) → `survives_prominence_family`.** Locked pre-run (A=summed endpoint prominence = `prominence` feature col; B=max endpoint prominence) + locked verdict rule. Same universe/viewport/k/ε/split/model — only baseline rule differs. Model AP-lift robust vs **all three** §6 baselines: magnitude [0.023,0.120], prominence-A +0.043 [0.018,0.104], prominence-B +0.049 [0.021,0.116]; every CI excludes 0, 0/2000 ≤ 0. Sanity: prominence baselines beat magnitude (as expected); model beats both. Weights unchanged → **`cleanliness` still carries the lift** (0.20). So the lead is **not** a magnitude- or prominence-artifact — but still single-feature, low absolute AP (0.057 vs 0.83 ceiling), **not a reproduction**, no edge claim; 1M/1w/1d underpowered. Open: is `cleanliness` a detection/anchoring artifact? [Results](reviews/btc-fib-selection-learning-results-20260618.md).
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+# BTC Fib Selection-Learning — MAIN-QUEST RESET / north-star guardrail (2026-06-24)
+
+**Lean Fib Research. Docs-only, no code/run/claim.** A deliberate stop to the mechanics drift and a
+re-anchor to the original goal. Binding for the whole selection-learning line.
+
+> **NORTH STAR (Chamoun's original idea — binding):** *Get the machine to learn how the human selects
+> meaningful fib legs/ranges and draws Fib like a human analyst, using the human facit as ground
+> truth.* **NOT** explaining detector/snapping/measurement geometry detail.
+
+## 1. What we now know that DIRECTLY helps the main quest
+
+- **The human's leg choice is partly learnable (4H).** Human-marked legs are measurably **cleaner /
+more efficient**, and a model out-ranks the trivial baselines out-of-sample (Stage-2 lift **+0.052**,
+CI excludes 0). → there **is** a real learnable selection signal.
+- **It is live-available** (`no_causal_gap`) — a human-like selector would not need hindsight.
+- **It is not a detection problem** (Stage-1 recall ~0.90): the human's anchors are already in the
+candidate universe; the gap is in **ranking/selecting among candidates** — exactly what a model can
+improve.
+- **The signal lives in the leg/range gestalt**, not the lone pivot (Stage-1 null) — a human-like
+selector must model **legs/ranges**, not individual pivots.
+- **But agreement is LOW and thin:** AP ~0.057 vs the ~0.83 reachability ceiling, carried almost
+entirely by **one feature** (cleanliness). The model does **not** yet draw like the human — it needs
+a **richer representation of "meaningful."** *(This is the gap that defines the real next step.)*
+
+## 2. What was only control / mechanics (rigor, not new capability)
+
+- Prominence-family sensitivity + k-sweep (robustness), W-gap (hindsight control), the cleanliness
+artifact-probe (is the lead a detector artifact?), and the mechanics + snapping-flip notes
+(detector/snapping geometry). **All were necessary rigor or interesting mechanism — none added model
+capability to pick better legs.** The mechanics/flip work is precisely the drift this reset stops.
+
+## 3. Sidetracks to PARK now
+
+- **Artifact / snapping / net-path mechanics** — PARK (questions answered descriptively; does not help
+the model pick human-like legs).
+- **Matched-null / detector-independent universe** — PARK (gated, its A8 gate was **not** met, high
+methodological risk; an artifact-question tool, not a capability-builder).
+- **Set-level `exclusivity`** loose end — revisit **only** if it demonstrably improves leg selection.
+- **Further detector-geometry explanation** — PARK.
+
+## 4. Next step IF the goal is better human-like leg/range selection
+
+- The **only** directly-aligned move: **enrich the selection model toward the human's actual "meaningful
+leg/range" criteria and measure agreement against the facit** (AP toward the 0.83 ceiling). Concretely
+— go beyond "cleanest leg" to the multi-component gestalt the prereg already named (scale, pairing,
+direction, exclusivity, HTF/context) and test whether **facit-agreement rises** — behind a **blind
+design lock** (forking-paths discipline; same two-commit gate as every prior step).
+- **Precondition:** a concrete feature/representation hypothesis that *plausibly* raises agreement, AND
+enough facit to fit/validate without overfitting (BTC-only, **365** 4h legs, one analyst, ~0.83
+ceiling). If that precondition can't be met honestly, see §5.
+
+## 5. If the next step does NOT directly help the model pick better → stop/park
+
+- If we **cannot** specify a richer-feature hypothesis that plausibly raises facit-agreement without
+forking-paths, **PARK the modeling line** and return to the **actual main quest**: the human BTC
+top-down fib labeling (`1M → 1w → 1d → 4h`,
+[protocol](../../BTC_FIRST_TOP_DOWN_FIB_PROTOCOL.md)) — which **is** "draw Fib like a human" and grows
+the ground-truth corpus the model would learn from. Modeling resumes only with **more labels** or a
+**concrete capability hypothesis**, never as another control/mechanics pass.
+
+## North-star guardrail (BINDING — no drift)
+
+Every future selection-learning step must answer one question first:
+
+> **"Does this improve the model's ability to select human-like fib legs/ranges, measured against the
+> facit?"**
+
+If the honest answer is **no** — it is a control, a mechanism explanation, or an artifact-geometry
+detail — **do not start it; log it as parked.** Controls and mechanics are **done**. The line either
+**advances selection capability** (§4, behind a blind lock) or it **pauses** (§5). No more
+detector/snapping/measurement-geometry side-quests.
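
The facit-agreement measurement that §4 and the guardrail rely on (AP of the model's ranking vs a baseline ranking, with a CI on the lift) can be sketched as below. This is a minimal illustration under stated assumptions, not the code in `research/selection_learning.py`: the function names `average_precision` and `bootstrap_ap_lift` are hypothetical, the row-level percentile bootstrap is an assumed resampling scheme, and the default of 2000 resamples only mirrors the `0/2000` counts quoted in the handoff notes.

```python
import random
from typing import List, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AP = mean of precision@rank taken at each rank where a positive occurs.

    labels: 1 if the candidate leg matches the facit, else 0 (assumed encoding).
    Ties in score keep their input order (Python's sort is stable).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def bootstrap_ap_lift(
    labels: Sequence[int],
    model_scores: Sequence[float],
    baseline_scores: Sequence[float],
    n_boot: int = 2000,
    seed: int = 0,
) -> Tuple[float, Tuple[float, float], float]:
    """Return (point lift, 95% percentile CI, share of resamples with lift <= 0).

    Model and baseline are scored on identical resampled rows, so the lift
    is a paired comparison. Resamples without any positive are skipped.
    """
    rng = random.Random(seed)
    n = len(labels)
    point = average_precision(labels, model_scores) - average_precision(
        labels, baseline_scores
    )
    lifts: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if not any(y):
            continue  # AP is undefined without positives
        m = [model_scores[i] for i in idx]
        b = [baseline_scores[i] for i in idx]
        lifts.append(average_precision(y, m) - average_precision(y, b))
    lifts.sort()
    lo = lifts[int(0.025 * len(lifts))]
    hi = lifts[min(len(lifts) - 1, int(0.975 * len(lifts)))]
    share_le_zero = sum(1 for v in lifts if v <= 0) / len(lifts)
    return point, (lo, hi), share_le_zero
```

Under this sketch, a candidate step would count as improving facit-agreement only if the lift CI excludes 0 on held-out rows. Which baselines to compare against and how to split the data are the repository's locked decisions and are not reproduced here.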
