
Commit b533385

JohnCCarter and claude committed
docs(research): lock cleanliness artifact-probe (cheap-first track B, blind Commit-1)
Blind lock for the open campaign crux: is the Stage-2 cleanliness lead a genuine human leg-selection signal or a detector/anchoring artifact? Cheap-first scope of track B (after a design-only feasibility check), tested on EXISTING facit data, NO new candidate universe.

Locked blind: cleanliness formula (source-bound, _cleanliness), reached/unreached = Stage-2 eps-reconstruction over ALL 365 4h legs (unreached are the signal, not filtered), exact-vs-snapped paired contrast (index-span only; no imputation), quarter-block bootstrap (detector-free, seed 20260618), power floor, verdict rules (surfacing + snapping -> artifact_risk_reduced vs detector_artifact_supported), and the matched-null gate-rule (NOT built; gated optional rung behind its own separate blind lock).

Docs-only: no code, no run, no build, no matched-null, no push. Diagnostic, not a headline; artifact_risk_reduced != "cleanliness proven human intuition"; no reproduction/edge/behaviour/Genesis claim. Commit 2 needs a separate GO.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 9c40e64 commit b533385

2 files changed

Lines changed: 195 additions & 0 deletions


docs/research_wiki/handoff.md

Lines changed: 14 additions & 0 deletions
@@ -133,6 +133,20 @@ legs/ranges* (labels = facit; **no edge/behaviour/backtest/PnL/Genesis/auto-fib
133133
single open CRUX (is `cleanliness` a genuine signal or a **detector/anchoring artifact**?). Frames
134134
the next-step choice A (exclusivity / artifact diagnostic) / B (detector-independent anchor-probe) /
135135
C (pause + theory) — **none started**. [Checkpoint](reviews/btc-fib-selection-learning-checkpoint-20260624.md).
136+
- **2026-06-24 Fib SELECTION-LEARNING `cleanliness` artifact-probe — LOCKED (Commit 1, docs-only),
137+
RUN PENDING separate GO.** Cheap-first scope of track B (chosen after a design-only feasibility
138+
check): tests the open crux — is the Stage-2 `cleanliness` lead genuine or a detector/anchoring
139+
artifact — on **existing facit data, no new candidate universe**. Two contrasts, blind-locked in the
140+
[artifact LOCK](reviews/btc-fib-selection-learning-artifact-lock-20260624.md): (1) **surfacing** =
141+
reached-vs-unreached human-leg cleanliness (ALL 365 4h legs, exact anchors, Stage-2 ε-reconstruction
142+
split — unreached are the signal, not filtered); (2) **snapping** = exact-vs-snapped paired contrast.
143+
Quarter-block bootstrap, verdict {`detector_surfacing_artifact` / `no_surfacing_artifact` /
144+
`snapping_inflates_cleanliness` / `no_snapping_inflation` → `artifact_risk_reduced` vs
145+
`detector_artifact_supported`}. **Matched-null / new universe NOT built** — gated optional rung,
146+
only if surfacing artifact is found AND needs quantifying, and then only behind its own separate
147+
blind lock (A8). **Diagnostic; `artifact_risk_reduced` ≠ "cleanliness proven human intuition"; no
148+
reproduction/edge/behaviour claim.** Commit 2 (build+run) needs a separate GO and new module
149+
`selection_learning_artifact.py`.
136150

137151
**Next work requires a separate explicit GO. No W/gap, no Stage 1, no new sensitivity, and no Genesis
138152
may be started automatically.** Parked (test-only, separate GO): lock the facit-discipline refusal
docs/research_wiki/reviews/btc-fib-selection-learning-artifact-lock-20260624.md

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
1+
# BTC Fib Selection-Learning — `cleanliness` artifact-probe LOCK (2026-06-24)
2+
3+
**DOCS-ONLY. Authorises no code, no run, no build, no dependency, no matched-null, no new candidate
4+
universe, no label/corpus change, no push.** This is the **blind Commit-1 lock** for the
5+
**cleanliness artifact-probe** — the cheap-first, existing-data diagnostic that interrogates the
6+
single open CRUX named in the
7+
[campaign checkpoint](btc-fib-selection-learning-checkpoint-20260624.md). It is **not** a new prereg
8+
line: it tests an **already-produced** result (the Stage-2 `cleanliness` lead) for a measurement
9+
defect, blind to any artifact-probe output. Execution needs a **separate explicit GO** (Commit 2).
10+
11+
**Blindness attestation:** no artifact-probe harness exists; **no reached/unreached cleanliness mean,
12+
no snapping gap, no CI has ever been computed or seen.** Every rule below is fixed from the campaign
13+
locks, the frozen config, and existing code — not from any artifact-probe result.
14+
15+
## A0. Question + role (binding framing)
16+
17+
> **Is the Stage-2 `cleanliness` lead a genuine human leg-selection signal, or a detection / anchoring
18+
> artifact?**
19+
20+
The campaign established (checkpoint) that the modest 4h selection lead is carried almost entirely by
21+
leg `cleanliness`, lives in the leg gestalt, is live-available (`no_causal_gap`) and not a
22+
coverage/pivot problem. The **one open crux** is whether `cleanliness` is partly **mechanical**
23+
because the whole pipeline is conditioned on the detector's pivot universe and `cleanliness` is
24+
computed on detector-defined legs. This probe decomposes that crux into its **two** mechanisms and
25+
tests **both on existing facit data, with no new candidate universe**:
26+
27+
1. **Surfacing-bias** — does the detector preferentially *surface* cleaner human legs?
28+
2. **Snapping-bias** — does *snapping* a human anchor to the nearest detector pivot mechanically
29+
*raise* the measured `cleanliness`?
30+
31+
This is a **diagnostic, not a headline**; it adds **no positive claim** (A9).
32+
33+
## A1. `cleanliness` formula (locked — source-bound, NOT redefined)
34+
35+
The engine's existing feature, used verbatim by Stage-2
36+
([`core/features.py::_cleanliness`](../../../src/fibengine/core/features.py)):
37+
38+
```
cleanliness(span [lo,hi]) = |close[hi] − close[lo]| / Σ_{i∈(lo,hi]} |close[i] − close[i−1]|
```
41+
42+
(net close-move ÷ total close-path; `1.0` for `<2` bars or zero path). **Key structural fact, locked:
43+
`cleanliness` depends ONLY on the endpoints' bar-index span `[lo,hi]` (close prices over that span) —
44+
NOT on the anchor's click-price.** Therefore:
45+
46+
- The exact-vs-snapped contrast (A3) is a **pure index-span** contrast (human bar indices vs detector
47+
pivot bar indices).
48+
- Because every leg's span satisfies `[lo,hi] ⊆ (…, anchor_b]`, i.e. lies fully **before** the decision point,
49+
`cleanliness` is **inherently causal** — the `k=3` truncation (A4) cannot change it. No hindsight is
50+
introduced by either contrast. This lock does **not** redefine the formula; it pins which one is used.
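
For orientation only, the locked formula can be sketched in a few lines of Python. This is a reader's illustration written from the formula text above, not the engine's `_cleanliness`; the function name, the list-of-closes input, and the reading of "`<2` bars" as `hi - lo + 1 < 2` are assumptions.

```python
from typing import Sequence


def cleanliness(close: Sequence[float], lo: int, hi: int) -> float:
    """Net close-move divided by total close-path over the bar span [lo, hi].

    Illustrative sketch of the locked formula; not the engine's implementation.
    """
    if hi - lo + 1 < 2:
        return 1.0  # fewer than 2 bars (assumed reading of the "<2 bars" rule)
    # Sum of absolute close-to-close moves for i in (lo, hi].
    path = sum(abs(close[i] - close[i - 1]) for i in range(lo + 1, hi + 1))
    if path == 0.0:
        return 1.0  # zero path: defined as 1.0
    return abs(close[hi] - close[lo]) / path


closes = [100.0, 102.0, 101.0, 104.0]
straight = cleanliness(closes, 0, 1)  # one straight move
zigzag = cleanliness(closes, 0, 3)    # net 4 over a path of 2 + 1 + 3 = 6
```

Only bar indices and closes enter the computation, which is the structural fact the lock relies on: moving an anchor's click-price without moving its bar index leaves the value unchanged.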
51+
52+
## A2. reached / unreached definition (locked — Stage-2 ε-reconstruction, ALL legs)
53+
54+
- **Corpus = ALL human 4h legs** (the 365 `fib_*.json` source legs; facit-discipline, human-only
55+
sidecars via [`load_human_legs`](../../../src/fibengine/research/selection_learning.py)). **Unreached
56+
legs are NOT filtered out — they are the signal** (A0.1).
57+
- A leg is **reached** iff **BOTH** its anchors are ε-reconstructable by the detector under the **exact
58+
Stage-2 rule**: each human anchor has a **causally detected** pivot of matching kind within **ε**
59+
(`time_tol = 3` bars, `price_tol = 0.5` × causal ATR; A4 of the campaign), where detection is
60+
[`detect_pivots`](../../../src/fibengine/pivots/detect.py) on the frame **truncated at `anchor_b +
61+
k`** with the frozen config (`fractal_n=1, lookback=3, min_prominence_atr=0.5`), `k = 3` (the Stage-2
62+
headline cell). **Unreached** = at least one anchor not ε-reconstructable. This reproduces the
63+
Stage-2 ~0.83 leg-reachability split (expected ≈ 62 unreached on 4h).
64+
- **Primary = causal `k=3` detection** (parity with the pipeline that produced the lead). Full-frame
65+
detection is a **named sensitivity only**, not the verdict basis.
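
The ε-match behind the reached/unreached split could look roughly like the sketch below. It assumes the pivots were already detected causally on the truncated frame and that a single causal ATR value is supplied per leg; the `Point` type, the function names, and the inclusive (`<=`) tolerances are hypothetical choices, not taken from `detect_pivots` or Stage-2 code.

```python
from dataclasses import dataclass
from typing import Iterable, List

TIME_TOL_BARS = 3    # time_tol from the lock
PRICE_TOL_ATR = 0.5  # price_tol as a multiple of causal ATR


@dataclass(frozen=True)
class Point:
    """Hypothetical container for a human anchor or a detector pivot."""
    idx: int      # bar index
    price: float
    kind: str     # "high" or "low"


def eps_matches(anchor: Point, pivots: Iterable[Point], atr: float) -> List[Point]:
    """Pivots of matching kind within the time and price tolerance of the anchor."""
    return [
        p for p in pivots
        if p.kind == anchor.kind
        and abs(p.idx - anchor.idx) <= TIME_TOL_BARS
        and abs(p.price - anchor.price) <= PRICE_TOL_ATR * atr
    ]


def is_reached(anchor_a: Point, anchor_b: Point, pivots: List[Point], atr: float) -> bool:
    """A leg is reached only if BOTH anchors have at least one epsilon-match."""
    return bool(eps_matches(anchor_a, pivots, atr)) and bool(eps_matches(anchor_b, pivots, atr))


pivots = [Point(10, 100.0, "low"), Point(25, 130.0, "high")]
leg_ok = is_reached(Point(11, 101.0, "low"), Point(24, 129.5, "high"), pivots, atr=4.0)
leg_far = is_reached(Point(11, 101.0, "low"), Point(30, 129.5, "high"), pivots, atr=4.0)
```

In the second leg the `anchor_b` candidate sits 5 bars from the only high pivot, so the leg counts as unreached even though `anchor_a` matches.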
66+
67+
## A3. exact-vs-snapped definition (locked — paired, reached legs only, no imputation)
68+
69+
For each **reached** leg:
70+
71+
- **exact-anchor cleanliness** = `cleanliness` over the span `[idx(anchor_a), idx(anchor_b)]`, where
72+
`idx(·)` is the human anchor's bar index ([`_pos_of_ts`](../../../src/fibengine/research/selection_learning.py)).
73+
- **snapped-anchor cleanliness** = `cleanliness` over `[idx(piv_a), idx(piv_b)]`, where `piv_·` is the
74+
**ε-matched detector pivot nearest** to each human anchor (tie-break: smallest time distance, then
75+
smallest price distance; ties logged).
76+
- **Contrast = `gap_snap = snapped − exact`** (paired, within-leg).
77+
- **No imputation.** Unreached legs have no snapped endpoints → **excluded from THIS contrast only**
78+
(they remain in the A2 surfacing contrast). Any reached leg whose snap is ambiguous/degenerate
79+
(e.g. `piv_a == piv_b`) is **dropped and logged**, never imputed.
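
A minimal sketch of the paired snapping contrast for one reached leg follows. It assumes "nearest" is ordered by time distance first and price distance second (the lock only states the tie-break order), treats equal pivot bar indices as the degenerate case, and repeats the close-path ratio locally so the block is self-contained; all names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Pivot = Tuple[int, float]  # (bar index, price); hypothetical shape


def path_ratio(close: Sequence[float], lo: int, hi: int) -> float:
    """Net close-move over total close-path on [lo, hi], as in A1."""
    if hi - lo + 1 < 2:
        return 1.0
    path = sum(abs(close[i] - close[i - 1]) for i in range(lo + 1, hi + 1))
    return 1.0 if path == 0.0 else abs(close[hi] - close[lo]) / path


def nearest_pivot(anchor: Pivot, candidates: Sequence[Pivot]) -> Optional[Pivot]:
    """Smallest time distance, then smallest price distance (assumed ordering)."""
    if not candidates:
        return None
    return min(candidates, key=lambda p: (abs(p[0] - anchor[0]), abs(p[1] - anchor[1])))


def snap_gap(close, anchor_a: Pivot, anchor_b: Pivot,
             cands_a: Sequence[Pivot], cands_b: Sequence[Pivot]) -> Optional[float]:
    """snapped - exact for one leg, or None when the leg must be dropped and logged."""
    piv_a = nearest_pivot(anchor_a, cands_a)
    piv_b = nearest_pivot(anchor_b, cands_b)
    if piv_a is None or piv_b is None or piv_a[0] == piv_b[0]:
        return None  # no imputation: caller drops and logs the leg
    exact = path_ratio(close, *sorted((anchor_a[0], anchor_b[0])))
    snapped = path_ratio(close, *sorted((piv_a[0], piv_b[0])))
    return snapped - exact


closes = [100.0, 99.0, 101.0, 103.0, 106.0, 105.0]
gap = snap_gap(closes, (0, 100.0), (5, 105.0), [(1, 99.0)], [(4, 106.0)])
```

In this toy series the human span `[0, 5]` has net 5 over a path of 9, while the snapped span `[1, 4]` runs monotonically from the low to the high, so the gap is positive. That is the mechanical inflation the contrast is designed to detect.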
80+
81+
## A4. causal computation (locked)
82+
83+
`cleanliness` and detection both computed on the frame **truncated at `anchor_b + k`, `k = 3`**, ATR
84+
**causal** (trailing Wilder to the decision point). Per A1, truncation is moot for `cleanliness`
85+
itself (span is pre-`anchor_b`); the lock keeps full parity with Stage-2/W-gap and forbids any
86+
full-series leakage in the **detection** step (A2).
87+
88+
## A5. Statistic + bootstrap unit (locked — NOT row-level)
89+
90+
- **Surfacing statistic:** `gap_surface = mean(cleanliness | reached) − mean(cleanliness | unreached)`.
91+
- **Snapping statistic:** `gap_snap = mean(snapped − exact)` over reached legs (paired).
92+
- **Bootstrap = block bootstrap by CALENDAR QUARTER of `anchor_b`** (detector-free, exogenous unit —
93+
the A3 structural-chunk used detector pivots and does **not** transfer to a detector-free probe).
94+
Resample whole quarters with replacement (each quarter carries its reached + unreached legs),
95+
recompute the statistic, **2000 resamples, seed `20260618`**. Report point estimate, 95% CI, and
96+
one-sided `p`. **Row-level bootstrap is explicitly rejected** (legs cluster by regime/quarter).
97+
Month-block is a named sensitivity, not the primary.
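
The quarter-block resampling for `gap_surface` can be sketched with the standard library alone. The lock fixes the block unit, resample count, and seed; the percentile CI, the skipping of resamples that contain no reached or no unreached leg, and the tuple layout of a leg are assumptions made for this illustration. The one-sided `p` is left out because the lock does not state how it is computed.

```python
import random
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import List, Optional, Tuple

Leg = Tuple[datetime, float, bool]  # (anchor_b timestamp, cleanliness, reached)


def quarter_key(ts: datetime) -> Tuple[int, int]:
    """Calendar quarter of anchor_b, e.g. (2023, 2) for April to June 2023."""
    return (ts.year, (ts.month - 1) // 3 + 1)


def gap_surface(legs: List[Leg]) -> Optional[float]:
    reached = [c for _, c, r in legs if r]
    unreached = [c for _, c, r in legs if not r]
    if not reached or not unreached:
        return None  # statistic undefined for this sample
    return mean(reached) - mean(unreached)


def quarter_block_bootstrap(legs: List[Leg], n_resamples: int = 2000, seed: int = 20260618):
    blocks = defaultdict(list)
    for leg in legs:
        blocks[quarter_key(leg[0])].append(leg)
    keys = sorted(blocks)
    rng = random.Random(seed)
    stats = []
    for _ in range(n_resamples):
        sample: List[Leg] = []
        # Draw whole quarters with replacement; each carries all of its legs.
        for k in rng.choices(keys, k=len(keys)):
            sample.extend(blocks[k])
        g = gap_surface(sample)
        if g is not None:  # assumed handling of undefined resamples
            stats.append(g)
    if not stats:
        raise ValueError("no resample produced a defined statistic")
    stats.sort()
    lo = stats[int(0.025 * (len(stats) - 1))]  # percentile CI (assumption)
    hi = stats[int(0.975 * (len(stats) - 1))]
    return gap_surface(legs), (lo, hi), stats


legs = [
    (datetime(2023, 1, 10), 0.80, True), (datetime(2023, 2, 3), 0.50, False),
    (datetime(2023, 4, 12), 0.70, True), (datetime(2023, 5, 20), 0.60, False),
    (datetime(2023, 7, 1), 0.90, True), (datetime(2023, 9, 9), 0.40, False),
    (datetime(2023, 10, 5), 0.75, True), (datetime(2023, 12, 1), 0.55, False),
]
point, ci, stats = quarter_block_bootstrap(legs)
```

Because whole quarters are drawn, legs that share a regime move together in every resample, which is the reason the lock rejects row-level resampling.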
98+
99+
## A6. Power floor (locked)
100+
101+
- **Surfacing contrast powered** iff `min(n_reached, n_unreached) ≥ 10` **and** ≥ 3 distinct quarters
102+
contain an unreached leg (so the block bootstrap is non-degenerate).
103+
- **Snapping contrast powered** iff `n_reached ≥ 10`.
104+
- **Expected powered: 4h only.** 1M/1w/1d are **context if underpowered, never refuted** (too few
105+
unreached legs).
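
The power floor reduces to two small predicates. The sketch below assumes each leg is summarised as a `(quarter label, reached)` pair; the function names and that input shape are hypothetical.

```python
from typing import Hashable, Iterable, Tuple


def surfacing_powered(legs: Iterable[Tuple[Hashable, bool]],
                      min_n: int = 10, min_quarters: int = 3) -> bool:
    """min(n_reached, n_unreached) >= 10 and >= 3 quarters holding an unreached leg."""
    legs = list(legs)
    n_reached = sum(1 for _, reached in legs if reached)
    n_unreached = len(legs) - n_reached
    unreached_quarters = {q for q, reached in legs if not reached}
    return min(n_reached, n_unreached) >= min_n and len(unreached_quarters) >= min_quarters


def snapping_powered(n_reached: int, min_n: int = 10) -> bool:
    """The paired snapping contrast only needs enough reached legs."""
    return n_reached >= min_n


reached = [("2023Q1", True)] * 12
spread = reached + [("2023Q1", False)] * 4 + [("2023Q2", False)] * 3 + [("2023Q3", False)] * 3
clumped = reached + [("2023Q1", False)] * 5 + [("2023Q2", False)] * 5
```

Here `spread` passes the floor, while `clumped` has the same ten unreached legs packed into two quarters and fails the distinct-quarter condition.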
106+
107+
## A7. Verdict rules (pre-stated, falsifiable — 4h primary; applied verbatim)
108+
109+
**Surfacing (`gap_surface`, 95% CI):**
110+
- **`detector_surfacing_artifact`** — CI **excludes 0 ABOVE** (reached significantly cleaner): the
111+
detector preferentially surfaces cleaner human legs → the lead is **partly** a surfacing artifact.
112+
- **`no_surfacing_artifact`** — CI **includes 0**: no evidence the detector surfaces cleaner human legs
113+
→ surfacing artifact **not supported** (artifact risk reduced on this axis).
114+
- **`inverse_surfacing`** (direction guard) — CI **excludes 0 BELOW** (unreached cleaner): unexpected;
115+
**investigate, not a finding.**
116+
117+
**Snapping (`gap_snap`, 95% CI):**
118+
- **`snapping_inflates_cleanliness`** — CI **excludes 0 ABOVE**: snapping to detector endpoints
119+
mechanically raises `cleanliness` → measurement-bias artifact present.
120+
- **`no_snapping_inflation`** — CI **includes 0**: snapping does not inflate `cleanliness`.
121+
- **`snapping_deflates`** (direction guard) — CI **excludes 0 BELOW**: **investigate, not a finding.**
122+
123+
**Combined artifact reading (locked):**
124+
- `no_surfacing_artifact` **AND** `no_snapping_inflation` → **`artifact_risk_reduced`**: the strongest
125+
non-artifact evidence the cheap probe can give: the `cleanliness` lead is **not explained** by
126+
detector surfacing or snapping. **This is NOT "cleanliness is proven human intuition"** (A9).
127+
- **Either** contrast fires its artifact branch → **`detector_artifact_supported`** — the lead is
128+
**partly mechanical**; report **which half** and by how much.
129+
- Underpowered → **`inconclusive_underpowered`** (checked first).
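
The verdict branches above map onto a short decision function. The branch labels are the locked ones; the single `powered` flag, the tuple return, and the `"investigate"` placeholder for direction-guard outcomes (for which the lock names no combined label) are assumptions of this sketch.

```python
from typing import Optional, Tuple

CI = Tuple[float, float]


def classify(ci: CI, above: str, within: str, below: str) -> str:
    """Branch on whether the 95% CI excludes 0 above, includes 0, or excludes 0 below."""
    lo, hi = ci
    if lo > 0:
        return above
    if hi < 0:
        return below
    return within


def combined_verdict(surface_ci: CI, snap_ci: CI,
                     powered: bool) -> Tuple[Optional[str], Optional[str], str]:
    if not powered:  # the underpowered check comes first
        return None, None, "inconclusive_underpowered"
    surfacing = classify(surface_ci, "detector_surfacing_artifact",
                         "no_surfacing_artifact", "inverse_surfacing")
    snapping = classify(snap_ci, "snapping_inflates_cleanliness",
                        "no_snapping_inflation", "snapping_deflates")
    if "detector_surfacing_artifact" == surfacing or "snapping_inflates_cleanliness" == snapping:
        combined = "detector_artifact_supported"
    elif surfacing == "no_surfacing_artifact" and snapping == "no_snapping_inflation":
        combined = "artifact_risk_reduced"
    else:
        combined = "investigate"  # hypothetical placeholder for direction-guard cases
    return surfacing, snapping, combined


clean = combined_verdict((-0.02, 0.05), (-0.01, 0.03), powered=True)
artifact = combined_verdict((0.01, 0.08), (-0.01, 0.03), powered=True)
```

A CI whose lower bound is exactly 0 is treated as including 0, which keeps the artifact branches strict.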
130+
131+
## A8. Gate-rule for the matched-null / new candidate universe (locked)
132+
133+
The matched-null (detector-independent leg universe) **may be considered ONLY IF** the cheap probe
134+
returns **`detector_artifact_supported`** (surfacing artifact present) **AND** quantifying the residual
135+
is judged necessary. If the probe returns **`artifact_risk_reduced`**, the matched-null is
136+
**UNJUSTIFIED** — the crux is resolved on the cheap axis and the expensive universe is scope-creep.
137+
**No matched-null, and no new candidate universe, may be built under this lock.** Any future
138+
matched-null requires its **own separate blind lock** (own design-check, own prereg).
139+
140+
## A9. Non-claims (binding)
141+
142+
- **Not a reproduction** of human selection. **Not** an edge / behaviour / PnL / backtest / strategy
143+
claim. No Genesis, no auto-fib-as-truth.
144+
- **`artifact_risk_reduced` does NOT prove `cleanliness` is "human intuition."** It only narrows the
145+
artifact risk on **two specific mechanisms** (surfacing, snapping). The broader "is human-leg
146+
cleanliness special vs any matched non-human swing" question is **out of scope** (gated, A8).
147+
- Underpowered TFs are **context, not refuted.**
148+
- **No 1H, no ETH, no label/corpus mutation, no `data.fetch --refresh`** (frozen-data parity — same
149+
universe as Stage-2 / W-gap / Stage-1).
150+
151+
## A10. Implementation plan (Commit 2 — NOT executed here)
152+
153+
- **New module `src/fibengine/research/selection_learning_artifact.py`** with its **own CLI entry**;
154+
**no code added to `selection_learning.py`** (byte-capped). Reuse `_cleanliness` (or its exact
155+
formula), `load_human_legs`, `_pos_of_ts`, `detect_pivots`, `atr`, `load_candles`, the ε constants,
156+
and the `FROZEN_SNAPSHOT` preflight pattern from `selection_learning_gap.py`.
157+
- **Tests** `tests/research/test_selection_learning_artifact.py` (reached/unreached split, exact-vs-
158+
snapped span, quarter-block bootstrap, verdict branches, no-imputation drop/log, k=3 causal parity).
159+
- **Results doc** later (`btc-fib-selection-learning-artifact-results-YYYYMMDD.md`, Observed / Inferred
160+
/ Unverified). Artifacts under `experiments/review/fib_selection_learning/artifact/` (**gitignored**).
161+
- **Preflight FIRST**, frozen-data parity, per-cell/contrast checkpoint as needed.
162+
163+
## A11. Why this answers the crux better than a new candidate universe
164+
165+
- **Symmetric, not conservative.** `reached-vs-unreached` is informative in **both** directions
166+
(≈ → no surfacing artifact; ≫ → artifact), where a matched-null is **asymmetric** (fail-to-reject is
167+
inconclusive, not non-artifact) because its endogenous swing-validity rule correlates with
168+
`cleanliness` by construction and raises the null baseline.
169+
- **Detector-independent MEASUREMENT on existing data.** It uses **exact human anchors** (facit) — no
170+
new universe, no swing-filter, no `K`, no draw-pool, no arbitrary windowing (the convenience trap the
171+
B feasibility check flagged).
172+
- It targets the **exact** two mechanisms in "detection / anchoring artifact": `reached-vs-unreached`
173+
**is** the surfacing test; `exact-vs-snapped` **is** the anchoring/measurement test.
174+
175+
## A12. What this doc does NOT do
176+
177+
No code, no harness, no build, no run, no dependency, no matched-null, no new candidate universe, no
178+
label/corpus mutation, no push. Does **not** grant execution — Commit 2 requires a **separate explicit
179+
GO**, and must **halt and report before code** if any of {`cleanliness` formula, reached/unreached
180+
definition, exact-vs-snapped definition, bootstrap unit, power floor, verdict rules, matched-null
181+
gate-rule} is found unclear at build time.
