docs(research): update campaign checkpoint with cleanliness artifact-probe result
Adds verdict-chain row 6 (artifact-probe, 1573b56) and folds the result into the
CRUX section: the inflationary detector-artifact got NO support on the 4h primary
(both contrasts point against — inverse_surfacing + snapping_deflates), but this is
NOT artifact_risk_reduced (both CIs exclude 0) and the snapping reversal does not
replicate (flips to inflation on 1d) → "investigate, not a finding". Crux stays OPEN,
now narrowed (inflationary mechanism unsupported) with a sharper investigate-target.
Reframes the next-step section: cheap-first B is DONE; the matched-null stays
UNJUSTIFIED (A8 gate not met) and would need its own blind lock. No track recommended,
no new track started. Consolidation only — no new positive claim, no lock change, no
matched-null, no new universe, no Genesis/1H/ETH, holds all non-claims.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
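
One way to read the two contrasts described in the commit message is sketched below. It is a minimal illustration, not the locked verdict rule: the `Contrast` type, the `read_contrast` and `replicates` helpers, and the three returned labels are hypothetical names, and the decision thresholds are assumptions inferred from the wording "both CIs exclude 0". The gaps and intervals are the values this commit reports for row 6, and +0.0222 is the snapping gap it reports on the 1d context cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contrast:
    """A gap estimate with its confidence interval (values copied from row 6)."""
    name: str
    gap: float
    ci_low: float
    ci_high: float


def read_contrast(c: Contrast) -> str:
    """Classify a contrast by where its CI sits relative to 0.

    Assumed rule (hypothetical, not the locked one):
      CI entirely below 0 -> "direction_guard"    (points against inflation)
      CI entirely above 0 -> "artifact_supported" (inflation would be supported)
      CI includes 0       -> "ci_includes_zero"   (the only case that could count
                                                   toward artifact_risk_reduced)
    """
    if c.ci_high < 0:
        return "direction_guard"
    if c.ci_low > 0:
        return "artifact_supported"
    return "ci_includes_zero"


def replicates(primary_gap: float, context_gap: float) -> bool:
    """True when the gap keeps its sign from the primary to the context cell."""
    return primary_gap * context_gap > 0


surfacing = Contrast("surfacing", -0.0557, -0.1150, -0.00095)
snapping = Contrast("snapping", -0.0219, -0.0320, -0.0102)

readings = {c.name: read_contrast(c) for c in (surfacing, snapping)}
# 4h primary gap versus the 1d context-cell gap for snapping
snapping_replicates = replicates(snapping.gap, 0.0222)
```

Under these assumptions both contrasts land in `direction_guard` rather than `ci_includes_zero`, and `snapping_replicates` is false because the sign flips between 4h and 1d, which is consistent with the "investigate, not a finding" reading.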
+| 6 | `cleanliness` artifact-probe (06-24) | direction guards **`inverse_surfacing`** + **`snapping_deflates`** (A7-unregistered combined → `meta:` status, NOT a verdict) | surfacing gap **−0.0557** CI [−0.1150, −0.00095]; snapping gap **−0.0219** CI [−0.0320, −0.0102]; both exclude 0 **below** | `1573b56` |

 ## What we KNOW (positive, scoped to 4h + the frozen eight features)

@@ -52,32 +58,49 @@ explicit GO.
 The entire pipeline is **conditioned on the detector's pivot universe**, and `cleanliness` is computed
 on **detector-defined legs**. If the detector preferentially surfaces or anchors clean/efficient legs,
 then "human legs are cleaner" is partly **mechanical** — baked in by candidate generation, not a fact
-about human choice. This question has been flagged at **every** stage (06-18 sensitivity, k-sweep,
-W-gap, Stage-1) and **deliberately left OPEN**; nothing run so far can resolve it, because every run
-lives inside the same detector frame.
+about human choice.
+
+**What the artifact-probe settled, and what it did not** ([results](btc-fib-selection-learning-artifact-results-20260624.md),
+LOCK `b533385`): the cheap-first probe tested the two mechanisms of the *inflationary* version of this
+artifact on existing facit data. On the 4h primary (both contrasts powered, fidelity OK reached 0.860):
+
+- **Surfacing** — reached legs are *less* clean than unreached (gap −0.0557, CI excludes 0 **below**) →
+guard `inverse_surfacing`. The detector does **not** surface cleaner human legs — if anything the
+reverse (marginal: CI upper −0.00095).
+- **Snapping** — snapping anchors to detector pivots *lowers* cleanliness (gap −0.0219, CI excludes 0
+**below**) → guard `snapping_deflates`. Snapping does **not** inflate it.
+
+So the simple **"the detector inflates cleanliness"** story gets **no support** on the powered primary.
+**But the crux stays OPEN:** (1) this is **not** `artifact_risk_reduced` (both CIs *exclude* 0, the
+direction guards, not include); (2) the surfacing reversal is **marginal** and the snapping reversal
+**does not replicate** (it flips to *inflation* on the 1d context cell, +0.0222) → TF-dependent,
+**"investigate, not a finding"**; (3) the broader "is `cleanliness` special vs a matched non-human
+swing" question is **out of scope** (matched-null gated, A8 — **not built**; its gate condition
+`detector_artifact_supported` on the primary was **not** met, so it stays unjustified). The crux is now
+**narrowed** (the inflationary mechanism is unsupported) with a sharper investigate-target: *why
+reached/snapped legs are less clean, and why snapping flips sign by TF.*

 **Secondary loose end:** set-level **`exclusivity`** (`k*=3`) was specced in the
 [§12 addendum](btc-fib-selection-learning-addendum-20260618.md) but the Stage-2 live whitelist actually
 built was `{magnitude, cleanliness, duration, prominence, structure_alignment}` — **exclusivity was
 never implemented or run.** It is an unfinished feature, not a result.

-## The next-step choice — A / B / C (all roads lead to the crux)
-
-| | Track | What it does | Decides the crux? | Main risk |
-|---|---|---|---|---|
-| **A** | set-level exclusivity / `cleanliness`-artifact diagnostic | stays in the detector frame: (i) build the unbuilt `exclusivity` feature, does `cleanliness` survive its inclusion; (ii) a diagnostic separating "human prefers clean legs" from "detector surfaces clean legs" | **partly** | within-detector-frame circularity may be **irreducible** — a within-frame test may not fully break the circle |
-| **B** | detector-independent anchor-probe | removes the detector as candidate-generator and tests whether the anchor / `cleanliness` signal survives **without** the suspected source of circularity | **most directly** | requires **inventing a detector-independent frame** → new arbitrary choices (validity-over-convenience hazard); MUST be locked blind |
-| **C** | pause + write "current theory of human fib selection" | synthesis-only; consolidates KNOW/crux into a stated theory, defers the empirical crux | **no** (defers) | leaves the artifact question open; risks theorizing past the evidence |
-
-- **A and B are two angles on the SAME crux** (cleanliness-as-artifact / detector-circularity); **C
-defers it.** Whichever empirical track is chosen, its verdict rule must be **locked blind before any
-build** (two-commit gate, as with W-gap and Stage-1), and must respect validity-over-convenience —
-no quietly-chosen control or frame.
-- **Tentative lean (not a decision — the GO is the user's next turn):** the crux *is* the artifact
-question, and only **B** attacks its suspected source head-on; **A**'s within-frame test may not
-break the circularity and **C** defers it. So **B, but only behind a tight blind lock** — and if a
-detector-independent frame cannot be defined without arbitrary convenience choices, **fall back to
-A**'s narrow cleanliness-artifact diagnostic. Recommend: **this checkpoint first, then A or B.**
+## The next-step choice — where it stands after the artifact-probe
+
+The original A/B/C framing has partly resolved: **cheap-first B (the existing-data `cleanliness`
+artifact-probe) is DONE** (row 6). The remaining candidate doors — **none started, none authorised:**
+
+| | Track | Status / what it would do | Main risk |
+|---|---|---|---|
+| **(i)** | **investigate** the artifact-probe's own finding | on existing data: why reached/snapped legs are *less* clean, and why snapping flips sign 4h↔1d (mechanical hypothesis: detector reconstructs larger/longer swings; snapping extends spans → more path) | low — descriptive, detector-frame |
+| **(ii)** | gated **matched-null** / detector-independent universe | **stays UNJUSTIFIED** — its A8 gate (`detector_artifact_supported` on the primary) was **not** met; would need its **own** separate blind lock and risks inventing an arbitrary frame | high (validity-over-convenience) |
+| **A′** | set-level **`exclusivity`** feature (the unbuilt loose end) | build the specced-but-unimplemented feature; orthogonal to the artifact crux | low–medium |
+| **C** | pause + write **"current theory of human fib selection"** | synthesis-only; consolidate KNOW + the now-narrowed crux; defers further empirics | risks theorizing past the evidence |
+
+- Any empirical track must be **locked blind before any build** (two-commit gate, as with W-gap /
+Stage-1 / the artifact-probe) and respect validity-over-convenience — no quietly-chosen control or
+frame. **The matched-null specifically may not be built without meeting its A8 gate AND a new lock.**
+- **No track is recommended here** — this is consolidation. The GO is the user's next turn.

 ## Non-claims (binding — carried from every prior lock)
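
The build precondition stated in the diff (a track needs its gate met and a blind lock before any build) can be sketched as a simple guard. This is an illustrative sketch only: `TrackState`, `may_build`, and the placeholder lock string are hypothetical and do not come from the repository, and treating the gate as a single boolean is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackState:
    """Hypothetical preconditions for starting an empirical track."""
    gate_met: bool                     # e.g. A8: detector_artifact_supported on the primary
    blind_lock_commit: Optional[str]   # identifier of the lock commit, None if not locked


def may_build(state: TrackState) -> bool:
    """A track may be built only if its gate is met AND a blind lock exists."""
    return state.gate_met and state.blind_lock_commit is not None


# Matched-null as described in this checkpoint: gate not met, no new lock.
matched_null = TrackState(gate_met=False, blind_lock_commit=None)
matched_null_allowed = may_build(matched_null)

# A new lock alone would not be enough while the gate stays unmet.
locked_but_ungated = may_build(TrackState(gate_met=False, blind_lock_commit="<new-lock>"))
```

Both values come out false here, which mirrors the checkpoint's position that the matched-null stays unjustified until both conditions hold.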