
Commit 183cb1a

张靖恒 and claude committed
v4 + multi-window analysis: thesis dead under long-term testing
Post-mortem of the maker-arb thesis after WW archived PR #9.

Two methodology fixes in v4:

(A) maker fee was wrongly = taker fee in v3. Polymarket docs and live
    feeSchedule.takerOnly=True on 100/100 sampled markets confirm makers
    never pay fees. Corrected: maker_fee = 0.

(B) v3 was 100% in-sample. Added 10/4 train/test split + multi-window
    orchestration to detect window-luck.

Findings progression:
  v3 (in-sample, taker fee):        -$263/yr naive, +$117 cherry
  v4 single window (today):         +$195/yr naive, +$289 cherry OOS
  v4 multi-window (4 x 14d = 56d):  naive mean -$183 (sign flips!),
                                    cherry mean +$251 but UNSTABLE

The decisive result: across 4 non-overlapping 14-day windows covering
2026-03-20 to 2026-05-15:
  - 0 of 64 groups have positive OOS in >=3/4 windows
  - 44/64 groups (69%) had zero positive OOS across all 4 windows
  - Even the 2 groups consistently in top-18 by in-sample (Wisconsin,
    Kansas) had positive OOS in only 2/4 and 1/4 windows respectively
  - Naive deploy sign flips: -$1,117 in 3/20-4/03 window, +$239 in
    4/03-4/17 window

Cherry-pick "wins" within each window because we pick this window's
winners; but the winners rotate, so no actionable alpha.

Files:
  scripts/simulate_maker_basket_v4.py - corrected fee + IS/OOS split + --end-date for time-shifting
  scripts/aggregate_v4_multi_window.py - cross-window stability
  reports/maker-simulation-v4-*-w-*.md - 4 per-window reports
  reports/maker-simulation-v4-multi-window-2026-05-15.md - the verdict

Note: poly_strategy/maker.py production code already has
fee_rate_assumption=0.0 for maker legs. The fee bug was localized to my
standalone research script, not production.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 64b0902 commit 183cb1a

7 files changed

Lines changed: 1181 additions & 0 deletions
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Maker Simulation v4: Corrected fee + train/test split (2026-05-15T04:00:51.953413+00:00)

**Method**: v3 plumbing + two fixes:
(A) maker fee mode = `zero` (v3 used taker_rate, which was wrong; Polymarket docs: "makers never pay fees")
(B) 10-day in-sample / 4-day OOS split. Top 18 groups picked by IN-SAMPLE daily $; their OOS sum reported separately.

**Window**: 14 days (2026-04-17 -> 2026-05-01)
**Basket size cap**: $100
**Trades fetched**: 44759 raw -> 2751 qualifying
**Days with trade activity**: in-sample 10, OOS 4
**takerOnly distribution across our markets**: {True: 142}
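The fee fix in (A) and the `takerOnly` tally above can be illustrated with a minimal sketch. The nested `feeSchedule.takerOnly` layout of each market record and the helper names are assumptions made for illustration; the actual script in this commit is not shown here and may read the flag differently.

```python
from collections import Counter


def taker_only_distribution(markets):
    """Count markets by their feeSchedule.takerOnly flag (assumed dict layout)."""
    return Counter(m["feeSchedule"]["takerOnly"] for m in markets)


def maker_fee_rate(market, taker_rate):
    """Fee rate applied to a maker fill.

    Assumption: takerOnly=True means only takers are charged, so the maker
    rate is 0. Charging taker_rate on every maker fill is the v3 behaviour
    that v4 corrects.
    """
    return 0.0 if market["feeSchedule"]["takerOnly"] else taker_rate


# Hypothetical payloads: 142 markets, all flagged takerOnly, as in this run.
markets = [{"feeSchedule": {"takerOnly": True}} for _ in range(142)]
dist = taker_only_distribution(markets)
print(dict(dist))
```

Keeping the fee decision in one small function makes it straightforward to rerun the simulation under either fee mode and compare the verdicts.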

## Headline (with maker fee = 0)

| Verdict | Daily $ | Annualized |
|---|---:|---:|
| Naive (all 71 groups), in-sample | $+0.63 | $+228 |
| Naive (all 71 groups), OOS | $+0.23 | $+83 |
| Whole window (no split) | $+0.51 | $+187 |
| **Top 18 by in-sample, in-sample** | $+1.73 | $+631 |
| **Top 18 by in-sample, OOS** ← honest verdict | $+0.42 | $+155 |

If top-N OOS << top-N in-sample, the top-N looks like overfitting.
OOS / in-sample ratio for top-18: 0.25
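The selection rule behind the top-18 rows and the ratio above can be sketched as follows. This is an illustration under stated assumptions, not the script's code: `daily_pnl` is a hypothetical mapping from group id to a chronological list of daily dollar P&L, the split is taken as the first `is_days` days versus the remainder, and annualization is assumed to be daily dollars times 365.

```python
def split_verdict(daily_pnl, is_days, top_n):
    """Rank groups by in-sample mean daily P&L, then measure them out-of-sample.

    daily_pnl: {group_id: [pnl_day_0, pnl_day_1, ...]} in chronological order
               (hypothetical structure).
    Returns (picked_groups, top_is_daily, top_oos_daily, oos_over_is).
    """
    per_group = {}
    for gid, series in daily_pnl.items():
        is_part, oos_part = series[:is_days], series[is_days:]
        per_group[gid] = (
            sum(is_part) / len(is_part),    # in-sample mean daily $
            sum(oos_part) / len(oos_part),  # out-of-sample mean daily $
        )
    # Selection uses in-sample numbers only; OOS is never consulted here.
    picked = sorted(per_group, key=lambda g: per_group[g][0], reverse=True)[:top_n]
    top_is = sum(per_group[g][0] for g in picked)
    top_oos = sum(per_group[g][1] for g in picked)
    ratio = top_oos / top_is if top_is else float("nan")
    return picked, top_is, top_oos, ratio


def annualize(daily_dollars, days_per_year=365):
    """Assumed annualization: daily $ multiplied by 365."""
    return daily_dollars * days_per_year


# Toy data: three groups, 2 in-sample days followed by 2 OOS days.
toy = {
    "a": [1.0, 1.0, 0.0, 0.0],
    "b": [0.5, 0.5, 0.5, 0.5],
    "c": [-1.0, -1.0, 2.0, 2.0],
}
picked, top_is, top_oos, ratio = split_verdict(toy, is_days=2, top_n=2)
print(picked, top_is, top_oos, round(ratio, 2), annualize(top_oos))
```

In the toy data, group `c` is the best OOS performer but is never picked because its in-sample P&L is negative, which is the same rotation effect the report is testing for.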

## Top 18 groups: in-sample picked, OOS measured

| Rank | Group | Q | Best markup | IS daily $ | OOS daily $ | OOS/IS |
|---:|---|---|---:|---:|---:|---:|
| 1 | `0x5cddfa5bafea...` | Will the Democrats win the Geo vs Will the | $0.050 | $+0.392 | $+0.000 | +0.00 |
| 2 | `0x8941a4153cb2...` | Will the Democrats win the Ver vs Will the | $0.050 | $+0.225 | $+0.031 | +0.14 |
| 3 | `0x2ecd963d91df...` | Will the Democrats win the Iow vs Will the | $0.050 | $+0.211 | $+0.027 | +0.13 |
| 4 | `0xf5f3857c3391...` | Will the Democrats win the Ohi vs Will the | $0.030 | $+0.179 | $+0.350 | +1.96 |
| 5 | `0x82cc8472987c...` | Will the Democrats win the Kan vs Will the | $0.010 | $+0.157 | $+0.000 | +0.00 |
| 6 | `0x2aa7cf1991dd...` | Will the Democrats win the Kan vs Will the | $0.010 | $+0.103 | $+0.000 | +0.00 |
| 7 | `0xffde13841676...` | Will the Democrats win the Min vs Will the | $0.010 | $+0.098 | $+0.000 | +0.00 |
| 8 | `0x4e43ba407ed4...` | Will the Democrats win the Wis vs Will the | $0.050 | $+0.092 | $+0.000 | +0.00 |
| 9 | `0xd034d33ba5c4...` | Will the Democrats win the Okl vs Will the | $0.020 | $+0.091 | $+0.000 | +0.00 |
| 10 | `0x64111969ce49...` | Will the Democrats win the New vs Will the | $0.020 | $+0.051 | $+0.000 | +0.00 |
| 11 | `0xb61918837517...` | Will the Democrats win the Nor vs Will the | $0.010 | $+0.032 | $+0.000 | +0.00 |
| 12 | `0x1304dee4404b...` | Will the Democrats win the Ari vs Will the | $0.010 | $+0.024 | $+0.000 | +0.00 |
| 13 | `0x8397b62d3e02...` | Will the Democrats win the Neb vs Will the | $0.010 | $+0.020 | $+0.000 | +0.00 |
| 14 | `0xa80fa85f7e10...` | Will the Democrats win the Geo vs Will the | $0.010 | $+0.016 | $+0.016 | +0.98 |
| 15 | `0xb5ba431e070b...` | Will the Democrats win the Con vs Will the | $0.010 | $+0.016 | $+0.000 | +0.00 |
| 16 | `0xac17bb3e2188...` | Will the Democrats win the New vs Will the | $0.010 | $+0.013 | $+0.000 | +0.00 |
| 17 | `0xfbc9abdccc8a...` | Will the Democrats win the Flo vs Will the | $0.030 | $+0.006 | $+0.000 | +0.00 |
| 18 | `0xe5f54ca9f896...` | Will the Democrats win the New vs Will the | $0.010 | $+0.005 | $+0.000 | +0.00 |

## Compared to prior versions

| Version | Method | Annualized | Issue |
|---|---|---:|---|
| v1 (mid-touch) | mid touch as fill proxy | $15,546 | mid touching != fill |
| v2 size-uncapped | sum of all SELL Yes | $918 | income computed at $100/fill regardless of trade size |
| v3 size-capped, taker fee | size cap added | -$263 naive / +$117 cherry-pick | maker fee wrongly = taker fee; no OOS check |
| **v4 this run** | size cap + maker_fee=zero + IS/OOS | $+83 OOS naive / $+155 OOS top-18 | fee per docs; cherry-pick now measured out-of-sample |

## Caveats (still standing)

- Queue priority: assumes we are first in line at our maker price level.
- Per-leg fills assumed independent within a day.
- Maker fee = 0 ignores `rebateRate` (20-25% of pool taker fees redistributed to makers). Real maker income could be modestly HIGHER. Conservative direction.
- 14 days is a short window; the in-sample / OOS split is *one* random partition, not k-fold. Repeat with different splits to test stability.
- Today's bestAsk/bestBid used to compute maker target; historical spread may have differed.

---
*Snapshot: 2026-05-15T04:00:51.953413+00:00*
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Maker Simulation v4: Corrected fee + train/test split (2026-05-15T04:01:53.391911+00:00)

**Method**: v3 plumbing + two fixes:
(A) maker fee mode = `zero` (v3 used taker_rate, which was wrong; Polymarket docs: "makers never pay fees")
(B) 10-day in-sample / 4-day OOS split. Top 18 groups picked by IN-SAMPLE daily $; their OOS sum reported separately.

**Window**: 14 days (2026-04-03 -> 2026-04-17)
**Basket size cap**: $100
**Trades fetched**: 47508 raw -> 2394 qualifying
**Days with trade activity**: in-sample 10, OOS 4
**takerOnly distribution across our markets**: {True: 139}

## Headline (with maker fee = 0)

| Verdict | Daily $ | Annualized |
|---|---:|---:|
| Naive (all 68 groups), in-sample | $+2.21 | $+806 |
| Naive (all 68 groups), OOS | $+0.66 | $+239 |
| Whole window (no split) | $+1.76 | $+644 |
| **Top 18 by in-sample, in-sample** | $+2.41 | $+879 |
| **Top 18 by in-sample, OOS** ← honest verdict | $+1.32 | $+482 |

If top-N OOS << top-N in-sample, the top-N looks like overfitting.
OOS / in-sample ratio for top-18: 0.55

## Top 18 groups: in-sample picked, OOS measured

| Rank | Group | Q | Best markup | IS daily $ | OOS daily $ | OOS/IS |
|---:|---|---|---:|---:|---:|---:|
| 1 | `0x2aa7cf1991dd...` | Will the Democrats win the Kan vs Will the | $0.050 | $+1.299 | $+0.000 | +0.00 |
| 2 | `0xf5f3857c3391...` | Will the Democrats win the Ohi vs Will the | $0.030 | $+0.280 | $+0.000 | +0.00 |
| 3 | `0x1304dee4404b...` | Will the Democrats win the Ari vs Will the | $0.010 | $+0.261 | $+0.147 | +0.56 |
| 4 | `0x8941a4153cb2...` | Will the Democrats win the Ver vs Will the | $0.050 | $+0.118 | $+0.575 | +4.88 |
| 5 | `0xd4118b02b567...` | Will the Democrats win the Pen vs Will the | $0.010 | $+0.107 | $+0.134 | +1.24 |
| 6 | `0xfbc9abdccc8a...` | Will the Democrats win the Flo vs Will the | $0.030 | $+0.096 | $+0.154 | +1.61 |
| 7 | `0x4e43ba407ed4...` | Will the Democrats win the Wis vs Will the | $0.050 | $+0.063 | $+0.026 | +0.41 |
| 8 | `0x96596827696d...` | Will the Democrats win the Lou vs Will the | $0.020 | $+0.057 | $+0.000 | +0.00 |
| 9 | `0xa80fa85f7e10...` | Will the Democrats win the Geo vs Will the | $0.010 | $+0.034 | $+0.000 | +0.00 |
| 10 | `0x2a010ed53626...` | Will the Democrats win the Flo vs Will the | $0.030 | $+0.013 | $+0.060 | +4.68 |
| 11 | `0x91b62611de4a...` | Will the Democrats win the Mas vs Will the | $0.010 | $+0.011 | $+0.000 | +0.00 |
| 12 | `0xd251f99f27d7...` | Will the Democrats win the New vs Will the | $0.020 | $+0.011 | $+0.000 | +0.00 |
| 13 | `0xdbf0dffb3b5c...` | Will the Democrats win the New vs Will the | $0.005 | $+0.011 | $+0.000 | +0.00 |
| 14 | `0xb61918837517...` | Will the Democrats win the Nor vs Will the | $0.010 | $+0.011 | $+0.000 | +0.00 |
| 15 | `0x287fa3a945e6...` | Will the Democrats win the Vir vs Will the | $0.010 | $+0.011 | $+0.000 | +0.00 |
| 16 | `0xee4444c07438...` | Will the Democrats win the Mis vs Will the | $0.020 | $+0.010 | $+0.000 | +0.00 |
| 17 | `0x50a317c8d911...` | Will the Democrats win the Okl vs Will the | $0.010 | $+0.009 | $+0.000 | +0.00 |
| 18 | `0x64111969ce49...` | Will the Democrats win the New vs Will the | $0.020 | $+0.007 | $+0.224 | +33.33 |

## Compared to prior versions

| Version | Method | Annualized | Issue |
|---|---|---:|---|
| v1 (mid-touch) | mid touch as fill proxy | $15,546 | mid touching != fill |
| v2 size-uncapped | sum of all SELL Yes | $918 | income computed at $100/fill regardless of trade size |
| v3 size-capped, taker fee | size cap added | -$263 naive / +$117 cherry-pick | maker fee wrongly = taker fee; no OOS check |
| **v4 this run** | size cap + maker_fee=zero + IS/OOS | $+239 OOS naive / $+482 OOS top-18 | fee per docs; cherry-pick now measured out-of-sample |

## Caveats (still standing)

- Queue priority: assumes we are first in line at our maker price level.
- Per-leg fills assumed independent within a day.
- Maker fee = 0 ignores `rebateRate` (20-25% of pool taker fees redistributed to makers). Real maker income could be modestly HIGHER. Conservative direction.
- 14 days is a short window; the in-sample / OOS split is *one* random partition, not k-fold. Repeat with different splits to test stability.
- Today's bestAsk/bestBid used to compute maker target; historical spread may have differed.

---
*Snapshot: 2026-05-15T04:01:53.391911+00:00*
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Maker Simulation v4: Corrected fee + train/test split (2026-05-15T04:03:10.554424+00:00)

**Method**: v3 plumbing + two fixes:
(A) maker fee mode = `zero` (v3 used taker_rate, which was wrong; Polymarket docs: "makers never pay fees")
(B) 10-day in-sample / 4-day OOS split. Top 18 groups picked by IN-SAMPLE daily $; their OOS sum reported separately.

**Window**: 14 days (2026-03-20 -> 2026-04-03)
**Basket size cap**: $100
**Trades fetched**: 48025 raw -> 1331 qualifying
**Days with trade activity**: in-sample 10, OOS 4
**takerOnly distribution across our markets**: {True: 139}

## Headline (with maker fee = 0)

| Verdict | Daily $ | Annualized |
|---|---:|---:|
| Naive (all 68 groups), in-sample | $+0.36 | $+132 |
| Naive (all 68 groups), OOS | $-3.06 | $-1,117 |
| Whole window (no split) | $-0.62 | $-225 |
| **Top 18 by in-sample, in-sample** | $+0.36 | $+132 |
| **Top 18 by in-sample, OOS** ← honest verdict | $+0.40 | $+147 |

If top-N OOS << top-N in-sample, the top-N looks like overfitting.
OOS / in-sample ratio for top-18: 1.12

## Top 18 groups: in-sample picked, OOS measured

| Rank | Group | Q | Best markup | IS daily $ | OOS daily $ | OOS/IS |
|---:|---|---|---:|---:|---:|---:|
| 1 | `0xf5f3857c3391...` | Will the Democrats win the Ohi vs Will the | $0.030 | $+0.126 | $+0.168 | +1.34 |
| 2 | `0x82cc8472987c...` | Will the Democrats win the Kan vs Will the | $0.020 | $+0.036 | $+0.000 | +0.00 |
| 3 | `0x4e43ba407ed4...` | Will the Democrats win the Wis vs Will the | $0.050 | $+0.033 | $+0.133 | +4.08 |
| 4 | `0x64111969ce49...` | Will the Democrats win the New vs Will the | $0.020 | $+0.028 | $+0.000 | +0.00 |
| 5 | `0x2aa7cf1991dd...` | Will the Democrats win the Kan vs Will the | $0.030 | $+0.028 | $+0.017 | +0.62 |
| 6 | `0xcd24472b2d86...` | Will the Democrats win the Col vs Will the | $0.020 | $+0.026 | $+0.000 | +0.00 |
| 7 | `0x5a57c20b2083...` | Will the Democrats win the Ida vs Will the | $0.020 | $+0.025 | $+0.062 | +2.43 |
| 8 | `0xc28de9467003...` | Will the Democrats win the Tex vs Will the | $0.020 | $+0.018 | $+0.000 | +0.00 |
| 9 | `0xd251f99f27d7...` | Will the Democrats win the New vs Will the | $0.030 | $+0.013 | $+0.021 | +1.64 |
| 10 | `0x2a010ed53626...` | Will the Democrats win the Flo vs Will the | $0.030 | $+0.009 | $+0.001 | +0.11 |
| 11 | `0xd4118b02b567...` | Will the Democrats win the Pen vs Will the | $0.020 | $+0.008 | $+0.000 | +0.00 |
| 12 | `0xfbc9abdccc8a...` | Will the Democrats win the Flo vs Will the | $0.030 | $+0.006 | $+0.000 | +0.00 |
| 13 | `0x67d0d210eee8...` | Will the Democrats win the Sou vs Will the | $0.005 | $+0.004 | $+0.000 | +0.00 |
| 14 | `0xac17bb3e2188...` | Will the Democrats win the New vs Will the | $0.005 | $+0.002 | $+0.002 | +1.07 |
| 15 | `0xa80fa85f7e10...` | Will the Democrats win the Geo vs Will the | $0.010 | $+0.000 | $+0.000 | +0.00 |
| 16 | `0x22725f09e6a3...` | Will the Democratic Party cont vs Will the | $0.005 | $+0.000 | $+0.000 | +0.00 |
| 17 | `0xd4ec843b5228...` | Will the Democratic Party cont vs Will the | $0.005 | $+0.000 | $+0.000 | +0.00 |
| 18 | `0xdc4bd1724b69...` | Will the Republicans win the 2 vs Will the | $0.005 | $+0.000 | $+0.000 | +0.00 |

## Compared to prior versions

| Version | Method | Annualized | Issue |
|---|---|---:|---|
| v1 (mid-touch) | mid touch as fill proxy | $15,546 | mid touching != fill |
| v2 size-uncapped | sum of all SELL Yes | $918 | income computed at $100/fill regardless of trade size |
| v3 size-capped, taker fee | size cap added | -$263 naive / +$117 cherry-pick | maker fee wrongly = taker fee; no OOS check |
| **v4 this run** | size cap + maker_fee=zero + IS/OOS | $-1,117 OOS naive / $+147 OOS top-18 | fee per docs; cherry-pick now measured out-of-sample |

## Caveats (still standing)

- Queue priority: assumes we are first in line at our maker price level.
- Per-leg fills assumed independent within a day.
- Maker fee = 0 ignores `rebateRate` (20-25% of pool taker fees redistributed to makers). Real maker income could be modestly HIGHER. Conservative direction.
- 14 days is a short window; the in-sample / OOS split is *one* random partition, not k-fold. Repeat with different splits to test stability.
- Today's bestAsk/bestBid used to compute maker target; historical spread may have differed.

---
*Snapshot: 2026-05-15T04:03:10.554424+00:00*
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Maker Simulation v4: Corrected fee + train/test split (2026-05-15T04:04:02.459101+00:00)

**Method**: v3 plumbing + two fixes:
(A) maker fee mode = `zero` (v3 used taker_rate, which was wrong; Polymarket docs: "makers never pay fees")
(B) 10-day in-sample / 4-day OOS split. Top 18 groups picked by IN-SAMPLE daily $; their OOS sum reported separately.

**Window**: 14 days (2026-05-01 -> 2026-05-15)
**Basket size cap**: $100
**Trades fetched**: 38341 raw -> 1355 qualifying
**Days with trade activity**: in-sample 11, OOS 4
**takerOnly distribution across our markets**: {True: 140}

## Headline (with maker fee = 0)

| Verdict | Daily $ | Annualized |
|---|---:|---:|
| Naive (all 69 groups), in-sample | $+0.45 | $+164 |
| Naive (all 69 groups), OOS | $+0.18 | $+65 |
| Whole window (no split) | $+0.38 | $+138 |
| **Top 18 by in-sample, in-sample** | $+0.56 | $+205 |
| **Top 18 by in-sample, OOS** ← honest verdict | $+0.60 | $+218 |

If top-N OOS << top-N in-sample, the top-N looks like overfitting.
OOS / in-sample ratio for top-18: 1.06

## Top 18 groups: in-sample picked, OOS measured

| Rank | Group | Q | Best markup | IS daily $ | OOS daily $ | OOS/IS |
|---:|---|---|---:|---:|---:|---:|
| 1 | `0xd4ec843b5228...` | Will the Democratic Party cont vs Will the | $0.010 | $+0.250 | $+0.487 | +1.95 |
| 2 | `0xb61918837517...` | Will the Democrats win the Nor vs Will the | $0.010 | $+0.073 | $+0.000 | +0.00 |
| 3 | `0x2aa7cf1991dd...` | Will the Democrats win the Kan vs Will the | $0.030 | $+0.064 | $+0.000 | +0.00 |
| 4 | `0x7bd878bdc3cd...` | Will the Democrats win the Nev vs Will the | $0.050 | $+0.035 | $+0.000 | +0.00 |
| 5 | `0x2ecd963d91df...` | Will the Democrats win the Iow vs Will the | $0.050 | $+0.030 | $+0.000 | +0.00 |
| 6 | `0x7146f4aff656...` | Will the Democrats win the Ill vs Will the | $0.050 | $+0.023 | $+0.000 | +0.00 |
| 7 | `0x82cc8472987c...` | Will the Democrats win the Kan vs Will the | $0.010 | $+0.014 | $+0.000 | +0.00 |
| 8 | `0x209eca0d8c37...` | Will the Democrats win the Wyo vs Will the | $0.050 | $+0.014 | $+0.000 | +0.00 |
| 9 | `0x2a010ed53626...` | Will the Democrats win the Flo vs Will the | $0.050 | $+0.013 | $+0.000 | +0.00 |
| 10 | `0xc28de9467003...` | Will the Democrats win the Tex vs Will the | $0.020 | $+0.010 | $+0.008 | +0.86 |
| 11 | `0x4e43ba407ed4...` | Will the Democrats win the Wis vs Will the | $0.050 | $+0.009 | $+0.000 | +0.00 |
| 12 | `0x8941a4153cb2...` | Will the Democrats win the Ver vs Will the | $0.050 | $+0.009 | $+0.000 | +0.00 |
| 13 | `0x1304dee4404b...` | Will the Democrats win the Ari vs Will the | $0.010 | $+0.008 | $+0.000 | +0.00 |
| 14 | `0x195e8f642b07...` | Will the Democrats win the Tex vs Will the | $0.050 | $+0.005 | $+0.103 | +21.53 |
| 15 | `0x284bd4583b40...` | Will the Democrats win the Ore vs Will the | $0.010 | $+0.004 | $+0.000 | +0.00 |
| 16 | `0xac17bb3e2188...` | Will the Democrats win the New vs Will the | $0.030 | $+0.003 | $+0.000 | +0.00 |
| 17 | `0xdc4bd1724b69...` | Will the Republicans win the 2 vs Will the | $0.005 | $+0.000 | $+0.000 | +0.00 |
| 18 | `0x07311e10dac6...` | Will the Democrats win the Ala vs Will the | $0.005 | $+0.000 | $+0.000 | +0.00 |

## Compared to prior versions

| Version | Method | Annualized | Issue |
|---|---|---:|---|
| v1 (mid-touch) | mid touch as fill proxy | $15,546 | mid touching != fill |
| v2 size-uncapped | sum of all SELL Yes | $918 | income computed at $100/fill regardless of trade size |
| v3 size-capped, taker fee | size cap added | -$263 naive / +$117 cherry-pick | maker fee wrongly = taker fee; no OOS check |
| **v4 this run** | size cap + maker_fee=zero + IS/OOS | $+65 OOS naive / $+218 OOS top-18 | fee per docs; cherry-pick now measured out-of-sample |

## Caveats (still standing)

- Queue priority: assumes we are first in line at our maker price level.
- Per-leg fills assumed independent within a day.
- Maker fee = 0 ignores `rebateRate` (20-25% of pool taker fees redistributed to makers). Real maker income could be modestly HIGHER. Conservative direction.
- 14 days is a short window; the in-sample / OOS split is *one* random partition, not k-fold. Repeat with different splits to test stability.
- Today's bestAsk/bestBid used to compute maker target; historical spread may have differed.

---
*Snapshot: 2026-05-15T04:04:02.459101+00:00*
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# Maker Simulation v4: Multi-Window Stability (2026-05-15T04:05:23.269350+00:00)

**Method**: ran v4 (maker_fee=zero, 10 IS / 4 OOS) on 4 non-overlapping 14-day windows. Total span: 2026-03-20 -> 2026-05-15 (56 days).

**Why this matters**: a single 14-day window's verdict can be window-luck. If the thesis is real, all 4 windows should give roughly consistent signs and magnitudes. If they bounce sign or order of magnitude, the single-window verdict was a coincidence.
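The four windows can be derived by stepping the end date back 14 days at a time. The sketch below is one way to generate them; the function name is hypothetical, and how the commit's script consumes its `--end-date` flag is not shown here, so treat this as an illustration of the windowing only.

```python
from datetime import date, timedelta


def non_overlapping_windows(last_end, n_windows, window_days):
    """Return (start, end) date pairs, oldest first, each window_days long.

    Consecutive windows share a boundary date but do not overlap in span:
    each window's start is the previous window's end.
    """
    windows = []
    end = last_end
    for _ in range(n_windows):
        start = end - timedelta(days=window_days)
        windows.append((start, end))
        end = start  # the next (older) window ends where this one starts
    return list(reversed(windows))


windows = non_overlapping_windows(date(2026, 5, 15), n_windows=4, window_days=14)
for start, end in windows:
    print(f"{start.isoformat()} -> {end.isoformat()}")
```

Each pair could then be passed to a per-window run, with the window's end date as the time-shift parameter.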

## Per-window numbers

| Window | IS days | OOS days | Naive IS / yr | Naive OOS / yr | Cherry IS / yr | Cherry OOS / yr |
|---|---:|---:|---:|---:|---:|---:|
| 2026-03-20 → 2026-04-03 | 10 | 4 | $+132 | $-1,117 | $+132 | $+147 |
| 2026-04-03 → 2026-04-17 | 10 | 4 | $+806 | $+239 | $+879 | $+482 |
| 2026-04-17 → 2026-05-01 | 10 | 4 | $+228 | $+83 | $+631 | $+155 |
| 2026-05-01 → 2026-05-15 | 10 | 4 | $+164 | $+65 | $+205 | $+218 |

## Cross-window stability

| Metric | mean | median | min | max | SD | SD/mean |
|---|---:|---:|---:|---:|---:|---:|
| Naive IS / yr | $+333 | $+196 | $+132 | $+806 | $318 | 0.96 |
| Naive OOS / yr | $-183 | $+74 | $-1,117 | $+239 | $628 | 3.44 |
| Cherry IS / yr | $+462 | $+418 | $+132 | $+879 | $355 | 0.77 |
| Cherry OOS / yr | $+251 | $+187 | $+147 | $+482 | $157 | 0.63 |

**Read this**: if SD/mean > 1.0 (the ratio uses the absolute value of the mean, as the Naive OOS row shows), your point estimate is mostly noise. If sign of OOS is consistent across windows but magnitude varies 2-3x, you have a real but noisy signal.
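The stability rows can be reproduced from the per-window table with the standard library. The sketch below assumes the SD is the sample standard deviation (n - 1), which matches the Naive rows. Its inputs are the rounded annualized figures from the table, so a result can differ from the report by about a dollar.

```python
from statistics import mean, median, stdev

# Annualized $ per window, oldest window first, copied from the per-window table.
per_window = {
    "Naive IS / yr": [132, 806, 228, 164],
    "Naive OOS / yr": [-1117, 239, 83, 65],
    "Cherry IS / yr": [132, 879, 631, 205],
    "Cherry OOS / yr": [147, 482, 155, 218],
}


def stability(values):
    """Summary statistics for one metric across windows."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1); an assumption about the report
    return {
        "mean": m,
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "sd": sd,
        "sd_over_abs_mean": sd / abs(m),  # dispersion relative to |mean|
    }


summary = {name: stability(vals) for name, vals in per_window.items()}
for name, s in summary.items():
    print(f"{name}: mean={s['mean']:+.1f} median={s['median']:+.1f} "
          f"sd={s['sd']:.0f} sd/|mean|={s['sd_over_abs_mean']:.2f}")
```

For Naive OOS the mean is negative while the median is positive: a single window (2026-03-20 to 2026-04-03) dominates the average, which is why the ratio lands well above 1.0.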

## Persistent winners (positive OOS in ≥3 of 4 windows)

Found 0 groups (of 64 present in all windows). Sum of their mean OOS = **$+0/yr**.

| Rank | Group | Q | +OOS windows | Mean OOS / yr | Median OOS / yr | Values per window |
|---:|---|---|:-:|---:|---:|---|
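The persistence filter amounts to counting, per group, the windows with strictly positive OOS. A minimal sketch follows, using the OOS daily $ values of the two groups that appear in every per-window top-18 table in this commit (Wisconsin `0x4e43ba407ed4...` and Kansas `0x2aa7cf1991dd...`). The function and variable names are hypothetical, and the real aggregation script runs over all 64 groups rather than these two.

```python
def positive_window_counts(oos_by_group):
    """Number of windows with strictly positive OOS, per group."""
    return {g: sum(1 for v in vals if v > 0) for g, vals in oos_by_group.items()}


def persistent_winners(oos_by_group, min_positive):
    """Groups whose OOS is positive in at least min_positive windows."""
    counts = positive_window_counts(oos_by_group)
    return sorted(g for g, n in counts.items() if n >= min_positive)


# OOS daily $ per window, oldest window first, read from the per-window reports.
oos_by_group = {
    "0x4e43ba407ed4...": [0.133, 0.026, 0.000, 0.000],  # Wisconsin
    "0x2aa7cf1991dd...": [0.017, 0.000, 0.000, 0.000],  # Kansas
}

counts = positive_window_counts(oos_by_group)
winners = persistent_winners(oos_by_group, min_positive=3)
print(counts, winners)
```

Both groups fall short of the 3-of-4 threshold (2 and 1 positive windows respectively), consistent with the empty table above.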

## Interpretation

- The single-window verdict from any one run alone is statistically weak.
- The honest verdict is the mean OOS across all windows.
- The set of **persistent winners** (+OOS in ≥3/4 windows) gives the most defensible cherry-pick.
- If `naive_oos` flips sign across windows, the thesis applies only to a subset of groups, not to a naive deploy.

---
*Snapshot: 2026-05-15T04:05:23.269350+00:00*
