Commit 79077aa

张靖恒 and claude committed
Run gamma distribution experiment + add openrouter calibration script
Experiment 1+3 (Gamma distribution + structural ground truth):
- Pulled n=2000 active markets from gamma-api.polymarket.com (4 pages × 500, ~3.5s total)
- vol24hr P10/P50/P90 = $0 / $40 / $18,333
- Liquidity P10/P50/P90 = $787 / $10,138 / $221,690
- Spread P10/P50/P90 = 0.001 / 0.01 / 0.10 (present in raw Gamma; PR #4 spec was wrong to exclude this)
- 14-90d-to-resolution band = 693 markets (35%); target range OK
- Derived 10,122 mutex pairs from 171 neg-risk groups → T4 $0 corpus validated

Implications:
- Q1 thresholds in PR #3 are way too high; data-driven values in report
- Q4 T4 corpus problem disappears (10k+ pairs from structure alone)
- PR #4 spec needs amendment for spread availability + dead-tier rephrasing (liquidity, not volume, as P10 boundary)

Raw NDJSON under data/experiments/ is gitignored; only script + report committed.

Experiment 2 (OpenRouter calibration script, ~$0.001 total):
- One-shot validation that Gemini Flash V2 strict prompt actually produces schema-conforming JSON with verbatim grounding
- Requires OPENROUTER_API_KEY at runtime; not run yet
- Will calibrate the $0.00009/call estimate in PR #6

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent c7e0ec3 commit 79077aa

3 files changed

Lines changed: 743 additions & 0 deletions

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
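# Gamma distribution + structural relations experiment report (2026-05-12T06:16:23.751849+00:00)

**Source**: `scripts/experiment_gamma_distribution.py`, a one-off experiment, not the final implementation of DS pkg #02.
**Purpose**: backfill §9 Q1 (long-tail tier thresholds), validate the feasibility of the T4 $0 corpus, and supply T2 fixture data.

---
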
## 1. Basic statistics

- Total markets fetched: 2000
- Active markets passing the filter: 2000
  (filter: active=true & enableOrderBook!=false & acceptingOrders!=false & closed!=true)

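
The filter in section 1 can be sketched as a single predicate. This is a minimal illustration, not the code from `scripts/experiment_gamma_distribution.py`: the function name `is_active` is hypothetical, and it assumes each market is a decoded JSON object whose flags are real booleans. Note that the `!=false` conditions let markets through when the field is missing entirely.

```python
from typing import Any, Mapping


def is_active(market: Mapping[str, Any]) -> bool:
    """Return True if a market passes the report's four-part filter.

    Hypothetical helper: assumes flags are JSON booleans, not strings.
    """
    return (
        market.get("active") is True                 # active=true
        and market.get("enableOrderBook") is not False  # missing field passes
        and market.get("acceptingOrders") is not False  # missing field passes
        and market.get("closed") is not True         # closed!=true
    )


if __name__ == "__main__":
    sample = [
        {"active": True, "enableOrderBook": True, "acceptingOrders": True, "closed": False},
        {"active": True, "closed": True},
        {"active": True},
    ]
    print(sum(is_active(m) for m in sample), "of", len(sample), "pass the filter")
```
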
## 2. Overall distribution percentiles (active markets)

- **volume24hr**: P5=$0 | P10=$0 | P25=$0 | P50=$40 | P75=$1,483 | P90=$18,333 | P95=$46,489 | P99=$471,419
- **volume1wk**: P5=$0 | P10=$0 | P25=$0 | P50=$379 | P75=$12,730 | P90=$185,724 | P95=$435,886 | P99=$1,485,768
- **liquidity**: P5=$346 | P10=$787 | P25=$2,634 | P50=$10,138 | P75=$29,733 | P90=$221,690 | P95=$940,321 | P99=$6,003,525
- **spread** (n=2000): P5=0.0010 | P10=0.0010 | P25=0.0010 | P50=0.0100 | P75=0.0300 | P90=0.1001 | P95=0.1900 | P99=0.7101

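
The report does not say which percentile method produced the figures in section 2. As a hedged sketch, the following computes percentiles by linear interpolation between the two closest ranks; the function name `percentile` and the interpolation rule are assumptions, so values may differ slightly from the report's if it used a different convention.

```python
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Percentile with linear interpolation between closest ranks.

    Assumed convention (not confirmed by the report): rank = (n - 1) * p / 100.
    """
    if not values:
        raise ValueError("percentile() requires at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be within [0, 100]")
    xs = sorted(values)
    rank = (len(xs) - 1) * p / 100
    lo = int(rank)                      # index at or below the target rank
    hi = min(lo + 1, len(xs) - 1)       # next index, clamped to the last element
    frac = rank - lo                    # fractional distance between lo and hi
    return xs[lo] + (xs[hi] - xs[lo]) * frac


if __name__ == "__main__":
    volumes = [0.0, 0.0, 0.0, 40.0, 1500.0, 18000.0]  # made-up illustration data
    for p in (10, 50, 90):
        print(f"P{p} = {percentile(volumes, p):,.2f}")
```
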
## 3. Time-to-resolution distribution

- `<7d`: 18 (0.9%)
- `7-14d`: 16 (0.8%)
- `14-30d`: 326 (16.3%)
- `30-90d`: 367 (18.4%)
- `90-180d`: 152 (7.6%)
- `>180d`: 907 (45.4%)
- `expired`: 54 (2.7%)
- `unknown`: 160 (8.0%)

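
The bucketing in section 3 can be sketched as follows. The bucket labels come from the report; the function name `resolution_bucket`, the half-open boundaries (for example, exactly 7 days falls into `7-14d`), and the treatment of unparsable or missing `endDate` values as `unknown` are assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


def resolution_bucket(end_date: Optional[str], now: datetime) -> str:
    """Map an ISO-8601 endDate string to one of the report's bucket labels."""
    if not end_date:
        return "unknown"
    try:
        # Older Python versions do not accept a trailing "Z", so normalize it.
        end = datetime.fromisoformat(end_date.replace("Z", "+00:00"))
    except ValueError:
        return "unknown"
    if end.tzinfo is None:
        end = end.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
    days = (end - now).total_seconds() / 86400
    if days < 0:
        return "expired"
    for upper, label in ((7, "<7d"), (14, "7-14d"), (30, "14-30d"),
                         (90, "30-90d"), (180, "90-180d")):
        if days < upper:
            return label
    return ">180d"


if __name__ == "__main__":
    snapshot = datetime(2026, 5, 12, tzinfo=timezone.utc)
    print(resolution_bucket("2026-06-01T00:00:00Z", snapshot))
```
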
## 4. Top 10 series (Gamma has no first-class category; `events[0].series` is used as a proxy)

- `untagged`: 1499 (75.0%)
- `yearly-ipos`: 22 (1.1%)
- `trump-countries-visited`: 18 (0.9%)
- `trump-monthly-meeting`: 16 (0.8%)
- `trump-trade-deal-countries`: 16 (0.8%)
- `top-ai-company-style-on`: 15 (0.8%)
- `best-ai-company`: 15 (0.8%)
- `second-best-ai-company`: 15 (0.8%)
- `largest-company`: 15 (0.8%)
- `bitcoin-hit-price-monthly`: 15 (0.8%)

## 5. Q1 data-driven tier threshold candidates

**Can be filled directly into the §9 Q1 Decision**

- `headline` tier (P90+): volume24hr ≥ $18,333, liquidity ≥ $221,690, spread ≤ 0.0010
- `mid` tier (P50-P90): volume24hr $40–$18,333, liquidity $10,138–$221,690, spread ≤ 0.0100
- `longtail` tier (P10-P50): volume24hr $0–$40, liquidity $787–$10,138, spread ≤ 0.1001
- `dead` tier (<P10): volume24hr ≤ $0, liquidity < $787

**Note**: because the volume24hr P10 is $0, the `longtail` and `dead` boundaries coincide on volume24hr (both start at 0). Practical recommendation: use **liquidity ≥ $787** to separate longtail from dead. Markets with vol24hr=0 but liquidity between $787 and $10k are genuine long tail (market makers are absent, but there is resting liquidity underneath); only vol24hr=0 combined with liquidity < $787 counts as dead.

**Comparison with the initial draft proposal**

| Tier | Initial draft | Data-driven (measured) | Gap |
|---|---|---|---|
| headline (vol24h) | ≥ $50,000 | ≥ $18,333 | draft too high |
| mid lower bound (vol24h) | ≥ $5,000 | ≥ $40 | draft too high |
| longtail lower bound (vol24h) | ≥ $100 | ≥ $0 | draft too high |

## 6. T4 $0 corpus 可行性验证
65+
66+
- **Neg-risk 组数(≥2 市场)**:171
67+
- **派生 mutex pairs**:10122
68+
- **完全相同 question+endDate 组数**:1
69+
- **派生 equivalent pairs**:1
70+
71+
**T4 $0 corpus 可行**:mutex pairs (10122) ≥ 50 个,足够采样作 T4 judge 校准 ground truth。
72+
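
Deriving mutex pairs from neg-risk groups amounts to enumerating every unordered pair within each group, so a group of n markets contributes n(n-1)/2 pairs. The sketch below illustrates this; the function name `mutex_pairs` is hypothetical, and the default field names `negRiskMarketID` and `id` are assumptions about the Gamma payload rather than something the report confirms, which is why they are parameters.

```python
from collections import defaultdict
from itertools import combinations
from typing import Any, Iterable, List, Mapping, Tuple


def mutex_pairs(
    markets: Iterable[Mapping[str, Any]],
    group_key: str = "negRiskMarketID",  # assumed field name for the group id
    id_key: str = "id",                  # assumed field name for the market id
) -> List[Tuple[str, str]]:
    """Return all unordered market-id pairs that share a neg-risk group."""
    groups = defaultdict(list)
    for market in markets:
        group_id = market.get(group_key)
        if group_id:  # skip markets that are not in any neg-risk group
            groups[group_id].append(str(market[id_key]))
    pairs: List[Tuple[str, str]] = []
    for ids in groups.values():
        if len(ids) >= 2:  # single-market groups yield no pairs
            pairs.extend(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    demo = [
        {"id": "a1", "negRiskMarketID": "A"},
        {"id": "a2", "negRiskMarketID": "A"},
        {"id": "a3", "negRiskMarketID": "A"},
        {"id": "b1", "negRiskMarketID": "B"},
        {"id": "x1"},
    ]
    print(len(mutex_pairs(demo)), "mutex pairs")
```
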
## 7. Data quality warnings

- The `unknown` time-to-resolution bucket holds 160 markets that are missing `endDate`
- Total (2000) vs. active (2000): gap = 0 markets filtered out

---

*Snapshot: 2026-05-12T06:16:23.751849+00:00, source: gamma-api.polymarket.com/markets*
